BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of proteins considered to be drug targets with small, drug-like molecules. As of March 2014, BindingDB contains 1,009,290 binding data for 6,589 protein targets and 427,325 small molecules. The paper is: BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Liu T, Lin Y, Wen X, Jorissen RN, Gilson MK. Nucleic Acids Res. 2007 Jan;35(Database issue):D198-201. Epub 2006 Dec 1. We are grateful to the authors for allowing us to incorporate the molecular structures of this database in ZINC.
We assess the chemical diversity of a subset by clustering its molecules. First, we sort the ligands by increasing molecular weight. Then, we use the SUBSET 1.0 algorithm (Voigt JH, Bienfait B, Wang S, Nicklaus MC. JCICS, 2001, 41, 702-12) to progressively select compounds that differ from all previously selected compounds by at least the Tanimoto cutoff, using ChemAxon default fingerprints. The resulting representatives have two interesting properties:
Tanimoto Cutoff Level | 60% | 70% | 80% | 90% | 100% |
---|---|---|---|---|---|
Number of Representatives | 14,109 | 42,445 | 94,066 | 187,526 | 566,998 |
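The selection procedure described above can be illustrated with a minimal Python sketch. It is not the SUBSET 1.0 implementation and does not use ChemAxon fingerprints: the `Ligand` type, the `tanimoto` and `select_representatives` functions, and the toy fingerprints (sets of bit indices) are all hypothetical names and data introduced for illustration. The sketch also assumes that "differ by at least the cutoff" means a ligand is kept only when its Tanimoto similarity to every earlier representative is strictly below the cutoff, which is consistent with the number of representatives growing as the cutoff rises.

```python
from typing import FrozenSet, List, NamedTuple


class Ligand(NamedTuple):
    """Hypothetical record: a name, a molecular weight, and a toy fingerprint."""
    name: str
    mol_weight: float
    fingerprint: FrozenSet[int]  # indices of set bits (stand-in for a real fingerprint)


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity: shared bits divided by bits set in either fingerprint."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def select_representatives(ligands: List[Ligand], cutoff: float) -> List[Ligand]:
    """Greedy selection: visit ligands by increasing molecular weight and keep
    each one whose similarity to every kept ligand is below the cutoff.
    (Assumed reading of the procedure, not the published algorithm itself.)"""
    representatives: List[Ligand] = []
    for ligand in sorted(ligands, key=lambda lig: lig.mol_weight):
        if all(tanimoto(ligand.fingerprint, rep.fingerprint) < cutoff
               for rep in representatives):
            representatives.append(ligand)
    return representatives


# Made-up example data, not taken from BindingDB or ZINC.
demo_ligands = [
    Ligand("D", 200.0, frozenset({1, 2, 3, 4})),
    Ligand("A", 100.0, frozenset({1, 2, 3, 4})),
    Ligand("C", 150.0, frozenset({7, 8, 9})),
    Ligand("B", 120.0, frozenset({1, 2, 3, 5})),
]

if __name__ == "__main__":
    for cutoff in (0.5, 0.7, 1.0):
        names = [lig.name for lig in select_representatives(demo_ligands, cutoff)]
        print(f"cutoff {cutoff:.0%}: {names}")
```

Under this reading, a cutoff of 100% removes only ligands whose fingerprints are identical to an earlier representative, while lower cutoffs discard progressively more near-duplicates. Each kept ligand is compared against all earlier representatives, so the cost grows roughly with the product of the subset size and the number of representatives.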
We compute the physical properties of each molecule in the subset, and graph them below.
Download Calculated Physical Properties