ChEMBL is the medicinal chemistry database available from https://www.ebi.ac.uk/chembl. The paper is: The ChEMBL bioactivity database: an update. Bento AP, Gaulton A, Hersey A, Bellis LJ, Chambers J, Davies M, Krüger FA, Light Y, Mak L, McGlinchey S, Nowotka M, Papadatos G, Santos R, Overington JP. Nucleic Acids Res. 2014 Jan;42(Database issue):D1083-90. doi: 10.1093/nar/gkt1031. Epub 2013 Nov 7. We are grateful to the authors for creating and maintaining this resource and for allowing us to incorporate its structures into ZINC. ChEMBL is made available under the CC-BY-SA license and, correspondingly, all versions of ChEMBL derived via ZINC are freely available.
We assess the chemical diversity of a subset by clustering its molecules. First, we sort the ligands by increasing molecular weight. Then, we use the SUBSET 1.0 algorithm (Voigt JH, Bienfait B, Wang S, Nicklaus MC. J. Chem. Inf. Comput. Sci. 2001, 41, 702-12) to progressively select compounds that differ from those previously selected by at least the Tanimoto cutoff, using ChemAxon default fingerprints. The resulting representatives have two interesting properties:
Tanimoto Cutoff Level | 60% | 70% | 80% | 90% | 100% |
---|---|---|---|---|---|
Number of Representatives | 20,903 | 0 | 80,659 | 0 | 1,609,835 |
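The selection procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SUBSET 1.0 implementation: fingerprints are modeled as plain sets of "on" bit indices rather than ChemAxon fingerprints, the cutoff is read as a similarity ceiling (a compound is kept only if its Tanimoto similarity to every earlier representative is below the cutoff), and the names `Molecule`, `tanimoto`, and `select_representatives` are hypothetical.

```python
from typing import FrozenSet, Iterable, List, NamedTuple


class Molecule(NamedTuple):
    name: str
    mol_weight: float
    # Indices of the "on" bits; a stand-in for a real chemical fingerprint.
    fingerprint: FrozenSet[int]


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity: shared bits divided by bits set in either."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def select_representatives(molecules: Iterable[Molecule],
                           cutoff: float) -> List[Molecule]:
    """Greedy selection in order of increasing molecular weight.

    Assumption: a molecule becomes a representative only if its similarity
    to every previously selected representative is below `cutoff`.
    """
    selected: List[Molecule] = []
    for mol in sorted(molecules, key=lambda m: m.mol_weight):
        if all(tanimoto(mol.fingerprint, rep.fingerprint) < cutoff
               for rep in selected):
            selected.append(mol)
    return selected


# Toy data, invented purely for illustration.
toy = [
    Molecule("C", 120.0, frozenset({7, 8, 9})),
    Molecule("A", 100.0, frozenset({1, 2, 3})),
    Molecule("B", 110.0, frozenset({1, 2, 3, 4})),  # similarity 0.75 to A
]
names_at_70 = [m.name for m in select_representatives(toy, 0.70)]
names_at_80 = [m.name for m in select_representatives(toy, 0.80)]
```

With this toy data, a 70% cutoff drops B because it is too similar to the lighter molecule A, while an 80% cutoff keeps all three. A higher cutoff therefore yields more representatives, which matches the trend in the table above. The loop compares each candidate against every representative so far, so a collection the size of ChEMBL would need a faster similarity search than this sketch provides.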
We compute the physical properties of each molecule in the subset, and graph them below.
Download Calculated Physical Properties
Format | Reference (pH 7) | Mid (pH 6-8) | High (pH 8-9.5) | Low (pH 4.5-6) | Download Unix | Download Windows |
---|---|---|---|---|---|---|
SMILES | All | All | All | All | | |
MOL2 | 0 1 2 3 4 5 6 7 8 9 10 11 12 13 | 0 1 2 | 0 1 2 | 0 1 | Single Usual Metals All | Single Usual Metals All |
SDF | 0 1 2 3 4 5 6 7 8 9 10 11 12 13 | 0 1 2 | 0 1 2 | 0 1 | Single Usual Metals All | Single Usual Metals All |
Flexibase | Not Available | Not Available | Not Available | Not Available | | |