UCSF

Zbc Drugs

Biogenic drug-like compounds; see wiki.

Introduction

This is a hand-crafted subset, so the browse links at the bottom do not work yet. Two special files are associated with this subset.

justification.txt - Each line contains two ZINC IDs. The first is the natural product; the second is the molecule included in the subset. The second molecule therefore either has a Tanimoto similarity of 80% or more to the first, or is a fragment of the first with 10 or more atoms. If you find a counterexample, please let us know so that we can debug!

ZINC ID list - Lists, line by line, the substance IDs included in this set.
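
The pairing described for justification.txt can be read into a lookup table with a few lines of Python. This is a minimal sketch, not the tooling used to build the subset: it assumes the two IDs on each line are separated by whitespace, the function name `parse_justification` is hypothetical, and the IDs in the example are made up for illustration.

```python
from collections import defaultdict


def parse_justification(lines):
    """Map each natural-product ZINC ID to the subset members it justifies.

    Assumes each non-blank line holds exactly two whitespace-separated IDs:
    the natural product first, the included molecule second.
    """
    members_by_product = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"expected two ZINC IDs, got: {line!r}")
        natural_product, member = fields
        members_by_product[natural_product].append(member)
    return dict(members_by_product)


# Made-up IDs, for illustration only.
example_lines = [
    "ZINC00000001 ZINC00000002",
    "ZINC00000001 ZINC00000003",
    "",
]
pairs = parse_justification(example_lines)
```

With a real file, the same function could be called as `parse_justification(open("justification.txt"))`, since iterating over a file object yields its lines.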

We first created this subset on Jan 18, 2012. Since this is the first time we have done this, we may have made errors. If you find any problems with this subset, please let us know so that we can put them right. We intend to create a fresh version of this special subset at least twice a year.

General Information

Created By
jji at cgl.ucsf.edu
Criteria
p.mwt <= 500 and p.mwt >= 150 and p.xlogp <= 5 and p.rb <= 7 and p.psa < 150 and p.n_h_donors <= 5 and p.n_h_acceptors <= 10
Subset ID
103
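
The criteria expression above can be restated as a small Python predicate. This is only an illustrative sketch: it assumes the properties are supplied as a dictionary keyed by the names used in the expression (without the `p.` prefix), the function name `passes_criteria` is hypothetical, and the property values in the example are invented.

```python
def passes_criteria(p):
    """Return True if a property record satisfies the subset criteria.

    `p` is assumed to be a dict with the keys mwt, xlogp, rb, psa,
    n_h_donors and n_h_acceptors, mirroring the criteria expression.
    """
    return (
        150 <= p["mwt"] <= 500
        and p["xlogp"] <= 5
        and p["rb"] <= 7
        and p["psa"] < 150
        and p["n_h_donors"] <= 5
        and p["n_h_acceptors"] <= 10
    )


# Invented property values, for illustration only.
inside = {"mwt": 320.4, "xlogp": 2.1, "rb": 4, "psa": 75.0,
          "n_h_donors": 2, "n_h_acceptors": 5}
too_heavy = dict(inside, mwt=612.7)

inside_ok = passes_criteria(inside)
too_heavy_ok = passes_criteria(too_heavy)
```

Note that the polar surface area bound is strict (`< 150`) while the other bounds are inclusive, as in the original expression.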

ZINC Subset Overview

Last Updated
2013-02-24
Subset Size
101,746
Benign functionality only?
No filtering done

Quick Links

Browse
Sample molecules
Detailed view
Annotations view
Files
Properties
Purchasing
Unique Substances
Unix download
MOL2
SDF
Flexibase [Scripts to download database files to Linux/MacOS]
Windows download
MOL2
SDF
Flexibase [Scripts to download database files to Windows]

Chemical Diversity and Clustering

We assess the chemical diversity of a subset by clustering its molecules. First, we sort ligands by increasing molecular weight. Then, we use the SUBSET 1.0 algorithm (Voigt JH, Bienfait B, Wang S, Nicklaus MC. JCICS, 2001, 41, 702-12) to progressively select compounds that differ from all previously selected compounds by at least the Tanimoto cutoff, using ChemAxon default fingerprints. The resulting representatives have two interesting properties:

  • 1) Each representative differs from all the others by at least the Tanimoto cutoff, and
  • 2) All the molecules in the subset are within the Tanimoto cutoff of at least one representative.
Thus the representatives can be said to "cover" the chemical space of the subset at a given Tanimoto level. N/A indicates that clustering is pending.

Tanimoto Cutoff Level      60%    70%    80%     90%     100%
Number of Representatives  3,183  6,682  13,068  27,367  101,746
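
The selection procedure described above can be illustrated with a short greedy sketch in Python. It is not the SUBSET 1.0 program and does not use ChemAxon fingerprints: fingerprints are modeled as plain sets of "on" bit indices, "differs by at least the cutoff" is read as "Tanimoto similarity below the cutoff to every representative chosen so far", and the names `tanimoto` and `select_representatives` as well as the toy molecules are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of bit indices."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_representatives(molecules, cutoff):
    """Greedy selection of representatives.

    `molecules` is a list of (name, molecular_weight, fingerprint_set).
    Molecules are visited by increasing molecular weight; one is kept only
    if its similarity to every representative kept so far is below `cutoff`.
    """
    representatives = []
    for name, _, fp in sorted(molecules, key=lambda m: m[1]):
        if all(tanimoto(fp, rep_fp) < cutoff for _, rep_fp in representatives):
            representatives.append((name, fp))
    return [name for name, _ in representatives]


# Toy data, for illustration only.
toy = [
    ("b", 110.0, {1, 2, 3, 4}),
    ("a", 100.0, {1, 2, 3}),
    ("c", 120.0, {7, 8, 9}),
]
reps_70 = select_representatives(toy, 0.70)  # "b" is 0.75 similar to "a"
reps_80 = select_representatives(toy, 0.80)
```

In this toy case the 70% cutoff folds "b" into the cluster represented by "a", while the 80% cutoff keeps all three molecules, which mirrors how the number of representatives in the table grows as the cutoff rises.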

Physical Property Distributions

We compute the physical properties of each molecule in the subset, and graph them below.

Download Calculated Physical Properties

Tab-delimited information files

Ready-to-dock molecular files

Format     Reference (pH 7)  Mid (pH 6-8)   High (pH 8-9.5)  Low (pH 4.5-6)  Download Unix               Download Windows
SMILES     All               All            All              All
MOL2       All               All            All              All             Single, Usual, Metals, All  Single, Usual, Metals, All
SDF        All               All            All              All             Single, Usual, Metals, All  Single, Usual, Metals, All
Flexibase  Not Available     Not Available  Not Available    Not Available