UCSF

Zbc

ZINC Biogenic compounds - commercially available primary and secondary metabolites (natural products), based on ZINC catalogs ending in np

Introduction

Zbc (ZINC Biogenic Compounds) is the database of all biogenic molecules you can buy as pure compounds. This includes secondary metabolites (commonly called natural products) and primary metabolites (commonly called simply metabolites).

 

Our information about which compounds are natural is drawn from both vendor catalogs and public-domain sources, including but not limited to Wikipedia and Google. One source of error is that many source databases do not fully specify chirality. Since ZINC is a 3D database, we are obliged to guess the correct stereochemistry, and we take up to four guesses per ambiguous molecule. Thus, for molecules with more than two stereocenters, caveat emptor.
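To make the stereochemistry caveat concrete, here is a minimal sketch of the arithmetic. It assumes that every unspecified stereocenter is independent (so n centers give at most 2^n stereoisomers, ignoring meso forms and symmetry) and that each of the up to four guesses is a distinct stereoisomer. The function name `stereo_coverage` is hypothetical and is not part of ZINC.

```python
def stereo_coverage(n_unspecified_centers, max_guesses=4):
    """Return (enumerated, possible, fraction) for an ambiguous molecule.

    Assumption: each unspecified center doubles the number of possible
    stereoisomers, which is an upper bound (meso forms reduce it).
    """
    possible = 2 ** n_unspecified_centers
    enumerated = min(possible, max_guesses)
    return enumerated, possible, enumerated / possible


for n in range(1, 6):
    enumerated, possible, fraction = stereo_coverage(n)
    print(f"{n} center(s): {enumerated} of {possible} stereoisomers ({fraction:.0%})")
```

Under these assumptions, four guesses cover every stereoisomer for up to two unspecified centers, but only half of them for three centers and a smaller share beyond that.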

 

If you find an error, please let us know and we will endeavor to fix it.

General Information

Created By
jji@cgl.ucsf.edu
Criteria
c.cat_id in (176,509,525,23,453, 493,501,503,504,505,517,520, 519,521,523,224,90,526, 507,460,685,453,608,609, 231,683,694,692,685,686,692,693,694,695)
Subset ID
98
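The selection criteria above list some catalog IDs more than once. Repeats do not change the result of an `in (...)` test, but a quick check shows how many distinct catalogs are involved. This is a small sketch that parses the criteria string exactly as printed; the helper name `parse_cat_ids` is hypothetical.

```python
from collections import Counter

# Criteria string copied verbatim from the General Information block.
CRITERIA = (
    "c.cat_id in (176,509,525,23,453, 493,501,503,504,505,517,520, "
    "519,521,523,224,90,526, 507,460,685,453,608,609, "
    "231,683,694,692,685,686,692,693,694,695)"
)


def parse_cat_ids(criteria):
    """Extract the integer catalog IDs between the parentheses."""
    inside = criteria[criteria.index("(") + 1 : criteria.rindex(")")]
    return [int(token) for token in inside.split(",")]


ids = parse_cat_ids(CRITERIA)
counts = Counter(ids)
unique_ids = sorted(counts)
repeated = sorted(cat_id for cat_id, n in counts.items() if n > 1)

print(f"{len(ids)} entries, {len(unique_ids)} distinct catalogs")
print("listed more than once:", repeated)
```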

ZINC Subset Overview

Last Updated
2013-02-22
Subset Size
180,313
Benign functionality only?
No filtering done

Chemical Diversity and Clustering

We assess the chemical diversity of a subset by clustering its molecules. First, we sort ligands by increasing molecular weight. Then we use the SUBSET 1.0 algorithm (Voigt JH, Bienfait B, Wang S, Nicklaus MC. JCICS, 2001, 41, 702-12) to progressively select compounds that differ from those previously selected by at least the Tanimoto cutoff, using ChemAxon default fingerprints. The resulting representatives have two interesting properties:

  • 1) Each representative differs from all the others by at least the Tanimoto cutoff, and
  • 2) All the molecules in the subset are within the Tanimoto cutoff of at least one representative.
Thus the representatives can be said to "cover" the chemical space of the subset at a given Tanimoto level. N/A indicates that clustering is pending.
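The selection procedure described above can be sketched as a greedy pass over the weight-sorted molecules. This is an illustration only, not the SUBSET 1.0 code: fingerprints are modelled as plain Python sets of "on" bits rather than ChemAxon fingerprints, the molecules are toy data, and a molecule is assumed to become a representative when its Tanimoto similarity to every earlier representative is strictly below the cutoff.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_representatives(molecules, cutoff):
    """Greedy selection over (name, mol_weight, fingerprint) tuples.

    Molecules are visited by increasing molecular weight; one is kept
    only if it is less similar than `cutoff` to every kept molecule.
    """
    representatives = []
    for name, weight, fp in sorted(molecules, key=lambda m: m[1]):
        if all(tanimoto(fp, rep_fp) < cutoff for _, _, rep_fp in representatives):
            representatives.append((name, weight, fp))
    return representatives


# Toy molecules (hypothetical names, weights, and fingerprints).
toy = [
    ("D", 200.0, {1, 2, 3, 4, 5}),
    ("A", 100.0, {1, 2, 3}),
    ("C", 150.0, {7, 8, 9}),
    ("B", 120.0, {1, 2, 3, 4}),
]

for cutoff in (0.6, 0.7, 0.9):
    names = [name for name, _, _ in select_representatives(toy, cutoff)]
    print(f"cutoff {cutoff:.0%}: {names}")
```

In the toy data, B is skipped at the 70% cutoff because its similarity to A is 0.75, so A "covers" B; raising the cutoff to 90% makes every molecule its own representative, which mirrors how the number of representatives grows with the cutoff.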

Tanimoto Cutoff Level      60%    70%    80%     90%     100%
Number of Representatives  4,242  9,472  19,807  43,301  180,313
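One way to read these counts is as a compression ratio: what fraction of the subset the representatives make up at each cutoff, and how many molecules each representative stands for on average. The short sketch below computes both from the numbers in the table; the function name `coverage_summary` is hypothetical.

```python
SUBSET_SIZE = 180_313

# Tanimoto cutoff (%) -> number of representatives, from the table above.
REPRESENTATIVES = {60: 4_242, 70: 9_472, 80: 19_807, 90: 43_301, 100: 180_313}


def coverage_summary(representatives, subset_size):
    """Map cutoff -> (fraction of subset, average molecules per representative)."""
    return {
        cutoff: (count / subset_size, subset_size / count)
        for cutoff, count in representatives.items()
    }


summary = coverage_summary(REPRESENTATIVES, SUBSET_SIZE)
for cutoff, (fraction, per_rep) in sorted(summary.items()):
    print(f"{cutoff:>3}%: {fraction:6.1%} of subset, about {per_rep:.1f} molecules per representative")
```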

Physical Property Distributions

We compute the physical properties of each molecule in the subset and graph them below. The calculated physical properties are also available for download.
 

Tab-delimited information files

Ready-to-dock molecular files

Format     Reference (pH 7)  Mid (pH 6-8)  High (pH 8-9.5)  Low (pH 4.5-6)  Download Unix               Download Windows
SMILES     All               All           All              All
MOL2       0, 1              All           All              All             Single, Usual, Metals, All  Single, Usual, Metals, All
SDF        0, 1              All           All              All             Single, Usual, Metals, All  Single, Usual, Metals, All
Flexibase  0-156             0-17          0-12             0-11            Single, Usual, Metals, All  Single, Usual, Metals, All