Introduction
PAINS (pan-assay interference compounds) have been a hot topic recently. Some people estimate that these compounds make up 5-12% of all commercial libraries [1]. Here I present the results of assessing the percentage of PAINS in various popular commercial libraries. The libraries were filtered with the SMARTS patterns prepared by Rajarshi Guha [2] and provided with the filter-it software [3]. For comparison, I've also included the ZINC all-now database (a filtered and curated collection of ligands from commercial libraries), ChEMBL (v. 19; small molecules from the scientific literature) and SureChEMBL (structures from patents). There is also a StructuralAlerts filter, delivered by silicos-it and based on [4].
Results
How many PAINS are there?
To sum up, it's not so bad: the maximum percentage of PAINS is below 3%, and the average is 1.65% (1.82% for the typical libraries, i.e. the five commercial vendor collections).
database | FilterFamily A | FilterFamily B | FilterFamily C | Total PAINS (A+B+C) | StructuralAlerts |
---|---|---|---|---|---|
chembl_19 | 1.56% | 0.62% | 0.21% | 2.39% | 47.03% |
Enamine Advanced | 0.30% | 0.10% | 0.08% | 0.48% | 21.17% |
Enamine HTS | 0.65% | 0.10% | 0.13% | 0.89% | 26.75% |
LifeChemicals stock | 1.70% | 0.15% | 0.11% | 1.95% | 25.11% |
Maybridge Screening | 1.56% | 0.76% | 0.62% | 2.94% | 48.62% |
SureChEMBL | 0.02% | 0.00% | 0.00% | 0.02% | 0.65% |
Zelinsky HTS | 1.66% | 0.86% | 0.32% | 2.83% | 47.36% |
ZINC All_now | 1.20% | 0.31% | 0.21% | 1.73% | 29.53% |
Average - %PAINS | 1.08% | 0.36% | 0.21% | 1.65% | 30.78% |
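The averages quoted above can be re-derived from the "Total PAINS" column. The following is only a minimal sketch: the numbers are copied from the table, and reading "typical libraries" as the five vendor collections (Enamine, LifeChemicals, Maybridge, Zelinsky) is my assumption, although it does reproduce the 1.82% figure. The names `total_pains`, `VENDOR_LIBRARIES` and `average_percent` are made up for this illustration.

```python
from statistics import mean

# "Total PAINS (A+B+C)" percentages, copied from the table above.
total_pains = {
    "chembl_19": 2.39,
    "Enamine Advanced": 0.48,
    "Enamine HTS": 0.89,
    "LifeChemicals stock": 1.95,
    "Maybridge Screening": 2.94,
    "SureChEMBL": 0.02,
    "Zelinsky HTS": 2.83,
    "ZINC All_now": 1.73,
}

# Assumption: the "typical" libraries are the five vendor collections,
# i.e. everything except ChEMBL, SureChEMBL and ZINC.
VENDOR_LIBRARIES = (
    "Enamine Advanced",
    "Enamine HTS",
    "LifeChemicals stock",
    "Maybridge Screening",
    "Zelinsky HTS",
)


def average_percent(names):
    """Unweighted mean of the Total PAINS percentage, rounded to 2 decimals."""
    return round(mean(total_pains[name] for name in names), 2)


overall = average_percent(total_pains)      # all eight databases
vendor = average_percent(VENDOR_LIBRARIES)  # vendor collections only
highest = max(total_pains, key=total_pains.get)

print(f"overall average: {overall}%")
print(f"vendor average:  {vendor}%")
print(f"highest:         {highest} ({total_pains[highest]}%)")
```

Note that these are unweighted means over databases, so a small library counts as much as a large one; weighting by library size would need the compound counts, which are not listed here.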
What are those PAINS?
Here are the top 20 alerts across all screened libraries, each with its percentage of all alerts:
And here are the top PAINS pollutants by SMARTS rule (do you recognize your hits here? ;):
rule | Count | Percent of all alerts (%) |
---|---|---|
azo_A(324) | 63934 | 15.65 |
ene_rhod_A(235) | 61727 | 15.11 |
anil_di_alk_D(198) | 44103 | 10.79 |
anil_di_alk_C(246) | 43936 | 10.75 |
imine_one_A(321) | 28330 | 6.93 |
ene_five_het_G(10) | 25333 | 6.20 |
anil_di_alk_B(251) | 21285 | 5.21 |
ene_five_het_B(90) | 14669 | 3.59 |
imine_one_isatin(189) | 13342 | 3.27 |
ene_five_hetA1(201A) | 13136 | 3.21 |
thio_ketone(43) | 8192 | 2.00 |
anil_alk_ene(51) | 7063 | 1.73 |
ene_one_hal(17) | 4949 | 1.21 |
thiophene_amino_Aa(45) | 4933 | 1.21 |
ene_five_het_C(85) | 4762 | 1.17 |
ene_one_ene_A(57) | 4447 | 1.09 |
imine_one_fives(89) | 4323 | 1.06 |
amino_acridine_A(46) | 4096 | 1.00 |
ene_five_het_D(46) | 3870 | 0.95 |
keto_keto_beta_A(68) | 3790 | 0.93 |
rhod_sat_A(33) | 2207 | 0.54 |
ene_cyano_A(19) | 2104 | 0.51 |
ene_five_one_A(55) | 1699 | 0.42 |
het_thio_66_one(8) | 1633 | 0.40 |
imidazole_A(19) | 1395 | 0.34 |
diazox_sulfon_A(36) | 1391 | 0.34 |
quinone_B(5) | 1266 | 0.31 |
keto_phenone_A(11) | 1262 | 0.31 |
acyl_het_A(9) | 1245 | 0.30 |
thiaz_ene_D(8) | 1219 | 0.30 |
keto_keto_gamma(5) | 1089 | 0.27 |
anil_di_alk_F(14) | 969 | 0.24 |
styrene_A(13) | 967 | 0.24 |
imine_imine_A(9) | 893 | 0.22 |
cyano_cyano_A(23) | 766 | 0.19 |
keto_keto_beta_B(12) | 644 | 0.16 |
het_6666_A(2) | 572 | 0.14 |
steroid_A(2) | 433 | 0.11 |
imine_one_sixes(27) | 344 | 0.08 |
ene_five_het_E(44) | 263 | 0.06 |
keto_phenone_B(1) | 241 | 0.06 |
het_65_C(6) | 216 | 0.05 |
styrene_B(8) | 186 | 0.05 |
het_5_A(7) | 133 | 0.03 |
imine_one_fives_B(9) | 128 | 0.03 |
het_thio_5_imine_A(1) | 114 | 0.03 |
ene_misc_A(5) | 108 | 0.03 |
het_pyridiniums_B(2) | 91 | 0.02 |
cyano_cyano_B(3) | 86 | 0.02 |
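One way to read this table is to ask how few rules account for most of the alerts. The sketch below is a rough illustration only: it uses the five most frequent rules with the percentages copied from the table, and the helper `rules_to_reach` is a hypothetical name introduced here.

```python
from itertools import accumulate

# (rule, percent of all alerts) for the five most frequent rules in the table.
top_rules = [
    ("azo_A(324)", 15.65),
    ("ene_rhod_A(235)", 15.11),
    ("anil_di_alk_D(198)", 10.79),
    ("anil_di_alk_C(246)", 10.75),
    ("imine_one_A(321)", 6.93),
]


def rules_to_reach(threshold_percent, rules):
    """Return (number of rules, cumulative percent) once the threshold is met.

    `rules` must already be sorted from most to least frequent.
    Returns None if the supplied rules never reach the threshold.
    """
    running = accumulate(percent for _, percent in rules)
    for n, cumulative in enumerate(running, start=1):
        if cumulative >= threshold_percent:
            return n, round(cumulative, 2)
    return None


rules_needed, cumulative_share = rules_to_reach(50.0, top_rules)
print(f"{rules_needed} rules cover {cumulative_share}% of all alerts")
```

With these numbers, the first four rules (the azo, ene-rhodanine and two dialkylaniline patterns) already cover slightly more than half of all alerts.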
References
[1] http://cen.acs.org/articles/92/i35/Getting-Rid-Painful-Compounds.html, http://pipeline.corante.com/archives/2014/09/26/pains_go_mainstream.php
[2] http://blog.rguha.net/?p=850
[3] Unfortunately, no longer on the web
[4] Brenk et al. (2008) ChemMedChem 3, 435-444