
Tuesday, 21 October 2014

How many PAINS are there in commercial libraries?

Introduction

PAINS (pan-assay interference compounds) have been a hot topic recently. Some people estimate that these compounds make up 5-12% of all commercial libraries [1]. Here I present the results of assessing the percentage of PAINS in various popular commercial libraries, as well as in the ZINC All_now database.

The libraries were filtered with SMARTS patterns prepared by Rajarshi Guha [2] and shipped with the filter-it software [3]. For comparison, I have also included the ZINC database (a filtered and curated collection of ligands from commercial libraries), ChEMBL (v. 19; small molecules from the scientific literature) and SureChEMBL (structures from patents). There is also the StructuralAlerts filter, delivered by silicos-it and based on [4].


Results


How many PAINS are there?

To sum up, it's not so bad: the highest percentage of PAINS in any library is below 3%, and the average is 1.65% (1.82% for the five commercial vendor libraries).






| Database | FilterFamily A | FilterFamily B | FilterFamily C | Total PAINS (A+B+C) | StructuralAlerts |
|---|---|---|---|---|---|
| chembl_19 | 1.56% | 0.62% | 0.21% | 2.39% | 47.03% |
| Enamine Advanced | 0.30% | 0.10% | 0.08% | 0.48% | 21.17% |
| Enamine HTS | 0.65% | 0.10% | 0.13% | 0.89% | 26.75% |
| LifeChemicals stock | 1.70% | 0.15% | 0.11% | 1.95% | 25.11% |
| Maybridge Screening | 1.56% | 0.76% | 0.62% | 2.94% | 48.62% |
| SureChEMBL | 0.02% | 0.00% | 0.00% | 0.02% | 0.65% |
| Zelinsky HTS | 1.66% | 0.86% | 0.32% | 2.83% | 47.36% |
| ZINC All_now | 1.20% | 0.31% | 0.21% | 1.73% | 29.53% |
| Average - %PAINS | 1.08% | 0.36% | 0.21% | 1.65% | 30.78% |
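The summary figures can be re-derived from the Total PAINS column. Below is a minimal Python sketch that does so. The variable names are made up for illustration, and reading the 1.82% figure as the mean over the five vendor collections is an interpretation rather than something stated in the post (it does reproduce the number).

```python
from statistics import mean

# Total PAINS (A+B+C) percentages, copied from the table.
total_pains = {
    "chembl_19": 2.39,
    "Enamine Advanced": 0.48,
    "Enamine HTS": 0.89,
    "LifeChemicals stock": 1.95,
    "Maybridge Screening": 2.94,
    "SureChEMBL": 0.02,
    "Zelinsky HTS": 2.83,
    "ZINC All_now": 1.73,
}

# Assumption: the five vendor screening collections, i.e. everything except
# the literature (chembl_19), patent (SureChEMBL) and aggregate (ZINC) sets.
vendor_libraries = [
    "Enamine Advanced",
    "Enamine HTS",
    "LifeChemicals stock",
    "Maybridge Screening",
    "Zelinsky HTS",
]

overall_average = mean(total_pains.values())
vendor_average = mean(total_pains[name] for name in vendor_libraries)
worst_library = max(total_pains, key=total_pains.get)

print(f"average over all databases:    {overall_average:.2f}%")
print(f"average over vendor libraries: {vendor_average:.2f}%")
print(f"highest PAINS share:           {worst_library} ({total_pains[worst_library]:.2f}%)")
```

Note that these are unweighted means of the per-library percentages, so a small library counts as much as a large one.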



What are those PAINS?


Here are the top 20 alerts (across all screened libraries), each with its percentage of all alerts:


Here are the SMARTS of the top PAINS pollutants (do you recognize your hits here? ;)):





| Rule | Count | Percent |
|---|---|---|
| azo_A(324) | 63934 | 15.65 |
| ene_rhod_A(235) | 61727 | 15.11 |
| anil_di_alk_D(198) | 44103 | 10.79 |
| anil_di_alk_C(246) | 43936 | 10.75 |
| imine_one_A(321) | 28330 | 6.93 |
| ene_five_het_G(10) | 25333 | 6.20 |
| anil_di_alk_B(251) | 21285 | 5.21 |
| ene_five_het_B(90) | 14669 | 3.59 |
| imine_one_isatin(189) | 13342 | 3.27 |
| ene_five_hetA1(201A) | 13136 | 3.21 |
| thio_ketone(43) | 8192 | 2.00 |
| anil_alk_ene(51) | 7063 | 1.73 |
| ene_one_hal(17) | 4949 | 1.21 |
| thiophene_amino_Aa(45) | 4933 | 1.21 |
| ene_five_het_C(85) | 4762 | 1.17 |
| ene_one_ene_A(57) | 4447 | 1.09 |
| imine_one_fives(89) | 4323 | 1.06 |
| amino_acridine_A(46) | 4096 | 1.00 |
| ene_five_het_D(46) | 3870 | 0.95 |
| keto_keto_beta_A(68) | 3790 | 0.93 |
| rhod_sat_A(33) | 2207 | 0.54 |
| ene_cyano_A(19) | 2104 | 0.51 |
| ene_five_one_A(55) | 1699 | 0.42 |
| het_thio_66_one(8) | 1633 | 0.40 |
| imidazole_A(19) | 1395 | 0.34 |
| diazox_sulfon_A(36) | 1391 | 0.34 |
| quinone_B(5) | 1266 | 0.31 |
| keto_phenone_A(11) | 1262 | 0.31 |
| acyl_het_A(9) | 1245 | 0.30 |
| thiaz_ene_D(8) | 1219 | 0.30 |
| keto_keto_gamma(5) | 1089 | 0.27 |
| anil_di_alk_F(14) | 969 | 0.24 |
| styrene_A(13) | 967 | 0.24 |
| imine_imine_A(9) | 893 | 0.22 |
| cyano_cyano_A(23) | 766 | 0.19 |
| keto_keto_beta_B(12) | 644 | 0.16 |
| het_6666_A(2) | 572 | 0.14 |
| steroid_A(2) | 433 | 0.11 |
| imine_one_sixes(27) | 344 | 0.08 |
| ene_five_het_E(44) | 263 | 0.06 |
| keto_phenone_B(1) | 241 | 0.06 |
| het_65_C(6) | 216 | 0.05 |
| styrene_B(8) | 186 | 0.05 |
| het_5_A(7) | 133 | 0.03 |
| imine_one_fives_B(9) | 128 | 0.03 |
| het_thio_5_imine_A(1) | 114 | 0.03 |
| ene_misc_A(5) | 108 | 0.03 |
| het_pyridiniums_B(2) | 91 | 0.02 |
| cyano_cyano_B(3) | 86 | 0.02 |
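To see how concentrated the alerts are, the Percent column can be accumulated from the top of the table. This is a small Python sketch over the ten most frequent rules only. The helper name `cumulative_share` is made up for illustration and is not part of filter-it.

```python
from itertools import accumulate

# (rule, percent of all alerts) for the ten most frequent rules,
# copied from the table.
top_rules = [
    ("azo_A(324)", 15.65),
    ("ene_rhod_A(235)", 15.11),
    ("anil_di_alk_D(198)", 10.79),
    ("anil_di_alk_C(246)", 10.75),
    ("imine_one_A(321)", 6.93),
    ("ene_five_het_G(10)", 6.20),
    ("anil_di_alk_B(251)", 5.21),
    ("ene_five_het_B(90)", 3.59),
    ("imine_one_isatin(189)", 13342 and 3.27),
    ("ene_five_hetA1(201A)", 3.21),
]


def cumulative_share(rows):
    """Return (rule, running total of percent) pairs, rounded to 2 decimals."""
    running = accumulate(percent for _, percent in rows)
    return [(rule, round(total, 2)) for (rule, _), total in zip(rows, running)]


shares = cumulative_share(top_rules)
for rule, share in shares:
    print(f"{rule:<24}{share:6.2f}")
```

Read this way, the four most frequent rules account for just over half of all alerts (52.30%), and the top ten for about 81%.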



References


[1] http://cen.acs.org/articles/92/i35/Getting-Rid-Painful-Compounds.html, http://pipeline.corante.com/archives/2014/09/26/pains_go_mainstream.php
[2] http://blog.rguha.net/?p=850
[3] Unfortunately, no longer on the web
[4] Brenk et al. (2008) ChemMedChem 3, 435-444