“Space, the final frontier...”
How to explore the scoring function space for drug design.
Prof. Dr. Walter F. de Azevedo Jr.
azevedolab.net
PhD in Sciences (Applied Physics), University of São Paulo (USP)
Visiting Researcher at the University of California, Berkeley, USA, 1993-1996
Participant in the space crystallization research with NASA (STS-95), 1998
Habilitation (Livre-Docente) in Physics, São Paulo State University (UNESP)
Regional Editor of the journal Current Drug Targets
Section Editor (Bioinformatics in Drug Design and Discovery) of the journal Current Medicinal Chemistry
Editorial Board Member of the journal Current Bioinformatics
CNPq Level 1B Researcher
~10^60 organic molecules
Bohacek RS, McMartin C, Guida WC. The art and practice of structure-based drug design: a molecular modeling perspective. Med Res Rev. 1996; 16(1):3–50.
Advantages...
System abstraction...
Defining subspaces of interest
Drug subspace
Natural product subspace
CDK inhibitor subspace
What is the protein space?
Hou J, Jun SR, Zhang C, Kim SH. Global mapping of the protein structure space and application in structure-based inference of protein function. Proc Natl Acad Sci U S A. 2005; 102(10):3651-6.
Representation of protein space. Source: http://www.pnas.org/content/102/10/3651/tab-figures-data
Protein space
Smith JM. Natural selection and the concept of a protein space. Nature. 1970; 225(5232): 563–564.
What is the scoring function space?
How to search the scoring function space?
Heck GS, Pintro VO, Pereira RR, de Ávila MB, Levin NMB, de Azevedo WF. Supervised Machine Learning Methods Applied to Predict Ligand-Binding Affinity. Curr Med Chem. 2017; 24(23): 2459–2470.
Protein space
Chemical space
K_i, IC50, K_d, or ΔG
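Protein space
Chemical space
Scoring function space
$\log(K_i) = \sum_{i=0}^{N} \alpha_i x_i + \sum_{j=0}^{M} \beta_j y_j^2$
$\log(K_i) = \sum_{i=0}^{N} \alpha_i x_i + \sum_{j=0}^{M} \sum_{k=0}^{M} \gamma_{j,k} z_{j,k} + \sum_{j=0}^{M} \beta_j y_j^2$
$\log(K_i) = \sum_{i=0}^{N} \alpha_i x_i + \sum_{j=0}^{M} \sum_{k=0}^{M} \gamma_{j,k} z_{j,k} + A \exp(w) \sum_{j=0}^{M} \beta_j y_j^2$
$\log(K_i) = \sum_{i=0}^{N} \alpha_i x_i + A \sum_{j=0}^{N} \beta_j y_j + B \sum_{k=0}^{N} \gamma_k z_k + C \sum_{k=0}^{N} \omega_k z_k^2 + D \sum_{k=0}^{N} \omega_k z_k^3 + E \sum_{k=0}^{N} \omega_k z_k^4$
$\log(K_i) = \sum_{i=0}^{N} \sum_{j=0}^{M} \sum_{k=0}^{M} \gamma_{i,j,k} z_{i,j,k} + B \sin(w) \sum_{j=0}^{M} \beta_j y_j^2$
$\log(K_i) = \sum_{i=0}^{N} \alpha_i x_i + A \exp(\omega) \sum_{j=0}^{M} \beta_j y_j^2 \; \ldots$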
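Mass-spring system for simulating protein-ligand interactions (basic physics)
Mass-spring system in simple harmonic motion
$V(x) = V(d_0) + V'(d_0)(x - d_0) + \frac{1}{2!} V''(d_0)(x - d_0)^2 + \frac{1}{3!} V'''(d_0)(x - d_0)^3 + \cdots$ (1)
At equilibrium $V'(d_0) = 0$, so the first-order term vanishes; for small displacements the higher-order terms are neglected:
$V(x) \approx V(d_0) + \frac{1}{2!} V''(d_0)(x - d_0)^2$ (2)
$d_0$ = equilibrium distance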
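To see how well the harmonic form of equation (2) tracks a real potential well, the following minimal sketch compares the two numerically. The Lennard-Jones potential, the value d0 = 3.8 and the function names are assumptions chosen only for illustration; they do not come from the slides.

```python
# Illustrative sketch: harmonic (mass-spring) approximation of a potential well.
# The Lennard-Jones form and every numeric value here are assumptions.

def lennard_jones(x, epsilon=1.0, d0=3.8):
    """Lennard-Jones potential whose minimum (-epsilon) sits at x = d0."""
    ratio = d0 / x
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def harmonic(x, d0=3.8):
    """Equation (2): V(x) ~ V(d0) + (1/2!) V''(d0) (x - d0)^2."""
    k = second_derivative(lennard_jones, d0)  # plays the role of the spring constant
    return lennard_jones(d0) + 0.5 * k * (x - d0) ** 2


if __name__ == "__main__":
    for x in (3.7, 3.8, 3.9, 4.2):
        print(f"x = {x:.1f}  V = {lennard_jones(x):+.4f}  harmonic = {harmonic(x):+.4f}")
```

Close to d0 the two curves nearly coincide; further away the neglected cubic and higher terms of equation (1) become visible.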
Mass-spring system for simulation (examples)
Homogeneous isotropic materials (Kot et al., 2015; Kot & Nagahashi, 2017)
Graphene sheet (Kim et al., 2014)
Electron tunneling in transistors (Pasupathy et al., 2005)
Kim, M.H. et al. (2014) Vibrational characteristics of graphene sheets elucidated using an elastic network model. Phys. Chem. Chem. Phys., 16, 15263–15271.
Kot, M. et al. (2015) Elastic moduli of simple mass spring models. Vis. Comput., 31(10), 1339–1350.
Kot, M. and Nagahashi, H. (2017) Mass spring models with adjustable Poisson's ratio. Vis. Comput., 33(3), 283–291.
Pasupathy, A.N. et al. (2005) Vibration-assisted electron tunneling in C140 transistors. Nano Lett., 5, 203–207.
Mass-spring system for simulating protein-ligand interactions (equilibrium distance)
Mass-spring system for simulating protein-ligand interactions (binding affinity)
$PBA = \alpha_0 + \sum_i \sum_j \alpha_{i,j} (d_{i,j} - d_{0,i,j})^2$ (3)
Mass-spring system for simulating protein-ligand interactions (machine learning)
$RSS = \sum_{i=1}^{M} (y_i - PBA_i)^2 + \lambda_1 \sum_{j=1}^{N} |\omega_j| + \lambda_2 \sum_{j=1}^{N} \omega_j^2$ (4)

Method                                                     λ1     λ2     Reference
Ordinary linear regression                                 0      0      [1]
Least absolute shrinkage and selection operator (Lasso)    > 0    0      [2]
Ridge                                                      0      > 0    [3]
Elastic Net                                                > 0    > 0    [4]

[1] Legendre, AM. Nouvelles méthodes pour la détermination des orbites des comètes, Courcier, Paris, 1805.
[2] Tibshirani, R. Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Series B Stat. Methodol. 1996, 58, 267–288. https://doi.org/10.1111/j.1467-9868.2011.00771.x.
[3] Tikhonov, AN. On the regularization of ill-posed problems, Dokl. Akad. Nauk SSSR. 1963, 153, 49–52.
[4] Zou, H., Hastie, T. Regularization and variable selection via the elastic net, J. R. Stat. Soc. Series B Stat. Methodol. 2005, 67, 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x.
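The penalized objective of equation (4) and the four rows of the method table can be sketched as a single function, where the choice of λ1 and λ2 selects the method. The function name, the toy numbers and the absolute value in the λ1 term (the usual lasso convention) are assumptions for illustration; regression libraries may scale the terms differently.

```python
# Illustrative sketch of the objective in equation (4); all values are hypothetical.

def penalized_rss(y_true, y_pred, weights, lambda1=0.0, lambda2=0.0):
    """Residual sum of squares plus L1 (lasso) and L2 (ridge) penalties."""
    rss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    l1_penalty = sum(abs(w) for w in weights)
    l2_penalty = sum(w * w for w in weights)
    return rss + lambda1 * l1_penalty + lambda2 * l2_penalty


if __name__ == "__main__":
    y_true = [1.0, 2.0]    # hypothetical experimental values
    y_pred = [1.5, 1.5]    # hypothetical predicted binding affinities
    weights = [1.0, -2.0]  # hypothetical regression weights

    methods = {
        "Ordinary linear regression": (0.0, 0.0),
        "Lasso": (0.1, 0.0),
        "Ridge": (0.0, 0.1),
        "Elastic Net": (0.1, 0.1),
    }
    for name, (lam1, lam2) in methods.items():
        print(name, penalized_rss(y_true, y_pred, weights, lam1, lam2))
```

Minimizing this quantity over the weights is what distinguishes the four methods; the sketch only evaluates it for fixed weights.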
Mass-spring system for simulating protein-ligand interactions (biological system)
[Figure: cell cycle phases (G1, S, G2, M) and the associated CDK/cyclin complexes: CDK4/Cyclin D, CDK6/Cyclin D, CDK2/Cyclin E, CDK2/Cyclin A, CDK1/Cyclin A, CDK1/Cyclin B]
Scoring function space
Protein space
Chemical space
Mass-spring system for simulating protein-ligand interactions (dataset)

PDB access code   Ligand identification   Ligand chain   Ligand number   Ki (nM)   Test set
1E1V              CMG                     A              401             12000     1
1E1X              NW1                     A              401             1300      0
1H1S              4SP                     A              1298            6         0
1JSV              U55                     A              400             2000      1
1OGU              ST8                     A              1298            2400      0
1PXM              CK5                     A              500             60        1
1PXN              CK6                     A              500             195       0
1PXO              CK7                     A              500             2         1
1PXP              CK8                     A              500             220       0
1PYE              PM1                     A              700             386       1
1V1K              3FP                     A              299             35000     1
2CLX              F18                     A              1299            13300     0
2EXM              ZIP                     A              400             78000     0
2FVD              LIA                     A              299             3         0
2XMY              CDK                     A              500             0.11      1
2XNB              Y8L                     A              1299            149       1
3BLR              CPB                     A              940             3         0
3DDQ              RRC                     A              299             250       0
3LFN              A27                     A              299             3160      0
3LFS              A07                     A              299             2500      1
3MY5              RFZ                     A              300             65000     0
4ACM              7YG                     A              1302            210       0
4BCK              T3E                     A              1298            4         0
4BCM              T7Z                     A              1297            123       0
4BCN              T9N                     A              1299            12        0
4BCO              T6Q                     A              1299            131       0
4BCP              T3C                     A              1299            568       0
4BCQ              TJF                     A              1296            147       0
4EOP              1RO                     A              301             890       0
4NJ3              2KD                     A              301             140       0
5D1J              56H                     A              4000            38        0
Mass-spring system for simulating protein-ligand interactions (model)
$PBA = \alpha_0 + \sum_i \sum_j \alpha_{i,j} (d_{i,j} - d_{0,i,j})^2$
The regression coefficients are the following:
α_0 = -6.581356;
α_{C,N} = -0.111232;
α_{C,O} = -0.406456;
α_{N,F} = -0.353717.
The equilibrium distances are the following:
d_{0,C,N} = 3.99463 Å;
d_{0,C,O} = 3.88190 Å;
d_{0,N,F} = 4.21672 Å.
Taba obtained these results using the elastic net with cross-validation (CV) as the regression method.
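As a worked example, the model above can be evaluated directly with the coefficients and equilibrium distances listed on this slide. The input format (one distance per atom-type pair), the function name and the "stretched" distances are assumptions for illustration only; this is a sketch and not Taba's actual code.

```python
# Coefficients and equilibrium distances as listed on the slide.
ALPHA_0 = -6.581356
ALPHA = {("C", "N"): -0.111232, ("C", "O"): -0.406456, ("N", "F"): -0.353717}
D0 = {("C", "N"): 3.99463, ("C", "O"): 3.88190, ("N", "F"): 4.21672}  # angstroms


def predict_binding_affinity(distances):
    """Evaluate PBA = alpha_0 + sum_ij alpha_ij * (d_ij - d0_ij)^2.

    `distances` maps an atom-type pair to one distance in angstroms.
    This input format is an assumption made for the sketch.
    """
    pba = ALPHA_0
    for pair, d in distances.items():
        pba += ALPHA[pair] * (d - D0[pair]) ** 2
    return pba


if __name__ == "__main__":
    at_equilibrium = dict(D0)
    stretched = {("C", "N"): 4.5, ("C", "O"): 3.5, ("N", "F"): 4.2}  # hypothetical
    print(predict_binding_affinity(at_equilibrium))
    print(predict_binding_affinity(stretched))
```

Because all three α_{i,j} are negative, any deviation from the equilibrium distances lowers PBA below α_0, which is its value when every spring sits at rest.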
Mass-spring system for simulating protein-ligand interactions (statistical analysis)

Scoring Functions                  ρ         p-value1   R       p-value2
Free Energy^a                      -0.133    0.7324     0.204   0.2227
Final Intermolecular Energy^a      0.133     0.7324     0.204   0.2228
vdW+Hbond+desolv Energy^a          0.133     0.7324     0.204   0.2228
Electrostatic Energy^a             0.533     0.1392     0.376   0.0789
Final Total Internal Energy^a      -0.133    0.7324     0.089   0.4365
Torsional Free Energy^a            0.068     0.8630     0.000   0.9792
Plants Score^b                     0.183     0.6368     0.001   0.9348
MolDock Score^b                    0.217     0.5755     0.010   0.7950
Rerank Score^b                     0.333     0.3807     0.007   0.8336
Interaction Score^b                0.367     0.3317     0.013   0.7698
Protein Score^b                    0.367     0.3317     0.025   0.6839
Water Score^b                      -0.569    0.1098     0.395   0.0699
Internal Score^b                   0.033     0.9322     0.001   0.9369
Electrostatic Score^b              0.548     0.1269     0.204   0.2218
Electrostatic Long Score^b         -0.548    0.1269     0.204   0.2218
H-Bond Score^b                     0.650     0.0581     0.512   0.0301
Ligand Efficiency 1 Score^b        0.150     0.7001     0.024   0.6935
Ligand Efficiency 3 Score^b        0.283     0.4600     0.023   0.6968
Affinity Score^c                   -0.067    0.8647     0.117   0.3669
Gauss1 Score^c                     -0.367    0.3317     0.120   0.3603
Gauss2 Score^c                     -0.283    0.4600     0.018   0.7297
Repulsion Score^c                  -0.700    0.0358     0.240   0.1804
Hydrophobic Score^c                0.100     0.7980     0.002   0.9157
Hydrogen Score^c                   -0.583    0.0992     0.340   0.0993
Taba (3 variables, d 4.5 Å)        0.783     0.01252    0.794   0.0107

Predictive performance of scoring functions (test set). ^a AutoDock 4, ^b Molegro Virtual Docker (MVD), ^c AutoDock Vina.
p-value1 and p-value2 are related to ρ and R, respectively.
Mass-spring system for simulating protein-ligand interactions (model)
Hernandes MZ, Cavalcanti SM, Moreira DR, de Azevedo Junior WF, Leite AC. Halogen atoms in the modern medicinal chemistry: hints for the drug design. Curr Drug Targets. 2010; 11(3):303–314.
SAnDReS
Statistical Analysis of Docking Results and Scoring Functions
sandres.net
Xavier MM, Heck GS, de Avila MB, Levin NM, Pintro VO, Carvalho NL, Azevedo WF Jr. SAnDReS a Computational Tool for Statistical Analysis of Docking Results and Development of Scoring Functions. Comb Chem High Throughput Screen. 2016; 19(10): 801–812.
Polynomial scoring function
$PBA = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_1 x_2 + \alpha_5 x_1 x_3 + \alpha_6 x_2 x_3 + \alpha_7 x_1^2 + \alpha_8 x_2^2 + \alpha_9 x_3^2$
where x1, x2, and x3 are energy terms taken from programs such as AutoDock 4, AutoDock Vina, and Molegro Virtual Docker.
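The polynomial above has one constant and nine variable terms built from three energy terms. A minimal sketch of how those terms could be generated and combined is shown here; the function names and the example numbers are hypothetical, and this is not SAnDReS code.

```python
from itertools import combinations


def polynomial_terms(x1, x2, x3):
    """Return the nine non-constant terms in the order used by alpha_1..alpha_9."""
    x = (x1, x2, x3)
    linear = list(x)                                # x1, x2, x3
    cross = [a * b for a, b in combinations(x, 2)]  # x1*x2, x1*x3, x2*x3
    squares = [v * v for v in x]                    # x1^2, x2^2, x3^2
    return linear + cross + squares


def polynomial_score(alphas, x1, x2, x3):
    """Evaluate PBA for ten coefficients alpha_0..alpha_9."""
    if len(alphas) != 10:
        raise ValueError("expected ten coefficients (alpha_0 to alpha_9)")
    terms = polynomial_terms(x1, x2, x3)
    return alphas[0] + sum(a * t for a, t in zip(alphas[1:], terms))


if __name__ == "__main__":
    # Hypothetical coefficients: only the constant and the x1*x2 term are active.
    alphas = [-5.0, 0, 0, 0, 0.001, 0, 0, 0, 0, 0]
    print(polynomial_score(alphas, -120.0, 15.0, -4.0))
```

Setting a coefficient to zero switches its term off, so each on/off pattern of the nine terms (2^9 = 512 patterns) can be read as a different candidate function of this family.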
Scoring Functions and Energy Terms: Description
MolDock Score: Protein-ligand scoring function. This scoring function is the sum of internal ligand energies, protein interaction energy, and soft penalties.
PLANTS Score: Protein-ligand scoring function.
Re-rank Score: Protein-ligand scoring function.
Energy Term 1 (Interaction): Interaction energy between the ligand and the target molecule(s).
Energy Term 2 (Cofactor): Interaction energy between the ligand and the co-factor.
Energy Term 3 (Protein): Interaction energy between the ligand and the protein.
Energy Term 4 (Water): Interaction energy between the ligand and the water molecules.
Energy Term 5 (Internal): Internal energy of the ligand.
Energy Term 6 (Electro): Short-range electrostatic protein-ligand interactions (r < 4.5 Å).
Energy Term 7 (ElectroLong): Long-range electrostatic protein-ligand interactions (r > 4.5 Å).
Energy Term 8 (HBond): Hydrogen bonding energy.
LE1 Score: Ligand Efficiency 1: MolDock Score divided by heavy atom count.
LE3 Score: Ligand Efficiency 3: Rerank Score divided by heavy atom count.
Docking Score: Score evaluated before post-processing (either PLANTS or MolDock). Only used for re-docking.
Displaced Water Score: Energy contributions from non-displaced and displaced water interactions.
AutoDock4 Scoring Function: This scoring function makes use of five energetic terms: the torsional term, the hydrogen bonding interactions, the electrostatic potential, the desolvation energy, and the van der Waals interactions.
AutoDock Vina Scoring Function: Vina makes use of the following energy terms: Gauss1, Gauss2, repulsion, hydrophobic, hydrogen bond, and torsion. They are defined elsewhere.
List of all scoring functions used in this study.
Datasets and PDB Access Codes
HRIC50: We used a dataset composed of an ensemble of high-resolution crystallographic structures solved to a resolution better than 1.5 Å, for which experimental half-maximal inhibitory concentration (IC50) data are available for the active ligands.
2GG3, 2GG7, 2GG9, 2HU6, 2I5F, 2IKG, 2NMZ, 2NNG, 2OW6, 2PDG, 2PIY, 2PZN, 2QCF, 2R3I, 2W14, 2W3B, 2W9H, 2WUU, 2WZX, 2X5O, 2XPC, 2XU3, 2XU4, 2Y1O, 2Y68, 2YC3, 2YEX, 2YJ2, 2YJ8, 2YJ9, 2YJC, 2YK9, 2YKE, 2YKJ, 3B28, 3B7E, 3B8Z, 3BCJ, 3BLB, 3CBP, 3DCR, 3DD0, 3DN5, 3EJS, 3EJT, 3EJU, 3ESS, 3EWZ, 3EX3, 3F66, 3FCI, 3FS6, 3GHV, 3GHW, 3H5B, 3HHA, 3HJ0, 3HNB, 3HS4, 3HYG, 3I06, 3I33, 3I6C, 3I6O, 3IOG, 3IU7, 3KFA, 3KIG, 3KKU, 3KL6, 3KWZ, 3L14, 3M0I, 3M4H, 3NKK, 3NTZ, 3NU0, 3NU3, 3NWB, 3NXO, 3NXX, 3NZB, 3OND, 3OT3, 3OVX, 3OZS, 3OZT, 3PA3, 3PKA, 3PKB, 3PX8, 3R6T, 3RL4, 3S1Y, 3S71, 3SPK, 3TEM, 3U2C, 3UHM, 3VF3, 3VHV, 3VW9, 3WFG, 3ZSJ, 3ZXH, 4A6V, 4A6W, 4BW1, 4DHR, 4DRI, 4DRN, 4DRO, 4DRQ, 4E4A, 4F3I, 4FH2, 4FLK, 4FYO, 4GCJ, 4GQR, 4GV1, 4HCT, 4HCU, 4HCV, 4HWW, 4HXQ, 4HXS, 4HY4, 4HYI, 4IGH, 4IKU, 4JHT, 4KEB, 4L7G, 4M5R
CDK2IC50: 1GII, 1OIR, 2B53, 2B54, 2R3H, 3IGG, 3LE6, 3PXZ, 3PY0, 3RZB, 4RJ3
List of PDB access codes used for both datasets.
de Ávila MB, Xavier MM, Pintro VO, de Azevedo WF. Supervised machine learning techniques to predict binding affinity. A study for cyclin-dependent kinase 2. Biochem Biophys Res Commun. 2017; 494: 305–310.
$\log(IC_{50}) = -5.763674 + 0.000069\,(x \cdot y) + 0.000185\,(x \cdot z) - 0.001090\,(y \cdot z) - 0.000040\,x^2$
where x is the interaction energy between the pose and the protein, y is the internal energy of the ligand (Internal), and z is the hydrogen bonding energy (HBond).
Scoring Functions and Energy Terms   ρ (training set)   p-value (training set)   ρ (test set)   p-value (test set)
PLANTS Score                         0.266              3.797×10^-03             0.167          2.185×10^-01
MolDock Score                        0.284              1.939×10^-03             0.224          9.678×10^-02
Rerank Score                         0.227              1.371×10^-02             0.109          4.219×10^-01
Term 1                               0.334              2.305×10^-04             0.215          1.109×10^-01
Term 2                               0.130              1.623×10^-01             0.211          1.192×10^-01
Term 3                               0.340              1.795×10^-04             0.147          2.810×10^-01
Term 4                               0.214              2.032×10^-02             0.083          5.455×10^-01
Term 5                               -0.077             4.104×10^-01             0.155          2.541×10^-01
Term 6                               -0.107             2.514×10^-01             -0.179         1.871×10^-01
Term 7                               0.134              1.511×10^-01             0.101          4.568×10^-01
Term 8                               -0.067             4.746×10^-01             -0.237         7.889×10^-02
Polscore0000060                      0.401              7.243×10^-06             0.328          1.363×10^-02

Results for training and test sets for HRIC50 dataset.
PDB Access Code   Active Ligand Code   Resolution (Å)   IC50 (nM)   log(IC50)   Predicted log(IC50)
1GII              1PU                  2.00             260         -6.585      -6.463
1OIR              HDY                  1.91             32          -7.495      -7.495
2B53              D23                  2.00             600         -6.222      -6.221
2B54              D05                  1.85             20          -7.699      -7.699
2R3H              SCE                  1.50             20000       -4.699      -3.839
3IGG              EFQ                  1.80             80.75       -7.093      -6.196
3LE6              2BZ                  2.00             35          -7.456      -6.277
3PXZ              JWS                  1.70             5900        -5.229      -5.277
3PY0              SU9                  1.75             79.25       -7.101      -6.678
3RZB              02Z                  1.90             100000      -4.000      -5.779
4RJ3              3QS                  1.63             93          -7.032      -6.544

Experimental and predicted log(IC50) for all structures in the CDK2IC50 dataset.
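The log(IC50) column of the CDK2IC50 table is consistent with taking the base-10 logarithm of IC50 expressed in mol/L. A minimal sketch of that conversion is shown here, checked against the table's own rows; the function name is hypothetical.

```python
from math import log10


def log_ic50(ic50_nm):
    """Convert an IC50 given in nM to log10 of its value in mol/L."""
    return log10(ic50_nm * 1e-9)


# (PDB code, IC50 in nM, log(IC50) as listed in the table)
TABLE_ROWS = [
    ("1GII", 260, -6.585), ("1OIR", 32, -7.495), ("2B53", 600, -6.222),
    ("2B54", 20, -7.699), ("2R3H", 20000, -4.699), ("3IGG", 80.75, -7.093),
    ("3LE6", 35, -7.456), ("3PXZ", 5900, -5.229), ("3PY0", 79.25, -7.101),
    ("3RZB", 100000, -4.000), ("4RJ3", 93, -7.032),
]

if __name__ == "__main__":
    for pdb, ic50_nm, listed in TABLE_ROWS:
        print(f"{pdb}: computed {log_ic50(ic50_nm):.3f}, listed {listed:.3f}")
```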
Scoring Functions and Energy Terms^a   ρ         p-value        R2      p-value
Affinity^b                             0.418     2.006×10^-01   0.237   1.289×10^-01
Gauss1^b                               -0.773    5.299×10^-03   0.393   3.889×10^-02
Gauss2^b                               -0.645    3.196×10^-02   0.386   4.125×10^-02
Repulsion^b                            -0.618    4.265×10^-02   0.276   9.715×10^-02
Hydrophobic^b                          -0.391    2.345×10^-01   0.223   1.424×10^-01
Hydrogen^b                             -0.730    1.069×10^-02   0.280   9.386×10^-02
Free Energy^c                          0.445     1.697×10^-01   0.082   3.923×10^-01
Final Intermolecular Energy^c          0.400     2.229×10^-01   0.082   3.923×10^-01
vdW+Hbond+desolv Energy^c              0.409     2.115×10^-01   0.082   3.923×10^-01
Electrostatic Energy^c                 -0.209    5.372×10^-01   0.082   3.922×10^-01
Final Total Internal Energy^c          0.588     5.725×10^-02   0.345   5.730×10^-02
Torsional Free Energy^c                -0.304    3.637×10^-01   0.106   3.298×10^-01
MolDock Score^d                        0.391     2.345×10^-01   0.173   2.028×10^-01
PLANTS Score^d                         0.682     2.084×10^-02   0.507   1.401×10^-02
Rerank Score^d                         -0.591    5.558×10^-02   0.768   4.044×10^-04
Ligand Efficiency 1^d                  -0.391    2.345×10^-01   0.183   1.888×10^-01
Ligand Efficiency 3^d                  -0.345    2.981×10^-01   0.294   8.516×10^-02
Polscore 60^e                          0.845     1.045×10^-03   0.608   4.650×10^-03

Statistical analysis of predictive power for all structures in the CDK2IC50 dataset.
^a Energy term and scoring function values were calculated using the crystallographic position for the ligands.
^b Scoring function and energy terms calculated using AutoDock Vina.
^c Scoring function and energy terms calculated using AutoDock 4.
^d Scoring function and energy terms calculated using MVD.
^e Machine learning model generated using SAnDReS.
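The ρ and R2 columns are Spearman's rank correlation and the squared Pearson correlation. A minimal standard-library sketch of both follows, applied to the experimental and predicted log(IC50) values listed for the CDK2IC50 dataset; the function names are hypothetical and p-values are not computed.

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def r_squared(x, y):
    """Squared Pearson correlation."""
    return pearson(x, y) ** 2


# Experimental and predicted log(IC50) from the CDK2IC50 table.
EXPERIMENTAL = [-6.585, -7.495, -6.222, -7.699, -4.699, -7.093,
                -7.456, -5.229, -7.101, -4.000, -7.032]
PREDICTED = [-6.463, -7.495, -6.221, -7.699, -3.839, -6.196,
             -6.277, -5.277, -6.678, -5.779, -6.544]

if __name__ == "__main__":
    print(f"rho = {spearman(EXPERIMENTAL, PREDICTED):.3f}")
    print(f"R2  = {r_squared(EXPERIMENTAL, PREDICTED):.3f}")
```

With these eleven pairs the sketch gives ρ of about 0.845, in line with the Polscore 60 row, and an R2 close to the tabulated 0.608.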
Conclusions
Taba and SAnDReS can generate binding-affinity prediction models that outperform classical scoring functions (Molegro Virtual Docker, AutoDock 4, AutoDock Vina).
The concept of the scoring function space is an elegant way to develop models targeted at biological systems of interest.
Future Work
SAnDReS 2.0
Release of the stable version of Taba
Application of both approaches to biological systems of interest