|
| 1 | +--- |
| 2 | +name: ld50_catmos |
| 3 | +description: |- |
| 4 | + Acute toxicity LD50 measures |
| 5 | + the most conservative dose that can lead to lethal adverse effects. |
| 6 | + The lower the LD50, the more acutely toxic the compound. |
| 7 | + Where a SMILES occurs more than once, we aggregated its values by computing the mean. |
| 8 | +targets: |
| 9 | + - id: CATMoS_LD50_mgkg |
| 10 | + description: Acute Toxicity LD50. |
| 11 | + units: mg/kg |
| 12 | + type: continuous |
| 13 | + names: |
| 14 | + - noun: acute oral toxicity rat LD50 |
| 15 | + - noun: acute oral toxicity (LD50 in rats) |
| 16 | + uris: |
| 17 | + - http://www.bioassayontology.org/bao#BAO_0002117 |
| 18 | + significant_digits: 1 |
| 19 | + - id: log10_LD50 |
| 20 | + description: Log10 of the acute toxicity LD50. |
| 21 | + units: log10(mg/kg) |
| 22 | + type: continuous |
| 23 | + names: |
| 24 | + - noun: log10 acute oral toxicity rat LD50 |
| 25 | + - noun: log10 acute oral toxicity (LD50 in rats) |
| 26 | + - noun: log10 LD50 in rats (oral exposure) |
| 27 | + - noun: log10 rat LD50 (oral exposure) |
| 28 | + significant_digits: 2 |
| 29 | + - id: num_ghose_violations |
| 30 | + description: Ghose filter violations |
| 31 | + type: ordinal |
| 32 | + significant_digits: 0 |
| 33 | + names: |
| 34 | + - noun: Ghose filter violations |
| 35 | + - noun: violations of the Ghose filter |
| 36 | + - id: num_lead_likeness_violations |
| 37 | + description: Lead likeness filter violations |
| 38 | + type: ordinal |
| 39 | + significant_digits: 0 |
| 40 | + names: |
| 41 | + - noun: lead likeness filter violations |
| 42 | + - noun: violations of the lead likeness filter |
| 43 | + - id: num_lipinski_violations |
| 44 | + description: Lipinski filter violations |
| 45 | + type: ordinal |
| 46 | + significant_digits: 0 |
| 47 | + names: |
| 48 | + - noun: Lipinski rule violations |
| 49 | + - noun: violations of the Lipinski rules |
| 50 | + - id: molecular_mass |
| 51 | + description: Molecular mass |
| 52 | + type: continuous |
| 53 | + units: g/mol |
| 54 | + names: |
| 55 | + - noun: molecular mass |
| 56 | + - noun: molecular weight |
| 57 | + - id: num_carbon_atoms |
| 58 | + description: Number of carbon atoms |
| 59 | + type: ordinal |
| 60 | + significant_digits: 0 |
| 61 | + names: |
| 62 | + - noun: carbon atoms |
| 63 | + - id: num_oxygen_atoms |
| 64 | + description: Number of oxygen atoms |
| 65 | + type: ordinal |
| 66 | + significant_digits: 0 |
| 67 | + names: |
| 68 | + - noun: oxygen atoms |
| 69 | +identifiers: |
| 70 | + - id: SMILES |
| 71 | + type: SMILES |
| 72 | + description: SMILES |
| 73 | +license: CC BY 4.0 |
| 74 | +links: |
| 75 | + - url: https://ehp.niehs.nih.gov/doi/full/10.1289/EHP8495#supplementary-materials |
| 76 | + description: corresponding publication |
| 77 | +num_points: 9032 |
| 78 | +bibtex: |
| 79 | + - |- |
| 80 | + @article{Mansouri_2021, title={CATMoS: Collaborative Acute Toxicity Modeling Suite}, |
| 81 | + volume={129}, |
| 82 | + ISSN={1552-9924}, |
| 83 | + url={http://dx.doi.org/10.1289/EHP8495}, |
| 84 | + DOI={10.1289/ehp8495}, |
| 85 | + number={4}, |
| 86 | + journal={Environmental Health Perspectives}, |
| 87 | + publisher={Environmental Health Perspectives}, |
| 88 | + author={Mansouri, Kamel and Karmaus, Agnes L. and Fitzpatrick, Jeremy |
| 89 | + and Patlewicz, Grace and Pradeep, Prachi and Alberga, Domenico and |
| 90 | + Alepee, Nathalie and Allen, Timothy E.H. and Allen, Dave and Alves, Vinicius M. |
| 91 | + and Andrade, Carolina H. and Auernhammer, Tyler R. and Ballabio, Davide and |
| 92 | + Bell, Shannon and Benfenati, Emilio and Bhattacharya, Sudin and |
| 93 | + Bastos, Joyce V. and Boyd, Stephen and Brown, J.B. and Capuzzi, Stephen J. and |
| 94 | + Chushak, Yaroslav and Ciallella, Heather and Clark, Alex M. and |
| 95 | + Consonni, Viviana and Daga, Pankaj R. and Ekins, Sean and Farag, Sherif and |
| 96 | + Fedorov, Maxim and Fourches, Denis and Gadaleta, Domenico and Gao, Feng and |
| 97 | + Gearhart, Jeffery M. and Goh, Garett and Goodman, Jonathan M. and |
| 98 | + Grisoni, Francesca and Grulke, Christopher M. and Hartung, Thomas and |
| 99 | + Hirn, Matthew and Karpov, Pavel and Korotcov, Alexandru and |
| 100 | + Lavado, Giovanna J. and Lawless, Michael and Li, Xinhao and |
| 101 | + Luechtefeld, Thomas and Lunghini, Filippo and Mangiatordi, Giuseppe F. and |
| 102 | + Marcou, Gilles and Marsh, Dan and Martin, Todd and Mauri, Andrea and |
| 103 | + Muratov, Eugene N. and Myatt, Glenn J. and Nguyen, Dac-Trung and |
| 104 | + Nicolotti, Orazio and Note, Reine and Pande, Paritosh and |
| 105 | + Parks, Amanda K. and Peryea, Tyler and Polash, Ahsan H. and |
| 106 | + Rallo, Robert and Roncaglioni, Alessandra and Rowlands, Craig and |
| 107 | + Ruiz, Patricia and Russo, Daniel P. and Sayed, Ahmed and Sayre, Risa and |
| 108 | + Sheils, Timothy and Siegel, Charles and Silva, Arthur C. and Simeonov, Anton and |
| 109 | + Sosnin, Sergey and Southall, Noel and Strickland, Judy and Tang, Yun and |
| 110 | + Teppen, Brian and Tetko, Igor V. and Thomas, Dennis and Tkachenko, Valery and |
| 111 | + Todeschini, Roberto and Toma, Cosimo and Tripodi, Ignacio and |
| 112 | + Trisciuzzi, Daniela and Tropsha, Alexander and Varnek, Alexandre and |
| 113 | + Vukovic, Kristijan and Wang, Zhongyu and Wang, Liguo and |
| 114 | + Waters, Katrina M. and Wedlake, Andrew J. and Wijeyesakere, Sanjeeva J. and |
| 115 | + Wilson, Dan and Xiao, Zijun and Yang, Hongbin and Zahoranszky-Kohalmi, Gergely and |
| 116 | + Zakharov, Alexey V. and Zhang, Fagen F. and Zhang, Zhen and Zhao, Tongan and |
| 117 | + Zhu, Hao and Zorn, Kimberley M. and Casey, Warren and Kleinstreuer, Nicole C.}, |
| 118 | + year={2021}, month=apr } |
| 119 | +templates: |
| 120 | + - The {#molecule|chemical|compound!} with the {SMILES__description} {#representation of |!}{SMILES#} {#shows|exhibits|displays!} an {CATMoS_LD50_mgkg__names__noun} of {CATMoS_LD50_mgkg#} {CATMoS_LD50_mgkg__units}. |
| 121 | + - The {#molecule|chemical|compound!} with the {SMILES__description} {#representation of |!}{SMILES#} {#shows|exhibits|displays!} a {log10_LD50__names__noun} of {log10_LD50#} {log10_LD50__units}. |
| 122 | + - | |
| 123 | + Task: Determine the acute oral toxicity and molecular properties of a {#molecule|chemical|compound!} given the {SMILES__description}. |
| 124 | + Input: {SMILES#} |
| 125 | + Desired Output: {CATMoS_LD50_mgkg__names__noun}, {log10_LD50__names__noun}, {num_ghose_violations__names__noun}, {num_lead_likeness_violations__names__noun}, {num_lipinski_violations__names__noun}, {molecular_mass__names__noun}, {num_carbon_atoms__names__noun}, {num_oxygen_atoms__names__noun} |
| 126 | + Output: {CATMoS_LD50_mgkg#} {CATMoS_LD50_mgkg__units}, {log10_LD50#} {log10_LD50__units}, {num_ghose_violations#}, {num_lead_likeness_violations#}, {num_lipinski_violations#}, {molecular_mass#} {molecular_mass__units}, {num_carbon_atoms#}, {num_oxygen_atoms#} |
| 127 | + - | |
| 128 | + Context: You are {#an assistant|a researcher|a scientist!} in a pharmaceutical company. Your {#boss|superior|department head!} has asked you to {#design|create|synthesize!} a new drug. |
| 129 | + User: The {#drug|compound|chemical!} should have an {CATMoS_LD50_mgkg__names__noun} of {CATMoS_LD50_mgkg#} {CATMoS_LD50_mgkg__units}, {num_ghose_violations#} {num_ghose_violations__names__noun}, {num_lead_likeness_violations#} {num_lead_likeness_violations__names__noun}, {num_lipinski_violations#} {num_lipinski_violations__names__noun}, a {molecular_mass__names__noun} of {molecular_mass#} {molecular_mass__units}, {num_carbon_atoms#} {num_carbon_atoms__names__noun}, and {num_oxygen_atoms#} {num_oxygen_atoms__names__noun}. |
| 130 | + Assistant: {#Happy to help!|Sure!|Of course!} The {#molecule|chemical|compound!} with the {SMILES__description} {#representation of |!}{SMILES#} {#shows|exhibits|displays!} the desired properties. |
| 131 | + - | |
| 132 | + User: I need a {#drug|compound|chemical!} with a {log10_LD50__names__noun} of {log10_LD50#} {log10_LD50__units}. |
| 133 | + Assistant: {#Happy to help!|Sure!|Of course!} Can you provide me with more {#constraints|details|information!}? |
| 134 | + User: The {#drug|compound|chemical!} should have {num_ghose_violations#} {num_ghose_violations__names__noun}, {num_lead_likeness_violations#} {num_lead_likeness_violations__names__noun}, {num_lipinski_violations#} {num_lipinski_violations__names__noun}, {num_carbon_atoms#} {num_carbon_atoms__names__noun}, and {num_oxygen_atoms#} {num_oxygen_atoms__names__noun}. |
| 135 | + Assistant: The {#molecule|chemical|compound!} with the {SMILES__description} {#representation of |!}{SMILES#} {#shows|exhibits|displays!} the desired properties. |
| 136 | + - | |
| 137 | + User: I need a {#drug|compound|chemical!} with an {CATMoS_LD50_mgkg__names__noun} of {CATMoS_LD50_mgkg#} {CATMoS_LD50_mgkg__units}. |
| 138 | + Assistant: {#Happy to help!|Sure!|Of course!} Can you provide me with more {#constraints|details|information!}? |
| 139 | + User: The {#drug|compound|chemical!} should have {num_carbon_atoms#} {num_carbon_atoms__names__noun}, {num_oxygen_atoms#} {num_oxygen_atoms__names__noun}, and a {molecular_mass__names__noun} of {molecular_mass#} {molecular_mass__units}. Could you please only provide me with the {SMILES__description} and return no other information? |
| 140 | + Assistant: {SMILES#} |
| 141 | + - | |
| 142 | + User: I am looking for a {#drug|compound|chemical!} with a {log10_LD50__names__noun} of {log10_LD50#} {log10_LD50__units}. |
| 143 | + Assistant: {#That's interesting!|Interesting!|I see!} Can you provide me with more {#constraints|details|information!}? |
| 144 | + User: The {#drug|compound|chemical!} should have {num_ghose_violations#} {num_ghose_violations__names__noun}, {num_lead_likeness_violations#} {num_lead_likeness_violations__names__noun}, {num_lipinski_violations#} {num_lipinski_violations__names__noun}, {num_carbon_atoms#} {num_carbon_atoms__names__noun}, and {num_oxygen_atoms#} {num_oxygen_atoms__names__noun}. Please return only the {SMILES__description} wrapped as follows [ANSWER]<SMILES>[/ANSWER]. |
| 145 | + Assistant: [ANSWER]{SMILES#}[/ANSWER] |
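
The description says that values were aggregated by computing the mean, and the targets include both `CATMoS_LD50_mgkg` and `log10_LD50`. The sketch below shows one way such an aggregation could look. It is an illustration only: the helper name `aggregate_ld50`, the input layout, and the choice to take the log10 of the mean (rather than the mean of the logs) are assumptions, and the numbers are made up rather than taken from the dataset.

```python
import math
from collections import defaultdict
from statistics import mean


def aggregate_ld50(records):
    """Average LD50 values (mg/kg) over rows that share a SMILES string.

    `records` is an iterable of (smiles, ld50_mgkg) pairs. This helper is
    hypothetical and not part of the original transform script.
    """
    grouped = defaultdict(list)
    for smiles, ld50_mgkg in records:
        grouped[smiles].append(ld50_mgkg)

    rows = {}
    for smiles, values in grouped.items():
        avg = mean(values)
        rows[smiles] = {
            "CATMoS_LD50_mgkg": avg,
            # Assumption: log10 is taken after averaging, not before.
            "log10_LD50": math.log10(avg),
        }
    return rows


# Made-up values for illustration only.
example = [("CCO", 1000.0), ("CCO", 3000.0), ("c1ccccc1", 100.0)]
result = aggregate_ld50(example)
print(result["CCO"])
```

If the original pipeline averages in log space instead, the two columns would no longer be related by a plain `log10`, so it is worth checking which convention the transform script actually uses.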
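
The templates use two kinds of placeholders: `{#a|b|c!}` groups that list alternative wordings, and `{name#}` slots that take a value from a data row. The following is a simplified sketch of how such a template might be rendered, useful for spotting grammar problems in the sampled text. The `render` function and both regular expressions are assumptions for illustration; the real pipeline's sampler may behave differently, and metadata placeholders such as `{name__units}` are not handled here.

```python
import random
import re

# "{#opt1|opt2|opt3!}": a group of alternative wordings.
CHOICE = re.compile(r"\{#([^{}]*)!\}")
# "{name#}": a slot filled with the value of column `name`.
VALUE = re.compile(r"\{(\w+)#\}")


def render(template, values, rng):
    """Pick one wording per choice group, then fill in the value slots."""
    text = CHOICE.sub(lambda m: rng.choice(m.group(1).split("|")), template)
    return VALUE.sub(lambda m: str(values[m.group(1)]), text)


template = (
    "The {#molecule|chemical|compound!} with the SMILES "
    "{#representation of |!}{SMILES#} {#shows|exhibits|displays!} "
    "a log10 LD50 of {log10_LD50#} log10(mg/kg)."
)

# Made-up row for illustration only.
row = {"SMILES": "CCO", "log10_LD50": 3.3}
rendered = render(template, row, random.Random(0))
print(rendered)
```

Rendering each template a few times with different seeds is a quick way to catch article mismatches such as "a acute ..." or "You are researcher" before the templates are used at scale.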