CosIng-Toxicity

This project contains source code for research into the automation of literature reviews using Python and NLTK. The CosIng-Toxicity case study in particular uses data about cosmetic ingredients to search for research into their toxicity.

This project was carried out in collaboration with the Kanazawa University Practical Pharmacology Laboratory. The goal was to verify an automated literature review process using natural language processing (NLP).

The data for this project comes from the following resources:

- Cosmetic ingredient database (CosIng): the Ingredients and Fragrance inventory, for an inclusive list of cosmetic ingredients
- PubChem PUG View web service, to collect data on each ingredient's therapeutic uses and toxicity
- PubMed E-utilities, to search research papers for adverse effects on skin related to each compound
- Natural Language Toolkit (NLTK), to process the acquired paper abstracts for relevance
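
As a rough illustration of the PubMed step, the sketch below builds an E-utilities `esearch` request URL for one compound. The endpoint and the `db`, `term`, `retmax`, and `retmode` parameters belong to the public E-utilities API. The function name `build_esearch_url`, the example compound, and the effect terms are assumptions made for this example and are not taken from the project's code. The sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(compound, effect_terms, retmax=20):
    """Build a PubMed esearch URL for a compound and a set of effect terms.

    Hypothetical helper: the query shape is an assumption, not the
    project's actual search string.
    """
    # Quote each phrase and OR the adverse-effect terms together.
    effects = " OR ".join(f'"{term}"' for term in effect_terms)
    query = f'"{compound}" AND ({effects})'
    params = {
        "db": "pubmed",      # search the PubMed database
        "term": query,       # the boolean search expression
        "retmax": retmax,    # maximum number of IDs to return
        "retmode": "json",   # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Example compound and terms chosen only for illustration.
url = build_esearch_url("limonene", ["skin irritation", "contact dermatitis"])
print(url)
```

The returned URL can be fetched with any HTTP client. The response lists PubMed IDs, which a follow-up `efetch` request can turn into abstracts.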

A presentation slideshow of this research, "A Natural Language Processing Approach to Reviewing Research Abstracts" by Robert Songer, is available on SlideShare.

Research literature reviews have largely moved online, and researchers must search through large quantities of digital documents to find work related to their academic pursuits. With recent developments in Natural Language Processing (NLP), computers can perform most of the searching and reduce the time it takes researchers to find the papers they need. In this report, we introduce three basic NLP techniques (tokenization, frequency distributions, and in-sentence collocations) for searching the written texts of research abstracts downloaded from an online database. Real examples written in the Python programming language are provided, along with a discussion of their efficacy in a project at Kanazawa University in which an online research database was searched for research related to the adverse effects of hundreds of pharmaceutical compounds.
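
The three techniques named above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: it uses the standard library (`re` and `collections.Counter`) as stand-ins for NLTK's tokenizers and `FreqDist`, so it runs without downloading NLTK data. The function names, the naive sentence splitter, and the sample abstract (including the made-up compound "examplol") are all assumptions for the example.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naively split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase the text and return word tokens (hyphenated words kept whole)."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def frequency_distribution(tokens):
    """Count how often each token occurs (stand-in for nltk.FreqDist)."""
    return Counter(tokens)


def in_sentence_collocations(text, term_a, term_b):
    """Return the sentences in which both terms occur together."""
    hits = []
    for sentence in split_sentences(text):
        words = set(tokenize(sentence))
        if term_a in words and term_b in words:
            hits.append(sentence)
    return hits


# Invented sample abstract; "examplol" is not a real compound.
abstract = (
    "Examplol is a fragrance ingredient. "
    "Topical examplol caused skin irritation in two patients. "
    "No irritation was seen with the vehicle alone."
)

tokens = tokenize(abstract)
freq = frequency_distribution(tokens)
matches = in_sentence_collocations(abstract, "examplol", "irritation")

print(freq.most_common(3))
print(matches)
```

Requiring both terms in the same sentence is stricter than counting them anywhere in the abstract: here the third sentence mentions irritation but not the compound, so it is not reported as a match.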

