A modular command-line interface and GUI for mapping and enriching ontologies using BioPortal and OLS APIs.
This tool provides a simple, user-friendly interface for ontology concept lookup and mapping across multiple biomedical ontologies. It supports both command-line interface (CLI) and graphical user interface (GUI) modes.
- Multi-ontology support: 24+ ontologies including MONDO, HP, NCIT, DOID, CHEBI, GO, SNOMEDCT, and more
- Dual API integration: BioPortal and OLS APIs with intelligent fallback
- Interactive search: Real-time concept lookup with user-friendly selection
- TTL file processing: Parse and enrich existing ontology files
- Batch processing: Handle multiple concepts efficiently
- GUI interface: User-friendly graphical interface for non-technical users
- Result comparison: Compare results from different ontology services
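The fallback behaviour mentioned in the feature list can be pictured roughly as follows. This is a minimal sketch under assumptions, not the tool's actual code: `lookup_with_fallback`, `fake_bioportal`, and `fake_ols` are hypothetical names, and the real strategy may be more elaborate than "first non-empty answer wins".

```python
from typing import Callable, Dict, List, Sequence

# A search function takes a query string and returns a list of result dicts.
SearchFn = Callable[[str], List[Dict[str, str]]]


def lookup_with_fallback(query: str, services: Sequence[SearchFn]) -> List[Dict[str, str]]:
    """Try each service in order and return the first non-empty result list."""
    for search in services:
        try:
            results = search(query)
        except Exception:
            # A failing service (network error, bad key) should not abort the lookup.
            continue
        if results:
            return results
    return []


# Stand-in services for illustration only (hypothetical, no network calls).
def fake_bioportal(query: str) -> List[Dict[str, str]]:
    raise ConnectionError("BioPortal unavailable")


def fake_ols(query: str) -> List[Dict[str, str]]:
    return [{"label": query, "source": "OLS"}]


print(lookup_with_fallback("breast cancer", [fake_bioportal, fake_ols]))
```

Passing the services as an ordered sequence keeps the preference order explicit and makes it straightforward to disable one of them.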
- Python 3.8 or higher
- pip package manager
# Clone the repository
git clone https://github.com/your-username/ontology-mapping-tool.git
cd ontology-mapping-tool
# Install dependencies
pip install -r requirements.txt
# Install the package
pip install -e .
- Copy the environment template:
cp .env.template .env
- Edit .env and add your API keys:
  - Get a BioPortal API key from: https://bioportal.bioontology.org/account
  - (Optional) Get a UMLS API key from: https://uts.nlm.nih.gov/uts/profile
python main.py --search "breast cancer"
python main.py --input ontology.ttl --output enriched_ontology.ttl
python main.py --interactive
python main.py --batch concepts.txt --output results.json
Launch the GUI:
python gui/launch_gui.py
Or use the demo interface:
python gui/demo_gui.py
The tool supports lookup across 24+ major biomedical ontologies:
- MONDO: Monarch Disease Ontology
- HP: Human Phenotype Ontology
- NCIT: National Cancer Institute Thesaurus
- DOID: Disease Ontology
- CHEBI: Chemical Entities of Biological Interest
- GO: Gene Ontology
- SNOMEDCT: Systematized Nomenclature of Medicine Clinical Terms
- ICD10CM, ICD11: International Classification of Diseases
- LOINC: Logical Observation Identifiers Names and Codes
- OMIM: Online Mendelian Inheritance in Man
- ORDO: Orphanet Rare Disease Ontology
- And many more...
The tool is organized into modular components:
ontology-mapping-tool/
├── cli/ # Command-line interface
│ ├── main.py # Main CLI entry point
│ └── interface.py # CLI interface logic
├── core/ # Core functionality
│ ├── parser.py # TTL file parsing
│ ├── lookup.py # Concept lookup orchestration
│ └── generator.py # Output generation
├── services/ # API services
│ ├── bioportal.py # BioPortal API client
│ ├── ols.py # OLS API client
│ └── comparator.py # Result comparison
├── config/ # Configuration
│ └── ontologies.py # Ontology definitions
├── utils/ # Utilities
│ ├── helpers.py # Helper functions
│ └── loading.py # Loading animations
└── gui/ # Graphical interface
  ├── launch_gui.py # GUI launcher
  ├── bioportal_gui.py # Main GUI application
  └── demo_gui.py # Demo interface
python main.py --search "diabetes mellitus"
python main.py --input disease_ontology.ttl --output enhanced_ontology.ttl
Create a file concepts.txt:
breast cancer
diabetes mellitus
hypertension
Then run:
python main.py --batch concepts.txt --output results.json
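As a rough illustration of what batch mode implies, the sketch below reads one concept per line and writes a JSON document keyed by concept. The function names, the `lookup` callable, and the output layout are assumptions made for illustration; the actual structure of results.json is not documented here.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def read_concepts(path: Path) -> List[str]:
    """Return the non-empty, whitespace-stripped lines of a concepts file."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def run_batch(concepts: List[str], lookup: Callable[[str], List[dict]]) -> Dict[str, List[dict]]:
    """Map each concept to whatever the (hypothetical) lookup callable returns."""
    return {concept: lookup(concept) for concept in concepts}


# Demonstration in a temporary directory with a placeholder lookup.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "concepts.txt"
    source.write_text("breast cancer\ndiabetes mellitus\n\nhypertension\n", encoding="utf-8")

    results = run_batch(read_concepts(source), lambda concept: [])

    target = Path(tmp) / "results.json"
    target.write_text(json.dumps(results, indent=2), encoding="utf-8")
    print(target.read_text(encoding="utf-8"))
```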
Main class for performing concept lookups across multiple ontologies.
Client for interacting with the BioPortal API.
Client for interacting with the OLS (Ontology Lookup Service) API.
Parser for TTL (Turtle) ontology files.
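To give a feel for what the two API clients described above talk to, here is a minimal sketch that only builds the search URLs, without sending any request. The endpoints and query parameters follow the public BioPortal and OLS search APIs as I understand them; the function names are hypothetical, and whether this tool uses exactly these endpoints or parameters is an assumption.

```python
from typing import List
from urllib.parse import urlencode

# Public search endpoints (assumed; the tool may configure them differently).
BIOPORTAL_SEARCH = "https://data.bioontology.org/search"
OLS_SEARCH = "https://www.ebi.ac.uk/ols4/api/search"


def bioportal_search_url(query: str, ontologies: List[str], api_key: str, max_results: int = 5) -> str:
    """Build a BioPortal search URL restricted to the given ontology acronyms."""
    params = {
        "q": query,
        "ontologies": ",".join(ontologies),
        "pagesize": max_results,
        "apikey": api_key,
    }
    return f"{BIOPORTAL_SEARCH}?{urlencode(params)}"


def ols_search_url(query: str, ontologies: List[str], max_results: int = 5) -> str:
    """Build an OLS search URL; OLS ontology identifiers are lowercase."""
    params = {
        "q": query,
        "ontology": ",".join(name.lower() for name in ontologies),
        "rows": max_results,
    }
    return f"{OLS_SEARCH}?{urlencode(params)}"


print(bioportal_search_url("breast cancer", ["MONDO", "NCIT"], "your_api_key_here"))
print(ols_search_url("breast cancer", ["MONDO", "NCIT"]))
```

Either URL could then be fetched with `requests.get(url, timeout=...)` and the JSON body inspected; that step is omitted here to keep the sketch free of network access.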
Edit config/ontologies.py to customize:
- Supported ontologies
- API endpoints
- Search strategies
- Result filtering
- BIOPORTAL_API_KEY: Your BioPortal API key (required)
- UMLS_API_KEY: Your UMLS API key (optional)
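The required/optional distinction between the two variables could be enforced along these lines. This is a sketch only: `load_api_keys` is a hypothetical helper, and how the tool actually reads its environment (for example, whether it loads .env itself) is not stated here.

```python
import os
from typing import Mapping, Optional, Tuple


def load_api_keys(env: Mapping[str, str] = os.environ) -> Tuple[str, Optional[str]]:
    """Return (bioportal_key, umls_key); raise if the required key is missing."""
    bioportal_key = env.get("BIOPORTAL_API_KEY", "").strip()
    if not bioportal_key:
        raise RuntimeError("BIOPORTAL_API_KEY is not set")
    # The UMLS key is optional: an empty or missing value becomes None.
    umls_key = env.get("UMLS_API_KEY", "").strip() or None
    return bioportal_key, umls_key


# Demonstration with an explicit mapping instead of the real environment.
print(load_api_keys({"BIOPORTAL_API_KEY": "your_api_key_here"}))
```

Accepting the environment as a parameter keeps the helper easy to test without touching the real process environment.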
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Run the test suite:
python -m pytest tests/
This project is licensed under the MIT License - see the LICENSE file for details.
For questions, issues, or contributions:
- Open an issue on GitHub
- Check the documentation in the docs/ directory
- Review the example scripts in examples/
- BioPortal team for providing the ontology API
- OLS team for the Ontology Lookup Service
- The broader biomedical ontology community
If you use this tool in your research, please cite:
[Your citation information here]
Note: This tool is designed to complement existing ontology mapping frameworks like SSSOM-py by providing a simple, user-friendly interface for concept lookup and initial mapping tasks.
- helpers.py: Common helper functions (text cleaning, deduplication, etc.)
- ontologies.py: Ontology definitions, mappings, and search strategies
- interface.py: Main CLI interface and argument parsing
- main.py: CLI entry point and error handling
# List available ontologies
python run_cli.py --list-ontologies
# Query a single term
python run_cli.py --single-word "fatigue" --ontologies "HP,NCIT"
# Process a TTL file
python run_cli.py ontology.ttl --output improved.ttl
# Batch processing with pre-selected choices
python run_cli.py ontology.ttl --batch-mode selections.json
- --ontologies: Specify ontologies to search (e.g., "HP,NCIT,MONDO")
- --max-results: Maximum results per search (default: 5)
- --disable-ols: Use only BioPortal
- --disable-bioportal: Use only OLS
- --comparison-only: Run comparison without generating output
- --terminal-only: Print results without creating files
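The options listed above map naturally onto argparse. The following is a minimal sketch of such a parser, assuming the flags behave as described; it is not the contents of cli/interface.py, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser covering only the options documented above."""
    parser = argparse.ArgumentParser(prog="run_cli.py")
    parser.add_argument("--ontologies", help='Comma-separated acronyms, e.g. "HP,NCIT,MONDO"')
    parser.add_argument("--max-results", type=int, default=5, help="Maximum results per search")
    parser.add_argument("--disable-ols", action="store_true", help="Use only BioPortal")
    parser.add_argument("--disable-bioportal", action="store_true", help="Use only OLS")
    parser.add_argument("--comparison-only", action="store_true", help="Compare without generating output")
    parser.add_argument("--terminal-only", action="store_true", help="Print results without creating files")
    return parser


# Parse an explicit argument list rather than sys.argv for demonstration.
args = build_parser().parse_args(["--ontologies", "HP,NCIT", "--max-results", "3", "--disable-ols"])
print(args.ontologies.split(","), args.max_results, args.disable_ols)
```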
Set your BioPortal API key:
export BIOPORTAL_API_KEY="your_api_key_here"
Or use the --api-key argument.
The tool supports 24+ ontologies including:
- MONDO: Monarch Disease Ontology
- HP: Human Phenotype Ontology
- DOID: Disease Ontology
- ORDO: Orphanet Rare Disease Ontology
- SNOMEDCT: SNOMED Clinical Terms
- ICD10/ICD11: International Classification of Diseases
- LOINC: Logical Observation Identifiers Names and Codes
- CPT: Current Procedural Terminology
- GO: Gene Ontology
- CHEBI: Chemical Entities of Biological Interest
- PRO: Protein Ontology
- UBERON: Anatomical structures
python run_cli.py --single-word "cancer" --ontologies "MONDO,HP,DOID,NCIT,ORDO"
python run_cli.py --single-word "headache" --ontologies "HP,SYMP,NCIT"
python run_cli.py --single-word "aspirin" --ontologies "CHEBI,RXNORM,NCIT"
ontology_mapping/
├── bioportal_cli.py # Original monolithic file (kept for reference)
├── run_cli.py # Convenient wrapper script
├── main.py # Main entry point
├── cli/
│ ├── __init__.py
│ ├── interface.py # CLI interface and argument parsing
│ └── main.py # CLI entry point
├── core/
│ ├── __init__.py
│ ├── parser.py # TTL file parsing
│ ├── lookup.py # Concept lookup orchestration
│ └── generator.py # Ontology generation
├── services/
│ ├── __init__.py
│ ├── bioportal.py # BioPortal API client
│ ├── ols.py # OLS API client
│ └── comparator.py # Result comparison
├── utils/
│ ├── __init__.py
│ ├── loading.py # Loading animations
│ └── helpers.py # Helper functions
└── config/
  ├── __init__.py
  └── ontologies.py # Ontology configurations
- rdflib: RDF graph processing
- requests: HTTP API calls
- argparse: Command-line argument parsing
- threading: Loading bar animations
- json: Configuration and batch processing
The original bioportal_cli.py (1185+ lines) has been split into focused modules:
- Services separated: BioPortal and OLS clients are now independent
- Core logic isolated: Parsing, lookup, and generation are distinct
- Configuration centralized: All ontology definitions in one place
- CLI decoupled: Interface separated from business logic
- Utilities extracted: Common functions in dedicated modules
This modular approach improves:
- Maintainability: Easier to update individual components
- Testability: Each module can be tested independently
- Reusability: Components can be imported and used elsewhere
- Readability: Smaller, focused files are easier to understand
The modular structure makes it easy to add:
- New ontology services
- Additional output formats
- Enhanced comparison algorithms
- Batch processing improvements
- Web interface components
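One way a new ontology service could plug into a modular layout like this is through a small shared interface. The sketch below is purely illustrative: `OntologyService`, `InMemoryService`, and `search_all` are hypothetical names, the identifiers are placeholders, and the project's real extension points may look different.

```python
from typing import Dict, List, Protocol


class OntologyService(Protocol):
    """Hypothetical interface that any lookup service could satisfy."""

    name: str

    def search(self, query: str, ontologies: List[str], max_results: int = 5) -> List[Dict[str, str]]:
        ...


class InMemoryService:
    """Toy service backed by a dict, standing in for a real API client."""

    name = "in-memory"

    def __init__(self, terms: Dict[str, str]) -> None:
        self._terms = terms  # label -> identifier

    def search(self, query: str, ontologies: List[str], max_results: int = 5) -> List[Dict[str, str]]:
        # Case-insensitive substring match; the ontology filter is ignored in this toy.
        hits = [
            {"label": label, "id": term_id, "source": self.name}
            for label, term_id in self._terms.items()
            if query.lower() in label.lower()
        ]
        return hits[:max_results]


def search_all(services: List[OntologyService], query: str, ontologies: List[str]) -> List[Dict[str, str]]:
    """Collect results from every service in the order given."""
    results: List[Dict[str, str]] = []
    for service in services:
        results.extend(service.search(query, ontologies))
    return results


# Placeholder identifiers, not real ontology terms.
service = InMemoryService({"Fatigue": "EX:0001", "Headache": "EX:0002"})
print(search_all([service], "fat", ["HP"]))
```

Because `Protocol` relies on structural typing, a new client only needs a `name` attribute and a matching `search` method; it does not have to inherit from a shared base class.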