Drug-Gene Interaction Platform
Database platform of 300K+ curated drug-gene interactions integrated from multiple sources
Overview
Built a drug-gene interaction platform by integrating data from four major databases, standardizing gene identifiers against HGNC and NCBI, and delivering a curated dataset for cancer research.
Key Features
- Multi-Database Integration: Aggregated data from 4 major drug-gene interaction databases
- Gene Identifier Standardization: Harmonized gene identifiers using HGNC and NCBI standards
- High-Confidence Curation: Delivered 300K+ high-confidence drug-gene interactions
- Research-Ready Dataset: Prepared data for downstream analysis in drug repurposing studies
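The integration and standardization steps listed above can be illustrated with a minimal sketch. It is not the project's actual pipeline: the alias table, the source labels, the sample rows, and the rule that "high confidence" means support from at least two sources are all assumptions made for illustration. It uses plain Python so it runs without dependencies.

```python
from collections import defaultdict

# Hypothetical alias table: alias or previous symbol -> approved HGNC symbol.
# A real table would be built from an HGNC download; these entries are illustrative.
ALIAS_TO_APPROVED = {
    "ERBB2": "ERBB2",
    "HER2": "ERBB2",
    "NEU": "ERBB2",
    "EGFR": "EGFR",
    "ERBB1": "EGFR",
}


def standardize_symbol(raw):
    """Return the approved symbol for a raw gene label, or None if unmapped."""
    return ALIAS_TO_APPROVED.get(raw.strip().upper())


def merge_interactions(records):
    """Merge (drug, gene, source) rows into {(drug, symbol): {sources}}.

    Rows whose gene label cannot be mapped are returned separately
    so they can be reviewed instead of silently dropped.
    """
    merged = defaultdict(set)
    unmapped = []
    for drug, gene, source in records:
        symbol = standardize_symbol(gene)
        if symbol is None:
            unmapped.append((drug, gene, source))
            continue
        merged[(drug.strip().lower(), symbol)].add(source)
    return dict(merged), unmapped


def high_confidence(merged, min_sources=2):
    """Keep pairs reported by at least min_sources sources (assumed criterion)."""
    return {pair: srcs for pair, srcs in merged.items() if len(srcs) >= min_sources}


# Toy rows with placeholder source labels, not real database exports.
records = [
    ("Trastuzumab", "HER2", "source_a"),
    ("trastuzumab", "ERBB2", "source_b"),
    ("Gefitinib", "erbb1", "source_a"),
    ("DrugX", "NOTAGENE", "source_b"),
]

merged, unmapped = merge_interactions(records)
confident = high_confidence(merged)
```

In this sketch, the two trastuzumab rows collapse into a single ERBB2 interaction backed by both sources, the gefitinib row is kept but falls below the assumed two-source threshold, and the unmappable row is set aside in `unmapped`.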
Technologies Used
- Languages: Python, R
- Data Processing: Pandas, NumPy
- Databases: HGNC, NCBI Gene, DrugBank, DGIdb
- Tools: Shell scripting, Linux
Data Sources
- Drug-Gene Interaction Database (DGIdb)
- DrugBank
- HGNC (HUGO Gene Nomenclature Committee)
- NCBI Gene
Impact
This curated dataset supports ongoing research in cancer drug repurposing and combination therapy discovery at Huntsman Cancer Institute.