Drug-Gene Interaction Platform

Comprehensive database platform with 300K+ curated drug-gene interactions from multiple sources

Overview

Built a drug-gene interaction platform by integrating data from 4 major databases, standardizing gene identifiers to HGNC and NCBI standards, and delivering a curated dataset of 300K+ high-confidence interactions for cancer research.

Key Features

  • Multi-Database Integration: Aggregated data from 4 major drug-gene interaction databases
  • Gene Identifier Standardization: Harmonized gene identifiers using HGNC and NCBI standards
  • High-Confidence Curation: Delivered 300K+ high-confidence drug-gene interactions
  • Research-Ready Dataset: Prepared data for downstream analysis in drug repurposing studies
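
The standardization and aggregation steps listed above can be illustrated with a minimal sketch. It is not the platform's actual code: the alias table, the helper names (standardize_symbol, merge_interactions, filter_by_support), the toy records, and the "at least two supporting sources" rule are all assumptions made for illustration. In practice the alias table would be built from an HGNC download, and the confidence criteria may differ.

```python
from collections import defaultdict

# Hypothetical alias table mapping alias symbols (upper-cased) to approved
# HGNC symbols. Illustrative entries only, not the platform's real mapping.
ALIAS_TO_APPROVED = {
    "HER2": "ERBB2",
    "NEU": "ERBB2",
    "P53": "TP53",
}


def standardize_symbol(symbol):
    """Return the approved symbol for a known alias, else the trimmed input."""
    cleaned = symbol.strip()
    # Look up case-insensitively, but leave unknown symbols untouched so that
    # mixed-case approved symbols (e.g. C1orf112) are not altered.
    return ALIAS_TO_APPROVED.get(cleaned.upper(), cleaned)


def merge_interactions(records):
    """Merge (drug, gene, source) tuples into {(drug, gene): {sources}}."""
    merged = defaultdict(set)
    for drug, gene, source in records:
        key = (drug.strip().lower(), standardize_symbol(gene))
        merged[key].add(source)
    return dict(merged)


def filter_by_support(merged, min_sources=2):
    """Keep interactions reported by at least min_sources sources (assumed rule)."""
    return {pair: srcs for pair, srcs in merged.items() if len(srcs) >= min_sources}


# Toy records for demonstration; these rows are not taken from the dataset.
records = [
    ("Trastuzumab", "HER2", "DGIdb"),
    ("trastuzumab", "ERBB2", "DrugBank"),
    ("Imatinib", "ABL1", "DGIdb"),
]

merged = merge_interactions(records)
supported = filter_by_support(merged, min_sources=2)
print(supported)
```

Here the two trastuzumab rows collapse into a single ERBB2 interaction once the HER2 alias is resolved, which is why identifier harmonization has to happen before deduplication across sources.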

Technologies Used

  • Languages: Python, R
  • Data Processing: Pandas, NumPy
  • Databases: HGNC, NCBI Gene, DrugBank, DGIdb
  • Tools: Shell scripting, Linux

Data Sources

  • Drug-Gene Interaction Database (DGIdb)
  • DrugBank
  • HGNC (HUGO Gene Nomenclature Committee)
  • NCBI Gene

Impact

This curated dataset supports ongoing research in cancer drug repurposing and combination therapy discovery at Huntsman Cancer Institute.