Curing Advanced Data Ailments Using Data Virtualization to Aid Worldwide War on Cancer

Denodo Technologies
Company Size
1,000+
Region
  • America
Country
  • United States
Product
  • Data Virtualization Platform
Tech Stack
  • XML
  • Oracle
  • MySQL DB
  • FTP
  • CSV
Implementation Scale
  • Enterprise-wide Deployment
Impact Metrics
  • Cost Savings
  • Productivity Improvements
Technology Category
  • Application Infrastructure & Middleware - Data Exchange & Integration
  • Platform as a Service (PaaS) - Data Management Platforms
Applicable Industries
  • Healthcare & Hospitals
  • Life Sciences
Applicable Functions
  • Product Research & Development
  • Quality Assurance
Use Cases
  • Predictive Quality Analytics
Services
  • Data Science Services
  • System Integration
About The Customer
The National Institutes of Health (NIH) is the nation’s medical research agency and a component of the U.S. Department of Health and Human Services. It includes 27 Institutes and Centers and is the primary federal agency conducting and supporting basic, clinical, and translational medical research. NIH investigates the causes of, and treatments and cures for, both common and rare diseases. Two of its institutes, the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), joined forces on a project known as The Cancer Genome Atlas (TCGA). The TCGA mission is to catalog the genetic mutations responsible for cancer using genome sequencing and bioinformatics.
The Challenge
The National Institutes of Health (NIH) faced significant obstacles in reliably and efficiently moving large volumes of cancer genome data from The Cancer Genome Atlas (TCGA) to the International Cancer Genome Consortium (ICGC). The process involved transforming the TCGA data to meet ICGC format requirements and then periodically uploading it to the ICGC servers. The transformation was initially done with Perl scripts, but this approach did not scale, was costly, and was error-prone: limited connectivity to the data sources led to redundant copies of data, slower processing, and a greater chance of errors.
The Solution
The NIH used data virtualization to connect to the different sources of the genome data, apply transformations, produce the final data sets, and periodically upload these data sets to the ICGC servers. The connectors within the data virtualization platform provided a normalized view of the patient and donor data stored in XML files, the sample test results in Oracle, and the TCGA-ICGC mapping data in a MySQL database. The transformation process had three steps: aggregating the patient and test data, converting this data into the ICGC format using the mapping information, and creating the final output files in CSV format. Lastly, the scheduler within the data virtualization platform ran an FTP job once every quarter to upload the files to the ICGC servers.
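The three transformation steps described above can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the NIH workflow or the data virtualization platform's API: the field names (`donor_id`, `gender`, `sample_type`), the target codes in `mapping`, and the function names are all hypothetical, and small in-memory lists stand in for the XML, Oracle, and MySQL sources.

```python
import csv
import io

# Hypothetical in-memory stand-ins for the three sources.
patients = [  # in the case study, patient and donor data lives in XML files
    {"donor_id": "D1", "gender": "FEMALE"},
    {"donor_id": "D2", "gender": "MALE"},
]
samples = [  # in the case study, sample test results live in Oracle
    {"donor_id": "D1", "sample_id": "S1", "sample_type": "TUMOR"},
    {"donor_id": "D1", "sample_id": "S2", "sample_type": "NORMAL"},
    {"donor_id": "D2", "sample_id": "S3", "sample_type": "TUMOR"},
]
# In the case study, mapping data lives in MySQL.
# Keys are (field, source value); values are made-up target codes.
mapping = {
    ("gender", "FEMALE"): "2",
    ("gender", "MALE"): "1",
    ("sample_type", "TUMOR"): "T",
    ("sample_type", "NORMAL"): "N",
}


def aggregate(patients, samples):
    """Step 1: join each sample row with its patient record."""
    by_donor = {p["donor_id"]: p for p in patients}
    return [{**by_donor[s["donor_id"]], **s} for s in samples]


def convert(rows, mapping):
    """Step 2: translate source values to target codes via the mapping.

    Values with no mapping entry are passed through unchanged.
    """
    return [
        {field: mapping.get((field, value), value) for field, value in row.items()}
        for row in rows
    ]


def to_csv(rows):
    """Step 3: serialize the converted rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


output = to_csv(convert(aggregate(patients, samples), mapping))
print(output)
```

Keeping each step as a separate function mirrors the idea of a replicable, generic workflow: the mapping can change without touching the join or the CSV writer. The quarterly FTP upload is left out of the sketch because it depends on server details the case study does not give.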
Operational Impact
  • Increased scalability: Larger genome data sets can be included thanks to replicable, generic workflows and the platform's advanced performance capabilities.
  • Increased efficiency: TCGA-to-ICGC transformation processes are faster to develop and modify because of the platform's diverse connectivity and publishing capabilities.
  • Increased accuracy: Minimizing replication and manual intervention ensures that the most current versions of data and processes are used to create the output files, which improves the accuracy of the final data.
