Pareto Task Inference Analysis on Multi-Omic Data

Objectives:

  • Analyze multi-omic data using the Pareto Task Inference method
  • Characterize archetypes of health states and reveal the trade-offs in the human wellness space
  • Find biomarkers for early detection of transitions from health to disease

Why do some people develop a disease, respond to a specific treatment, or suffer severe side effects while others don’t? Questions like these are hard to answer. Explaining such differences requires a holistic approach that studies the body’s physiology from a systems-level perspective. Longitudinal multi-omic measurements, combined with genetics and collected on a large population, can serve this purpose and help in predicting, explaining, and preventing disease.

We have developed infrastructure to collect longitudinal Personalized Dense Dynamic Data clouds (PD3 clouds) on thousands of individuals, which include genetics and longitudinal measurements of clinical labs, microbiome, metabolome, proteome, and self-reported data.  

The value of these extremely high-dimensional data clouds is clear; however, their dimensionality also poses a challenge for data analysis and interpretation.

One way to reduce data dimensionality is Pareto Task Inference (PARTI, Hart et al. 2015). We used this method to analyze the clinical labs and found that the data fall within a statistically significant tetrahedron. Its four vertices are archetypes, each specializing in a certain task. Using all other data types, we identified the traits enriched near each archetype and revealed the underlying trade-offs that shape the data.
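To make the tetrahedron geometry concrete, here is a minimal sketch, assuming the data have already been reduced to three dimensions and the four archetype positions are known. It expresses a sample as weights on the four archetypes (barycentric coordinates): a weight near 1 means the sample sits close to that archetype, and intermediate weights reflect a trade-off between tasks. The function names and coordinates are hypothetical, chosen for illustration only. This is not the PARTI software and not the project's results.

```python
# Illustrative sketch only: names and coordinates are hypothetical.

def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def archetype_weights(point, archetypes):
    """Barycentric weights of a 3-D point relative to four archetypes.

    The weights always sum to 1. All four are non-negative exactly when
    the point lies inside (or on the boundary of) the tetrahedron.
    """
    a0, a1, a2, a3 = archetypes
    # Edge vectors from the last archetype to the other three.
    edges = [[a[k] - a3[k] for k in range(3)] for a in (a0, a1, a2)]
    rhs = [point[k] - a3[k] for k in range(3)]
    # Matrix whose columns are the edge vectors: T @ w = rhs.
    T = [[edges[c][r] for c in range(3)] for r in range(3)]
    d = det3(T)
    if abs(d) < 1e-12:
        raise ValueError("archetypes are coplanar; no tetrahedron")
    # Cramer's rule: replace column j with rhs to solve for weight j.
    weights = []
    for j in range(3):
        Tj = [[rhs[r] if c == j else T[r][c] for c in range(3)]
              for r in range(3)]
        weights.append(det3(Tj) / d)
    weights.append(1.0 - sum(weights))  # weight on the last archetype
    return weights


# Made-up archetype positions in a 3-D reduced space.
ARCHETYPES = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

center = archetype_weights((0.25, 0.25, 0.25), ARCHETYPES)
outside = archetype_weights((1.0, 1.0, 1.0), ARCHETYPES)
print(center)             # equal weights: a balanced "generalist" sample
print(min(outside) < 0)   # a negative weight flags a point outside the tetrahedron
```

In a real analysis the archetype positions would be fitted to the data and tested for significance rather than supplied by hand; the sketch covers only how a sample relates to archetypes once they are known.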

This analysis uncovers unexpected relationships between datasets such as metabolomics, proteomics, and clinical labs, and helps connect the different data types to characterize different states of human health.

Current Project Leads:

Anat Zimmer, Nathan Price