Practical Education in Bioinformatics

Pine education-platform


Page 1: Pine education-platform

Practical Education in Bioinformatics

Page 2: Pine education-platform

Only 4 hours per week to do Omics Analysis

5 hours per week for lunch

Typical Lab Student Day: Data Generation, Biological Interpretation

How can a lab infer meaning from experimental data?

The learning curve is steep, and mistakes waste valuable, very limited researcher time. Many research teams have the capacity and budget to do sequencing but do not use the opportunity because the resulting data cannot be used.

Who will analyze?

Page 3: Pine education-platform

Each sample has to be assigned to a “workflow” or a “pipeline” to be analyzed by a series of algorithms. Most pipelines include error correction, annotation through querying existing databases and statistical analysis of data quality. To compare multiple samples, intimate knowledge of experiment design and goals is needed, therefore the research team itself is the best candidate to perform the analysis.
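The idea of a pipeline as a series of algorithms, each with its own input and output type, can be sketched roughly as follows. This is a minimal illustration, not how any of the platforms named here are implemented. The names `Step`, `validate`, `run_pipeline`, and `TOY_DB`, and the three toy steps, are hypothetical and chosen only to mirror the stages listed above (error correction, annotation against a database, quality statistics).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One algorithm in a pipeline (hypothetical structure for illustration)."""
    name: str
    consumes: str  # label of the data type this step expects
    produces: str  # label of the data type this step returns
    run: Callable[[object], object]


def validate(steps: List[Step]) -> None:
    """Check that each step's output type matches the next step's input type."""
    for prev, nxt in zip(steps, steps[1:]):
        if prev.produces != nxt.consumes:
            raise ValueError(
                f"{prev.name} produces {prev.produces!r} "
                f"but {nxt.name} expects {nxt.consumes!r}"
            )


def run_pipeline(steps: List[Step], data: object) -> object:
    """Validate the chain, then pass the data through every step in order."""
    validate(steps)
    for step in steps:
        data = step.run(data)
    return data


# Toy stand-in for an existing annotation database (assumed values).
TOY_DB = {"ATGC": "geneA", "GGCC": "geneB"}


def correct_errors(reads):
    # Crude "error correction": drop reads containing an ambiguous base.
    return [r for r in reads if "N" not in r]


def annotate(reads):
    # Annotation by querying the toy database.
    return [TOY_DB.get(r, "unknown") for r in reads]


def quality_stats(labels):
    # Simple quality summary: how many reads received an annotation.
    annotated = sum(label != "unknown" for label in labels)
    return {"total": len(labels), "annotated": annotated}


PIPELINE = [
    Step("error_correction", "reads", "reads", correct_errors),
    Step("annotation", "reads", "labels", annotate),
    Step("quality_stats", "labels", "stats", quality_stats),
]

if __name__ == "__main__":
    print(run_pipeline(PIPELINE, ["ATGC", "ANGC", "GGCC", "TTTT"]))
```

Swapping two steps so that their types no longer line up makes `validate` raise a `ValueError` before any data is processed, which is the kind of mismatch that makes hand-built pipelines hard for newcomers.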

Even processed data is difficult to interpret because the data sets are large and the relationships within them are complex. Machine learning algorithms and visualization tools need to be available.

Seven Bridges: Expanded view of input complexity

NCBI GALAXY tools: Expanded view of input complexity

Complexity hinders usability (Even if it’s “visual”)

Open Source (Galaxy) Commercial (Seven Bridges)

Challenge: Pipelines of Algorithms with Different Inputs and Outputs

Page 4: Pine education-platform

Integration of analysis types

T-BioInfo Data Analysis Platform

One environment for all types of data and analysis

T-BioInfo is a cloud-based or locally hosted suite of Big Data analysis tools. By hiding complicated mathematical algorithms behind a user-friendly interface, T-BioInfo enables faster and easier analysis, integration, and visualization of different types of big data. To meet the need for analysis of heterogeneous, multi-source datasets, T-BioInfo combines many data types, as well as many industry-standard and novel algorithms, into flexible, interactive, visual pipelines.

The platform is designed to eliminate dependency on bioinformaticians and streamline the way big data is collected, analyzed and interpreted.

Intuitive Interface

“One-button” approach to most areas of analysis

Page 5: Pine education-platform

Project-based learning: Learn to use, not just “know”

• “wet” biology

• experiment design

• sequencing technology

• data analysis

• data mining

• interpretation

Page 6: Pine education-platform

College Level: Public Domain Projects. The goal of the T-Bioinfo learning environment is to support collaborative learning in molecular biology and bioinformatics in the first years of university.

Secondary Level: Simulation of Bio Process

Using the platform’s simulator, we can generate RNA-seq data that contains errors, as real RNA-seq data often does. The objective is for the user to reproduce the original model. In this way, one can gain a conceptual understanding of a bioinformatician’s typical work and methodology.
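A toy version of this exercise might look like the sketch below. It is not the T-BioInfo simulator. It assumes the "model" is simply a short reference sequence, that errors are random base substitutions, and that each read's start position is known (real reads would first have to be aligned). The function names `simulate_reads` and `consensus` are hypothetical.

```python
import random
from collections import Counter


def simulate_reads(reference, n_reads, read_len, error_rate, seed=0):
    """Sample reads from a reference and inject random substitution errors.

    Returns a list of (start_position, read_sequence) tuples.
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(reference) - read_len + 1)
        bases = list(reference[start:start + read_len])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Replace the base with a different one to mimic a read error.
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((start, "".join(bases)))
    return reads


def consensus(reads, length):
    """Rebuild the sequence by majority vote at each covered position.

    Positions with no coverage are reported as 'N'.
    """
    counts = [Counter() for _ in range(length)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            counts[start + offset][base] += 1
    return "".join(c.most_common(1)[0][0] if c else "N" for c in counts)


if __name__ == "__main__":
    reference = "ATGGCGTACGTTAGCCTAGGCTAACGTAGC"  # made-up sequence
    reads = simulate_reads(reference, n_reads=200, read_len=10, error_rate=0.05)
    rebuilt = consensus(reads, len(reference))
    matches = sum(a == b for a, b in zip(reference, rebuilt))
    print(f"{matches}/{len(reference)} positions recovered")
```

The learner's task corresponds to `consensus`: given only the noisy reads, recover the sequence the simulator started from, then compare the result with the known model.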

Vision for Practical Education Using The T-BioInfo Platform in the Classroom

- The T-Bioinfo learning environment will include the known biological model, and the data will be generated by the molecular biology process simulator

http://www.nature.com/news/2009/091125/full/462408a.html

Objective: learner reconstructs the simulated process

Why this problem is important and how it relates to genomics (for example, there is a gene responsible for this disease)

Problem: identify genes causing ASD

Solution: T-Bioinformatics Platform that allows analysis of data and identification of the gene causing ASD

Page 7: Pine education-platform

Grosmannia clavigera was grown under different conditions, and samples collected at different time points were sequenced in order to identify the enzymatic pathways involved in monoterpene detoxification and in the use of monoterpenes as an energy source.

SAMPLES

Example Project: Grosmannia clavigera (RNA-Seq and Factor Analysis)

Sample Collection (Experiment Design) Logic of the RNA-Seq Analysis

Biological Interpretation Data Mining: Factor Analysis

Page 8: Pine education-platform

• Virtual Narrative connects multiple areas of analysis

• Real Scientific Public Domain Projects, invitation to start your own

• Interpretation of analysis leads to creating new narratives

• Ability to be a part of an ongoing bioinformatic “conversation”

• Assumes no knowledge of programming

• Explains computational ideas alongside visual tools

• Assumes few computational prerequisites

• Introduces computational ideas within a project instead of in the abstract

Situated learning: learning happens within an authentic activity, context, and culture, through participation in a community of practice