Bah! Humbug! The embedded movies do not work. Gak. This slide show was NOT presented during the FOAM meeting as the PC was being used to futz with the new Cloudman instance so I could use it for the demo.
1
Bioinformatic Alchemy 101
Transmuting dark script
matter into reusable tools
Ross Lazarus
BakerIDI
2
Context: bioinformatic analyses
Big data; complex analyses
Repeatable, automated pipelines
Reproducibility real goal
Reproducibility is hard
3
Frameworks
Eg VGL
Local SOPs for biologists
Tools, canned workflows
Minimise opportunities for error
Maximise reproducibility
4
In real life
90/10 rule
Need to tweak SOPs
Trivial 'disposable' scripts
Not documented or curated
Not reliably available to re-run
“Dark script matter”
5
Dark Script Matter
Outside usual VCS/pipelines
Manual =/= reproducible
Necessary evil?
Platform extensions complex
Eg Galaxy – hours of work
6
Plan
Context: Reproducible analyses
Frameworks vs Dark Scripts
Alchemy: script to Galaxy tool
Demonstration
Summary
Conclusions
7
Galaxy Tool Factory
An installable Galaxy tool
Runs scripts: Python,R,Perl,sh
Generates new Galaxy tools
Tool code wraps the script
Minutes – not hours
8
Galaxy Tool Shed
Separate server
Stores/serves Galaxy tools
Admin can install to Galaxy
Mercurial VCS archives
Explicit tool versioning
Sharing and reproducibility
Demo 1: Install the Tool Factory
Demo 2: Create a new tool
Demo 3: Quick install and test
14
Prepare script
Python; R; Perl; Sh
Parse command-line params: 1 = input file, 2 = output file
Typically workflow transformations
Arbitrary complexity
Simple example
Write transpose of a tabular file
15
Prepare/upload test data
SMALL sample input
Becomes functional test case
h1 h2 h3 h4
r11 r12 r13 r14
r21 r22 r23 r24
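The sample above can be produced with a few lines of code. This is a minimal sketch, assuming the cells are tab-separated (the transpose script on the next slide reads with sep='\t'); the helper names and the file name `sample_input.tabular` are hypothetical and do not come from the Tool Factory.

```python
# Sketch only: write the three-row sample as a tab-separated file.
# SAMPLE_ROWS mirrors the slide; every name here is illustrative.
SAMPLE_ROWS = [
    ["h1", "h2", "h3", "h4"],
    ["r11", "r12", "r13", "r14"],
    ["r21", "r22", "r23", "r24"],
]


def sample_text(rows=SAMPLE_ROWS):
    """Return the rows as tab-separated text, one line per row."""
    return "".join("\t".join(row) + "\n" for row in rows)


def write_sample(path, rows=SAMPLE_ROWS):
    """Write the sample to path and return the path."""
    with open(path, "w") as handle:
        handle.write(sample_text(rows))
    return path
```

Calling `write_sample("sample_input.tabular")` gives a small file that can be uploaded to the Galaxy history as the test input.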
16
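# R: transpose a tabular input file and write the result as
# a tabular output file
# Command-line parameters: 1 = input path, 2 = output path
ourargs = commandArgs(TRUE)
inf = ourargs[1]
outf = ourargs[2]
# Read every row as data (no header row), tab-separated
inp = read.table(inf, header=FALSE, row.names=NULL, sep='\t')
outp = t(inp)
# Write tab-separated output without quotes or row/column names
write.table(outp, outf, quote=FALSE, sep="\t", row.names=FALSE, col.names=FALSE)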
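Since the Tool Factory also runs Python scripts, the same transformation might look like the sketch below. It follows the same convention (first parameter is the input path, second is the output path). The function names are illustrative, and the sketch assumes a rectangular table: zip() would silently truncate ragged rows.

```python
# Sketch only: a Python equivalent of the transpose script.
import sys


def transpose_rows(rows):
    """Return the transpose of a rectangular list of rows."""
    return [list(column) for column in zip(*rows)]


def transpose_file(in_path, out_path):
    """Read a tab-separated file, transpose it, write tab-separated output."""
    with open(in_path) as src:
        rows = [line.rstrip("\n").split("\t") for line in src]
    with open(out_path, "w") as dst:
        for row in transpose_rows(rows):
            dst.write("\t".join(row) + "\n")


# Positional parameters: 1 = input file, 2 = output file.
if __name__ == "__main__" and len(sys.argv) == 3:
    transpose_file(sys.argv[1], sys.argv[2])
```

Cells are split on tabs only, so quotes and other characters pass through unchanged.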
18
Use Redo button; Generate
Once the script runs correctly
Use Redo to save retyping
Select Generate option
Provide tool ID, help text
Execute
Expect a toolfactory.gz in history
Copy link (floppy disk icon)
19
What's in the toolshed.gz?
A gzip'd Mercurial tool repository (!)
Auto generated tool XML file
Auto generated tool Python wrapper
Functional test case - the sample data
Familiar Galaxy tool for all users
Executes your script over their data
Interoperably inside Galaxy
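One way to check the contents before uploading is to list the archive members. The sketch below assumes the download is a gzipped tar archive, which the slides do not state explicitly. The file names in the stand-in archive are made up for illustration, and grouping by file extension is a guess at how to tell the pieces apart.

```python
# Sketch only: list and group the members of a gzipped tar archive.
# Assumption: the generated archive is a .tar.gz; the names below are hypothetical.
import io
import tarfile


def list_members(fileobj):
    """Return member names from a gzipped tar opened as a binary file object."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        return archive.getnames()


def classify(names):
    """Group member names by role, judged from the file extension only."""
    groups = {"tool_xml": [], "wrapper": [], "other": []}
    for name in names:
        if name.endswith(".xml"):
            groups["tool_xml"].append(name)
        elif name.endswith(".py"):
            groups["wrapper"].append(name)
        else:
            groups["other"].append(name)
    return groups


def demo_archive():
    """Build a stand-in archive in memory; all file names are hypothetical."""
    buffer = io.BytesIO()
    names = (
        "transpose/transpose.xml",
        "transpose/transpose.py",
        "transpose/test-data/sample_input.tabular",
    )
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name in names:
            payload = b"placeholder\n"
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer
```

For a real download, open the file in binary mode and pass the handle to `list_members`.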
20
Upload TS gzip to new repository
Upload to any tool shed
Create new repo; sensible name!
Choose Upload files to new repo
Paste URL (floppy disk save icon)
New tool ready to install
21
Install and Test New Tool
Back to Galaxy admin interface
Browse local tool shed
Choose new tool
Install to local Galaxy
Try it out
Run functional test
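In outline, a functional test reruns the tool on the stored sample input and compares the result with the stored expected output. The sketch below shows that idea in plain Python. It is not Galaxy's test runner: the transpose stand-in and all names are illustrative assumptions.

```python
# Sketch only: the idea behind a functional test, outside Galaxy.
import difflib


def transpose_text(text):
    """Stand-in tool: transpose tab-separated text (rows become columns)."""
    rows = [line.split("\t") for line in text.splitlines()]
    return "".join("\t".join(column) + "\n" for column in zip(*rows))


def functional_test(tool, sample_input, expected_output):
    """Run tool on the sample; return diff lines (an empty list means pass)."""
    actual = tool(sample_input)
    return list(difflib.unified_diff(
        expected_output.splitlines(),
        actual.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    ))


# Sample input from the test data slide and its transpose.
SAMPLE = "h1\th2\th3\th4\nr11\tr12\tr13\tr14\nr21\tr22\tr23\tr24\n"
EXPECTED = "h1\tr11\tr21\nh2\tr12\tr22\nh3\tr13\tr23\nh4\tr14\tr24\n"
```

An empty list from `functional_test(transpose_text, SAMPLE, EXPECTED)` means the output matches. Any returned lines show where the output differs.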
22
Summary
GTF = script to tool in minutes
Integrated with Galaxy and TS
Simple workflow components
If needed, generate simple tool
Then add parameters manually
23
Tool Factory Operation Guide
Script (Python, R, Perl, sh)
Galaxy Tool Factory
Upload/paste sample input for functional test
Tool form; paste script
Test run; check outputs; rerun/fix
Generate TS gzip; copy download link for pasting
Tool Shed
Create new repository
Upload files: paste TS gzip link and upload
Install new tool from tool shed via Galaxy admin page
Test; functional test
24
GALAXY
http://usegalaxy.org
25
Galaxy Tool Factory
Generate a new Galaxy tool
From a Python, R, Perl or bash script
# transpose a tabular input file and write as a tabular output file
ourargs = commandArgs(TRUE)
inf = ourargs[1]
outf = ourargs[2]
inp = read.table(inf, header=FALSE, row.names=NULL, sep='\t')
outp = t(inp)
write.table(outp, outf, quote=FALSE, sep="\t", row.names=FALSE, col.names=FALSE)
Using a Galaxy tool
Via a Tool Shed