Toolfactory foam mar21_2013

DESCRIPTION

Bah! Humbug! The embedded movies do not work. Gak. This slide show was NOT presented during the FOAM meeting as the PC was being used to futz with the new Cloudman instance so I could use it for the demo.

Page 1: Toolfactory foam mar21_2013

Bioinformatic Alchemy 101

Transmuting dark script matter into reusable tools

Ross Lazarus

BakerIDI

Page 2: Toolfactory foam mar21_2013

Context: bioinformatic analyses

Big data; complex analyses

Repeatable, automated pipelines

Reproducibility real goal

Reproducibility is hard

Page 3: Toolfactory foam mar21_2013

Frameworks

Eg VGL

Local SOPs for biologists

Tools, canned workflows

Minimise opportunities for error

Maximise reproducibility

Page 4: Toolfactory foam mar21_2013

In real life

90/10 rule

Need to tweak SOPs

Trivial 'disposable' scripts

Not documented or curated

Not reliably available to re-run

“Dark script matter”

Page 5: Toolfactory foam mar21_2013

Dark Script Matter

Outside usual VCS/pipelines

Manual =/= reproducible

Necessary evil?

Platform extensions complex

Eg Galaxy – hours of work

Page 6: Toolfactory foam mar21_2013

Plan

Context: Reproducible analyses

Frameworks vs Dark Scripts

Alchemy: script to Galaxy tool

Demonstration

Summary

Conclusions

Page 7: Toolfactory foam mar21_2013

Galaxy Tool Factory

An installable Galaxy tool

Runs scripts: Python, R, Perl, sh

Generates new Galaxy tools

Tool code wraps the script

Minutes – not hours
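
To make "tool code wraps the script" concrete, here is a minimal sketch of what such a wrapper might do: save the pasted script to a temporary file and run it with the matching interpreter, passing the input and output paths as the two command-line parameters. The `INTERPRETERS` table and the `run_script` function are hypothetical names for illustration only; the wrapper the Tool Factory actually generates may work differently.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical mapping from script type to interpreter command (an assumption,
# not taken from the Tool Factory source).
INTERPRETERS = {
    "python": [sys.executable],
    "r": ["Rscript"],
    "perl": ["perl"],
    "sh": ["sh"],
}


def run_script(script_type, script_text, infile, outfile):
    """Run the pasted script as: <interpreter> <script> <infile> <outfile>.

    Returns the exit code of the script; stderr is echoed on failure.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script_path = Path(workdir) / "pasted_script"
        script_path.write_text(script_text)
        cmd = INTERPRETERS[script_type] + [str(script_path), str(infile), str(outfile)]
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            sys.stderr.write(result.stderr)
        return result.returncode
```

A non-zero return value signals that the script failed, which is the cue to fix the script and rerun before generating a tool from it.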

Page 8: Toolfactory foam mar21_2013

Galaxy Tool Shed

Separate server

Stores/serves Galaxy tools

Admin can install to Galaxy

Mercurial VCS archives

Explicit tool versioning

Sharing and reproducibility

Page 9: Toolfactory foam mar21_2013

Demo 1: Install the Tool Factory

Page 10: Toolfactory foam mar21_2013

Demo 2: Create a new tool

Page 11: Toolfactory foam mar21_2013

Demo 3: Quick install and test

Page 12: Toolfactory foam mar21_2013

Prepare script

Python; R; Perl; Sh

Parse command-line parameters: 1 = input, 2 = output

Typically workflow transformations

Arbitrary complexity

Simple example

Write transpose of a tabular file

Page 13: Toolfactory foam mar21_2013

Prepare/upload test data

SMALL sample input

Becomes functional test case

h1 h2 h3 h4

r11 r12 r13 r14

r21 r22 r23 r24
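
As an illustration of how this small sample becomes a functional test case, the sketch below writes the sample input together with the output a transpose tool should produce from it. It assumes the columns are tab-separated (the slide shows them with spaces); the file names and the `write_test_case` helper are hypothetical.

```python
from pathlib import Path

# The sample table from the slide, one list per row.
SAMPLE_ROWS = [
    ["h1", "h2", "h3", "h4"],
    ["r11", "r12", "r13", "r14"],
    ["r21", "r22", "r23", "r24"],
]


def to_tabular(rows):
    """Join each row with tabs and end every line with a newline."""
    return "".join("\t".join(row) + "\n" for row in rows)


def write_test_case(directory):
    """Write the sample input and the expected transposed output."""
    directory = Path(directory)
    input_path = directory / "sample_input.tabular"        # hypothetical name
    expected_path = directory / "expected_output.tabular"  # hypothetical name
    input_path.write_text(to_tabular(SAMPLE_ROWS))
    # zip(*rows) regroups the table by column, i.e. transposes it.
    expected_path.write_text(to_tabular(zip(*SAMPLE_ROWS)))
    return input_path, expected_path
```

For this sample the expected output has four rows of three fields each, beginning with `h1 r11 r21`.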

Page 14: Toolfactory foam mar21_2013

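# R: transpose a tabular (tab-separated) input file and write the result
# as a tabular output file.
# Command-line parameter 1 is the input path, parameter 2 the output path.
ourargs = commandArgs(TRUE)
inf = ourargs[1]
outf = ourargs[2]
# Read every cell as text, with quote and comment handling switched off,
# so that values are written back unchanged.
inp = read.table(inf, header=FALSE, sep="\t", quote="", comment.char="", colClasses="character")
outp = t(inp)
write.table(outp, outf, quote=FALSE, sep="\t", row.names=FALSE, col.names=FALSE)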
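
Since the Tool Factory also accepts Python scripts, the same transposition could be written roughly as follows. This is a sketch rather than a script from the talk: the function name `transpose_tabular` is made up, and it assumes every row has the same number of tab-separated fields.

```python
import sys


def transpose_tabular(in_path, out_path):
    """Read a tab-separated file, swap rows and columns, and write it out."""
    with open(in_path) as handle:
        lines = [line.rstrip("\r\n") for line in handle]
    rows = [line.split("\t") for line in lines if line]
    with open(out_path, "w") as handle:
        # zip(*rows) yields one tuple per input column; with ragged input it
        # would silently stop at the shortest row.
        for column in zip(*rows):
            handle.write("\t".join(column) + "\n")


# Only run when called as: script.py <input> <output>
if __name__ == "__main__" and len(sys.argv) == 3:
    transpose_tabular(sys.argv[1], sys.argv[2])
```

As in the R version, parameter 1 is the input path and parameter 2 the output path.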

Page 15: Toolfactory foam mar21_2013

Demo part 1

As an admin, test run the code

Page 16: Toolfactory foam mar21_2013

Use Redo button; Generate

When working right

Use Redo to save retyping

Select Generate option

Provide tool ID, help text

Execute

Expect a toolfactory.gz in history

Copy link (floppy disk icon)

Page 17: Toolfactory foam mar21_2013

What's in the toolshed.gz?

A gzip'd Mercurial tool repository (!)

Auto-generated tool XML file

Auto-generated tool Python wrapper

Functional test case - the sample data

Familiar Galaxy tool for all users

Executes your script over their data

Interoperably inside Galaxy
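
To check what a generated archive contains before uploading it, something like the sketch below could be used. It assumes the archive is a gzip-compressed tar file, which the slides do not state explicitly, and it builds a small stand-in archive in memory with made-up file names, since the real layout is not shown here.

```python
import io
import tarfile


def list_archive(fileobj):
    """Return the sorted names of regular files in a gzip-compressed tar."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


def build_demo_archive():
    """Build an in-memory stand-in archive; all file names are hypothetical."""
    files = {
        "transpose/transpose.xml": b"<tool/>\n",
        "transpose/transpose.py": b"# generated wrapper would go here\n",
        "transpose/test-data/sample_input.tabular": b"h1\th2\n",
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


for name in list_archive(build_demo_archive()):
    print(name)
```

For a real archive downloaded from the history, `list_archive(open(path, "rb"))` would list the tool XML, the wrapper and the test data in the same way, provided the tar assumption holds.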

Page 18: Toolfactory foam mar21_2013

Upload Tool Shed (TS) gzip to new repository

Upload to any tool shed

Create new repo; sensible name!

Choose Upload files to new repo

Paste URL (floppy disk save icon)

New tool ready to install

Page 19: Toolfactory foam mar21_2013

Install and Test New Tool

Back to Galaxy admin interface

Browse local tool shed

Choose new tool

Install to local Galaxy

Try it out

Run functional test
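
The idea behind the functional test is simply to run the tool on the sample input and compare what it writes with the expected output. Galaxy has its own test machinery for this; the sketch below only illustrates the comparison step, using a hypothetical `compare_outputs` helper built on the standard `difflib` module.

```python
import difflib


def compare_outputs(expected_text, actual_text):
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            expected_text.splitlines(),
            actual_text.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


expected = "h1\tr11\tr21\nh2\tr12\tr22\n"
actual = "h1\tr11\tr21\nh2\tr12\tr22\n"
differences = compare_outputs(expected, actual)
print("PASS" if not differences else "\n".join(differences))
```

When the two texts differ, the returned lines show which rows changed, which makes a failing test easier to diagnose than a bare pass/fail flag.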

Page 20: Toolfactory foam mar21_2013

Summary

Galaxy Tool Factory (GTF) = script to tool in minutes

Integrated with Galaxy and the Tool Shed (TS)

Simple workflow components

If needed, generate simple tool

Then add parameters manually

Page 21: Toolfactory foam mar21_2013

Tool Factory Operation Guide

Script (Python, R, Perl, sh)

Galaxy Tool Factory tool form: paste script; generate TS gzip; copy download link for pasting

Upload/paste sample input for functional test

Test run; check outputs; rerun/fix

Tool Shed: create new repository; upload files (paste TS gzip link and upload)

Install new tool from Tool Shed from Galaxy admin page; test; functional test

Page 22: Toolfactory foam mar21_2013

GALAXY

http://usegalaxy.org

Page 23: Toolfactory foam mar21_2013

Generate a new Galaxy tool

Galaxy Tool Factory

From a Python, R, Perl or bash script

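# R: transpose a tabular (tab-separated) input file and write the result
# as a tabular output file.
# Command-line parameter 1 is the input path, parameter 2 the output path.
ourargs = commandArgs(TRUE)
inf = ourargs[1]
outf = ourargs[2]
# Read every cell as text, with quote and comment handling switched off,
# so that values are written back unchanged.
inp = read.table(inf, header=FALSE, sep="\t", quote="", comment.char="", colClasses="character")
outp = t(inp)
write.table(outp, outf, quote=FALSE, sep="\t", row.names=FALSE, col.names=FALSE)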

Using a Galaxy tool

Via a Tool Shed
