74
http://www.pdbj.org/ Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October 16, 2006 Integrated Web Service for Database Queries and Computations Grid Web Services for Analog Queries to Protein 3D Structures

Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Embed Size (px)

Citation preview

Page 1: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

http://www.pdbj.org/

Haruki Nakamura

Institute for Protein Research,

Lab. Protein Informatics, Osaka University

PRAGMA 11 Workshop, Osaka Univ., on October 16, 2006

Integrated Web Service for Database Queries and Computations

Grid Web Services for Analog Queries to Protein 3D Structures

Page 2: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

BioGrid at Osaka http://www.biogrid.jp

Started from 2002Goals:Grid Technology

Development for: Biotechnology (drug discovery) and

Medical Sciences.Leader: Shinji Shimojo

(CMC, Osaka Univ.)Government Support (M

EXT): 5yearsH. NakamuraK. YamaguchiT. Takada

H. Matsuda

T. Akiyama

S. ShimojoS. Date

Integration of Data Grid and Computing Grid

to access Integrated and Standardized Databases with High-performance Computers, from anywhere and with low cost.

Page 3: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Less Conceputal(Formal)

Highly Conceptual

Fine

Scale

Conceptual Level

Large

Literature

3D-3D-CordinatesCordinates

Function

Sequence

Molecular InteractionNetwork

Atom

Molecule

Organelle

Cell

Tissue

Organ

Organism

MassAnalysis

Medical

DrugData

SugarChain

Swiss-Prot

MEDLINE

KEGG

OMIM

Gene Ontology

PDB

InterPro, Pfam

MDDR

DIPGeneExpression

ChemicalStructure

PubChem, Ligand, ChEBI

DDBJ/EMBL/GenBank

H-Inv, FANTOM

GEO

Databases in Life Sciencesnearly 1,000 DBs

Page 4: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

BioDataGrid by Hideo Matsuda at Osaka Univ.

Page 5: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

BioGrid at Osaka http://www.biogrid.jp

Started from 2002Goals:Grid Technology

Development for: Biotechnology (drug discovery) and

Medical Sciences.Leader: Shinji Shimojo

(CMC, Osaka Univ.)Government Support (M

EXT): 5yearsH. NakamuraK. YamaguchiT. Takada

H. Matsuda

T. Akiyama

S. ShimojoS. Date

Integration of Data Grid and Computing Grid

to access Integrated and Standardized Databases with High-performance Computers, from anywhere and with low cost.

Page 6: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

BioPfuga: Biosimulation Platform United on Grid Architecture A platform where individual application progra

ms at the different levels are united to execute a hybrid computation. In particular, BioPfuga is a platform for biosimulation.

http://www.biogrid.jp/

Nakamura et al. (2004) New Generation Computing, 22, 157-166.

Page 7: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Construction of BioPfuga

・ Propose and Design Communication between program components by XML description  (BMSML: BioMolecular Simulation-ML) , and Creation of the APIs to handle it.

・ Divide programs into a set of many components

Different program modules are then implemented as the service programs based on the OGSA (Open Grid Service Architecture) mechanism.

Page 8: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

http://www.biogrid.jp/

Page 9: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Construction of BioPfuga

・ Propose and Design Communication between program components by XML description  (BMSML: BioMolecular Simulation-ML) , and Creation of the APIs to handle it.

・ Divide programs into a set of many components

Different program modules are then implemented as the service programs based on the OGSA (Open Grid Service Architecture) mechanism.

Page 10: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Quantum mechanics (QM) and Molecular mechanics (MM) simulation coupled

on BioPfuga

Coupled simulation on Grids

QM area

Active site ofan enzyme

Ligand

MM area

Htotal = HQM(x,x)

+ HQM/MM(x,y)

+ HMM(y,y)

Hamiltonian of total system

Calculate in QM area

Calculate in MM area

(Yonezawa et al. at SC2004)

Page 11: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

developed by NEC Quantum Chemistry Group

Rapid computation for huge molecular systems (20,000 basis for RHF, 1,000 basis for CASSCF and MP2), with high parallel performance

Simulation of Electronic structure (1) AMOSS Ab initio Molecular Orbital Simulation for Supercomputer

developed by Yoshifumi Fukunishi at JBIRC-AIST and Haruki Nakamura at Institute for Protein Research, Osaka Univ.

Simulation of Protein-solvent structureprestoX-basic Protein Engineering Simulator eXtended –basic versin-

Simulation of Electronic structure (2) GSO-X generalized spin density function theory (DFT)

developed by Shusuke Yamanaka & Kizashi Yamaguchi, at Osaka Univ.

856 atoms, 8672 AO’s

PKC-C1B

Page 12: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

A hybrid-QM/MM with BioPfuga on the Grid   Service@SC2004

Client controller

BMSML

User@USA, Pittsburgh

@TokyoIT

Site-C

QM(DFT)

GSO-XService

GT3

GSO-X(DFT)

@CMC Osaka

MPI50CPU

Site-B

AMOSSService

GT3

AMOSS(HF)

QM(HF)

MPI30CPU

Viewer portal

Site-A

MM(MD)

prestoXService

GT3

prestoX(MPI)

MD-Grape2

@IPR Osaka

BMSML

BMSML

8CPU

8-boards

(Special purpose computing board: max 50GFLOPS)

Page 13: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Configuration of a computing system on a Grid environment with several different program

modules.

Portal/Client Program

Service QM(HF)

ServiceQM

(DFT)

ServiceMM

User

Communicate using SOAP

Distributed computing System

Personal user

SOAP SOAP

Page 14: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

BioGrid at Osaka http://www.biogrid.jp

Started from 2002Goals:Grid Technology

Development for: Biotechnology (drug discovery) and

Medical Sciences.Leader: Shinji Shimojo

(CMC, Osaka Univ.)Government Support (M

EXT): 5yearsH. NakamuraK. YamaguchiT. Takada

H. Matsuda

T. Akiyama

S. ShimojoS. Date

Integration of Data Grid and Computing Grid

to access Integrated and Standardized Databases with High-performance Computers, from anywhere and with low cost.

Our final goal is Integration of

Data Grid and Computing Grid.

Page 15: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Portal/Client Program

Service A

ServiceC

ServiceB

User

GT GT GT

Communicate using SOAP

Distributed Computing System

Personal user

SOAP

site-A site-B site-C

GRID computing services

H. Nakamura, BioGrid2005 (March 9, 2005)

Integration of computing systems and Database query systems on a Grid

Page 16: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Portal/Client Program

Service A

ServiceC

ServiceB

User

GT GT GT

Communicate using SOAP

Distributed Computingand DB System

Personal user

SOAP

site-A site-B site-C

GRID computing services Database query services

H. Nakamura, BioGrid2005 (March 9, 2005)

Integration of computing systems and Database query systems on a Grid

Page 17: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Figure 1. A Schematic for the Proposed Portal: The portal is envisioned to be a modular web services architecture achieved by using an implementation of the Simple Object Access Protocol (SOAP http://www.w3.org/TR/soap/), that allows for seamless data exchange between the portal and all registered contributors. SOAP is a light weight protocol for exchange of information in a decentralized, distributed environment. It is an XML-based protocol, which defines a framework for representing remote procedure calls and responses. The XML wrappers basically map the individual contributing metadata format into an XML format that the data portal understands (Drinkwater et al., 2004). These services will be platform and language independent allowing other services (other portals or clients) to communicate with the data portal (see http://www.e-science.clrc.ac.uk/web/projects/dataportal/)

Proposed Portal for Multiple Databases for Protein StructuresBerman, H. et al. (2006) Structure, 14, 1211-1217.

Theoretical Model DB 1

Theoretical Model DB 2

Theoretical Model DB 3

Page 18: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

PDB (Protein Data Bank): Protein Tertiary Structure Database

Atom species and coordinates, amino acid residues, any other experimental results with raw data, experimental methods, and the conditions.

X-ray crystallography, NMR, and Electron microscope experiments.

Protein Tertiary Structure

Page 19: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

http://www.wwpdb.org/

Kim Henrick, Helen Berman, Haruki Nakamura

Page 20: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

International collaboration in wwPDB

(Berman, Henrick & Nakamura (2003) Nat. Struct. Biol. 10, 980)

RutgersUniv.UCS

DNIST

PDBjEBI

RCSB

1) Curation, data processing, and registration are made by all the members, collaborating with each other.

2) We have a single data archive, which islooked after by one “archive keeper (RCSB)”.

3) Data format and new descriptions will be discussedamong the members.

4) Members are encouraged to develop their ownbrowsers, viewers, and other APIs and services.

Page 21: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Protein Data Bank Japan

http://www.pdbj.org/

At Institute for Protein Research, Osaka Univ. since 2001 assisted from the Institute for Bioinformatics Research and Development, Japan Science and Technology Agency (BIRD-JST).

Page 22: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Processed data numbers at PDBj

We process more than 30 % deposited data of the entire world, mainly from Asian and Oceania regions

0

1000

2000

3000

4000

5000

6000

7000

0

5000

10000

15000

20000

25000

30000

35000

year

Yearly wwPDB registration numberTotal wwPDB registration number

Yearly PDBj registration number

Tot

al r

egis

trat

ion

num

ber

Yea

rly

regi

stra

tion

num

ber

1972 75 80 85 90 95 2000 2005

Page 23: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Maintain Format Standards• PDB (conventional and flat)• PDB Exchange (mmCIF)

– Mechanism for extension based on new demands• PDBML

– Derived from mmCIF– All entries converted to XML– Automatic translation from mmCIF data files an

d dictionaries– 3-styles of translation released

(Westbrook, Ito, Nakamura, Henrick, Berman (2005) Bioinformatics, 21, 988-992)

Page 24: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

PDBML: canonical XML description of PDB data, developed by the wwPDB.

(Westbrook et al. (2005) Bioinformatics, 21, 988-992)http://pdbml.pdb.org/schema/pdbx.xsd, http://pdbml.pdb.org/schema/pdbx-ext.xsd, ftp://ftp.pdbj.org/

→  No validation errors for more than 39,000 PDB file description.

Page 25: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

<PDBx:atom_siteCategory> <PDBx:atom_site id="1"> <PDBx:group_PDB>ATOM</PDBx:group_PDB> <PDBx:type_symbol>N</PDBx:type_symbol> <PDBx:label_atom_id>N</PDBx:label_atom_id> <PDBx:label_comp_id>THR</PDBx:label_comp_id> <PDBx:label_asym_id>A</PDBx:label_asym_id> <PDBx:label_entity_id>1</PDBx:label_entity_id> <PDBx:label_seq_id>1</PDBx:label_seq_id> <PDBx:Cartn_x>17.047</PDBx:Cartn_x> <PDBx:Cartn_y>14.099</PDBx:Cartn_y> <PDBx:Cartn_z>3.625</PDBx:Cartn_z> <PDBx:occupancy>1.00</PDBx:occupancy> <PDBx:B_iso_or_equiv>13.79</PDBx:B_iso_or_equiv> <PDBx:auth_seq_id>1</PDBx:auth_seq_id> <PDBx:auth_comp_id>THR</PDBx:auth_comp_id> <PDBx:auth_asym_id>A</PDBx:auth_asym_id> <PDBx:auth_atom_id>N</PDBx:auth_atom_id> <PDBx:pdbx_PDB_model_num>1</PDBx:pdbx_PDB_model_num> </PDBx:atom_site>

<atom_record id="1">ATOM 1 A A 1 1 ? . THR THR N N N 17.047 14.099 3.625 1.00 13.79</atom_record>

Full-tag description

Separated file for coordinates

Examples for the atom coordinates

Page 26: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Applications of PDBML

Browser at PDBj with the native XML DB

Database extension and annotation

SOAP (Simple Object Access Protocol) services at PDBj

Molecular graphics viewer (jV) , which directly parses PDBML, displaying the information written in XML

(Kinoshita & Nakamura (2004) Bioinformatics, 20, 1329-1330, Westbrook et al (2005) Bioinformatics, 21, 988-992)

Page 27: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 28: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 29: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Wild-card "*" canbe used: *2as,   12*,   *

Page 30: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

"and" and wild-card are available.

Page 31: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 32: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 33: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 34: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 35: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

/datablock[struct_confCategory/struct_conf/pdbx_PDB_helix_length>=“10"]/@datablockName

Search proteins, which have one or more helices,which are longer than 10.

Page 36: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 37: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

xPSSS new facility

Both XQuery and XPath are available.

Page 38: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Search for all entries and helix of length with helix of length greater than or equal to 10 residues:

Page 39: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 40: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 41: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Queries by XQuery and XPath at our new xPSSS

Page 42: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Offers interactive molecular visualization with RasMol type commands (source code free).

Can be used both as Stand-alone and as applet with Java.

Any polygons defined by XML can be displayed and manipulated.

Can parse PDBML files and

display the molecules.

http://www.pdbj.org/PDBjViewer/

PDBjViewer or jV

Page 43: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

xPSSS Database System  

PDBML

PDBMLplus

Web server

XSLT processor

downloader

Loader

Archive(RCSB-PDB/MSD-EBI

/PDBj)

Native XML-DB

PDBMLplus

PDBMLplusF

download(FTP)

FTP server

Internet

DDBJSwisProt/UniProt

PIR/GenBank/KEGG/GDB/ProTherm/EzCatDB

EBI/CSA

/CATRE

S

Function/Source

Information

Get/Input Tools

CATRESData

Annotation

Data

AddInformation

Filtering &Recostructing

PDBMLplus

PDBMLplusF

xPSSS

Manual inputfrom literatures

PDF filesfor the primary

citations

Page 44: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Protein Molecular Surface Database, eF-site (Kinoshita & Nakamura )

Protein Dynamics Database, ProMode (Wako & Endo)

Alignment of Structural Homologues, ASH (Toh)

Encyclopedia of Protein Structures, eProtS (Ito & Nakamura )

Sequence Navigator & Structure Navigator (Standley )

Development of Secondary Databases:data mining

Page 45: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Superfolds of Proteins

Page 46: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Identify Protein Function from

・ Sequence similarity ・ Fold similarity

Page 47: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Compare Protein folds/architectures

Page 48: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

DNA-binding domain of Bovine Papilloma Virus-1 E2 ( 2BOP )

RNA-binding domain of the U1A spliceosomal protein ( 1URN )

Distance Matrix

Page 49: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Map of the entire protein folds

Hou et al. (2003) Proc. Natl. Acad. Sci. USA, 100, 2386-2390.

498 SCOP domains

Dali (distance matrix alignment)http://www.ebi.ac.uk/dali/

Page 50: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

1. Rapidly generates structure neighbors for any PDB ID

2. Both an HTTP and SOAP interface are available

3. Alignments are displayed in a readable format

4. 3D coordinates can be viewed or downloaded

Structure Navigator

http://www.pdbj.org/strucnavi/

N

i

d

d

d

i

eNER

2

0

0

Structural similarity is defined by the Number of Equivalent Residues (NER1)

1Standley D, Toh H, Nakamura H. Proteins 2004;57:381-391Standley, D.M. et al. (2005)

Page 51: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Structure Navigator Real-Time Server

Blue: QueryRed: Temp1

Standley, D.M. et al. (2005)

Page 52: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Real-Time Query Strategy

Query

Clustered Templates(5,097 representatives)

Nearest Cluster Standley, D.M. (2006)

Page 53: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Structure Navigator-RT could reply the query quickly, but ...

When we open the Web service of Structure Navigator-RT and accept many requests, our computer resource may not be enough.

We want to use large computing resource somewhere else.

We need a simple and flexible tool for this purpose. Grid Web Service

Page 54: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Grid Web Service using Opal

SDSC Technical Report TR-2006-5, 2006.

Opal is a toolkit for wrapping application programs as Web services, providing scheduling, Grid security, and data management.

Page 55: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

CCGrid 2006

Grid Web Service using Opal

Page 56: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Operation Providersfor WSRF

Operation Providersfor Notification

GetRPProvider

SetRPProvider

QueryRPProvider

SubscribeProvider

NotificationConsumerProvider

Globus Toolkit 4 (GT4)

Application Service (WSRF)

Opal OP

Opal

Opal OPToolkit

Opal Operation Providerby W.W. Li & K. Ichikawa

(under the PRIUS project)

Opal is deployed as one of Operation Providers of Globus Toolkit 4 (GT4): Opal Operation Provider (Opal-OP)

Page 57: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Computing Server (PC-cluster 16 nodes)

Head Node (1 PC node):

Tomcat

Opal Operation Provider Service

Pre-WS GRAM

GT4

OpenPBS (launch pbs_server, pbs_sched, pbs_mon)

Run the Application program

Job scheduling

15 Worker nodes:

openPBS (launch pbs_mon)

Run the Application program

Portal Server

Apache

Tomcat (jsp -Input page/servlet -Opal-OP client program)

Library of GT4

Library of Operation Provider

Library of Operation Provider Service

End userSystem Architecture

@IPR, Osaka Univ.(Suita)

@CMC, Osaka Univ.(Toyonaka)

Submit Job

Request Query

LaunchLaunch

Page 58: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Query: 3D fold (1crn)

Structure Navigator-RT with Opal-OP on GRID

Page 59: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 60: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 61: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 62: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 63: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 64: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October
Page 65: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Blue: Query StructureRed: Second nearest cluster to the Query

Query structure

Page 66: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Conclusion

We implemented the Opal Operation Provider to run Structure Navigator-RT as a Grid Web Service, using the local scheduler, openPBS, to run our application program on the PC cluster at the Cyber-Media Center, Osaka University, through the network inside Osaka University.

http://www.pdbj.org/

Page 67: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

・ Surface similarity

Identify Protein Function from

・ Sequence similarity ・ Fold similarity

Page 68: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Kinoshita, Furui, Nakamura (2002) J. Struct. Funct. Genomics 2, 9-22.Kinoshita, Nakamura (2004) Bioinformatics 20, 1329-1330.

eF-site database for Ligand recognition sites

www.pdbj.org /eF-site/

Page 69: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Folylpolyglutamate synthetase

query: MJ0226 free formsearch result against eF-site/ActiveSite + eF-site/mono-nucleotide (1684 entries)

Pyruvate kinase

Similarity was found here.

Function identification of hypothetical proteins Kinoshita & Nakamura (2003) Protein Science 12, 1589-1595. Kinoshita & Nakamura (2005) Protein Science, 14. 711-718.

Page 70: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

eF-seek: Query service for search of similar molecular surfaces (-version)

ef-site.hgc.jp /eF-seek/

Kinoshita, K. & Nakamura, H. (2006)

Page 71: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Application of Grid to Web Service of PDBj Query for the analog structural data: Protein folds and molecular surfaces  

Blue: QueryRed: Temp1

New Structure

eF-site Database

Quick response by using Grid

computing

Looking for the similar folds( Structure Navigator Real-Time serve

r )

Looking for the similar surfaces(eF-seek)

Fold Database

Query for molecular surface

Query for fold

Result

Result

Page 72: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

Portal/Client Program

Service A

ServiceC

ServiceB

User

GT GT XMLDB

Communicate using SOAP

Distributed Computingand DB System

Personal user

site-A site-B site-C

GRID computing serviceswith Opal-OP

Database query services with XQuery/XPath

Integration of computing systems and Database query systems at PDBj

Analog Query Search

Text Query Search

Page 73: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October

・ Reiko Yamashita, Daron M. Standley (BIRD-JST / Inst. Protein Res., Osaka Univ.)

・ Kohei Ichikawa, Susumu Date, Shinji Shimojo (Cyber Media Center, Osaka Univ.)

・ Wilfred W. Li , Peter Arzberger (UCSD)

・ Hiroyuki Toh (Medical Inst. Bioregulation, Kyushu Univ)

・ Kengo Kinoshita (Inst. Med. Science, Univ. Tokyo)

・ Other PDBj members (BIRD-JST / Inst. Protein Res., Osaka Univ.)

・ PRIUS

Acknowledgements

Page 74: Http:// Haruki Nakamura Institute for Protein Research, Lab. Protein Informatics, Osaka University PRAGMA 11 Workshop, Osaka Univ., on October