

Human Genome Center

Laboratory of Genome Database
Laboratory of Sequence Analysis

Professor Minoru Kanehisa, Ph.D.
Assistant Professor Toshiaki Katayama, M.Sc.
Assistant Professor Shuichi Kawashima, M.Sc.
Lecturer Tetsuo Shibuya, Ph.D.
Assistant Professor Michihiro Araki, Ph.D.

Since the completion of the Human Genome Project, high-throughput experimental projects have been initiated for uncovering genomic information in an extended sense, including transcriptomics, proteomics, metabolomics, glycomics, and chemical genomics. We are developing a new generation of databases and computational technologies, beyond the traditional genome databases and sequence analysis tools, for making full use of these divergent and ever-increasing amounts of data, especially for medical and pharmaceutical applications.

1. KEGG DRUG and KEGG DISEASE

Minoru Kanehisa

KEGG is a database of biological systems that integrates genomic, chemical, and systemic functional information. It is widely used as a reference knowledge base for understanding higher-order functions and utilities of the cell or the organism from genomic information. Although the basic components of the KEGG resource are developed at Kyoto University, this Laboratory in the Human Genome Center is responsible for the applied areas of KEGG, especially in medical and pharmaceutical sciences. We develop KEGG DRUG (http://www.genome.jp/kegg/drug/), a chemical structure based information resource for all approved drugs in the world, integrating target information in the context of KEGG pathways, efficacy information in the context of hierarchical drug classifications, and natural product information in the context of plant and other genomes. We also develop KEGG DISEASE (http://www.genome.jp/kegg/disease/), a new resource for understanding molecular mechanisms of human diseases, where molecular networks involving disease genes are represented as KEGG pathway maps or, when such details are not known, simply by a list of disease genes and other lists of molecules, including environmental factors.

2. KEGG OC: Automatic assignments of orthologs and paralogs in complete genomes

Toshiaki Katayama, Shuichi Kawashima, Akihiro Nakaya and Minoru Kanehisa

The increase in the number of complete genomes has provided clues for understanding the evolution of the gene universe. Among the KEGG suite of databases, the GENES database contains more than 4.2 million genes from over 1,000 organisms as of February 2009. Sequence similarities among these genes are calculated by all-against-all SSEARCH comparison and stored in the SSDB database. Based on those databases, the ORTHOLOGY database has been manually constructed to store the relationships among genes sharing the same biological function. However, in this strategy only well-known functions can be used for annotating newly added genes, so the number of annotated genes is limited. To overcome this limitation, we developed a fully automated procedure to find candidate orthologous clusters, including those whose functional annotation is currently unknown. The method is based on a graph analysis of the SSDB database, treating genes as nodes and Smith-Waterman sequence similarity scores as edge weights. Clusters are found by our heuristic method for finding quasi-cliques, but the SSDB graph is too large for quasi-clique finding to be performed on it all at once. Therefore, we introduce a hierarchy (evolutionary relationship) of organisms and treat the SSDB graph as a nested graph. The automatic decomposition of the SSDB graph into a set of quasi-cliques results in the KEGG OC (Ortholog Cluster) database. We built a system that performs automatic updates of the ortholog clusters, which can be run on a weekly basis. As a result, we obtained 669,043 clusters, including 403,752 singleton clusters, from 3,721,464 protein-coding genes. Among them, only 2,309 clusters were shared across kingdoms, and the other clusters were kingdom specific. The automatic classification of our ortholog clusters is largely consistent with the manually curated ORTHOLOGY database. A web interface to search and browse genes in clusters is available at http://oc.kegg.jp/.
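The idea of decomposing a similarity graph into quasi-cliques can be illustrated with a minimal sketch. This is not the KEGG OC heuristic itself, which the report does not specify: the greedy growth rule, the density threshold `gamma`, and the function names are all assumptions made for illustration, and the organism hierarchy (nested graph) is left out.

```python
from itertools import combinations


def density(members, adj):
    """Fraction of all possible edges that are present among the members."""
    n = len(members)
    if n < 2:
        return 1.0
    present = sum(1 for u, v in combinations(members, 2) if v in adj[u])
    return present / (n * (n - 1) / 2)


def greedy_quasi_cliques(edges, gamma=0.8):
    """Partition genes into clusters whose edge density stays >= gamma.

    edges: iterable of (gene_a, gene_b, score); score stands in for a
    Smith-Waterman similarity score. gamma is a hypothetical threshold.
    """
    adj = {}
    for a, b, score in edges:
        adj.setdefault(a, {})[b] = score
        adj.setdefault(b, {})[a] = score
    unassigned = set(adj)
    clusters = []
    while unassigned:
        # Seed with the unassigned gene that has the largest total similarity.
        seed = max(sorted(unassigned), key=lambda g: sum(
            s for n, s in adj[g].items() if n in unassigned))
        cluster = [seed]
        while True:
            # Candidates are unassigned neighbours of the current members.
            candidates = {n for m in cluster for n in adj[m]
                          if n in unassigned and n not in cluster}
            best = None
            for c in sorted(candidates):
                if density(cluster + [c], adj) >= gamma:
                    gain = sum(adj[c].get(m, 0) for m in cluster)
                    if best is None or gain > best[0]:
                        best = (gain, c)
            if best is None:
                break
            cluster.append(best[1])
        unassigned.difference_update(cluster)
        clusters.append(sorted(cluster))
    return clusters


toy_edges = [("a", "b", 100), ("a", "c", 90), ("b", "c", 95),
             ("c", "d", 20), ("d", "e", 80)]
toy_clusters = greedy_quasi_cliques(toy_edges, gamma=0.8)
```

On the toy graph, genes a, b and c form one dense cluster, while d is kept out of it because adding it would drop the edge density below the threshold; d and e form a second cluster.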

3. EGENES: A database for expressed sequence tag indices of plant species

Shuichi Kawashima, Yuki Moriya, Toshiaki Tokimatsu, Susumu Goto and Minoru Kanehisa

EGENES is a knowledge-based database for efficient analysis of plant expressed sequence tags (ESTs), which was recently added to KEGG PLANT. It links plant genomic information with higher-order functional information in KEGG. The genomic information in EGENES is a collection of EST contigs constructed from plant ESTs assembled by EGassembler. The EST indices are automatically annotated with KEGG Orthology identifiers (K numbers) by the KEGG Automatic Annotation Server (KAAS). Currently, it contains 2,452,094 sequence catalogues (779,490 contigs and 1,672,604 singletons) in 62 plants; 25% of the sequences are assigned K numbers. EGENES is available at http://www.genome.jp/kegg/plant/pln_list.html.

4. KEGG API: SOAP/WSDL interface for the KEGG system

Shuichi Kawashima, Toshiaki Katayama and Minoru Kanehisa

KEGG is a suite of databases and associated software integrating our current knowledge of molecular interaction/reaction pathways and other systemic functions (PATHWAY and BRITE databases), information about the genomic space (GENES database), and information about the chemical space (LIGAND and DRUG databases). To facilitate large-scale programmatic applications of the KEGG system, we have been developing and maintaining the KEGG API as a stable SOAP/WSDL-based web service. The KEGG API is available at http://www.genome.jp/kegg/soap/.

5. KEGG DAS: Comprehensive repository for community genome annotation

Toshiaki Katayama, Mari Watanabe and Minoru Kanehisa

KEGG DAS is an advanced genome database system providing DAS (Distributed Annotation System) service for all bacterial organisms in the GENOME database in KEGG. Currently, KEGG DAS contains over 8 million annotations assigned to the genome sequences of 817 organisms (up from 615 organisms last year). The KEGG DAS server provides gene annotations linked to the KEGG PATHWAY and LIGAND databases. In addition to the coding genes, information on non-coding RNAs predicted using the Rfam database is also provided to fill in the annotation of the intergenic regions of the genomes. The KEGG DAS service is available at http://das.hgc.jp/.

6. Full-Arthropods: Constructing full-length cDNA of pathogenetic arthropods

Toshiaki Katayama, Shuichi Kawashima, Miho Usui, Hiroyuki Wakaguri, Eri Kibukawa, Masahide Sasaki, Kazuhisa Hiranuka, Ryuichiro Maeda, Yutaka Suzuki, Sumio Sugano and Junichi Watanabe

The Anopheles mosquito, tsetse fly, tsutsugamushi mite and dust mite are arthropods known to be medically important because they either transmit various infectious diseases, including malaria and Japanese river fever, or cause allergies such as asthma and dermatitis. Because of the serious medical problems they cause, their genomes have recently been analyzed extensively. We have produced libraries of the four organisms and are constructing their databases for functional genome analysis. Full-Arthropods is available at http://fullarth.hgc.jp/.

7. Full-Entamoeba: a database for the full-length cDNA library of Entamoeba

Toshiaki Katayama, Kazushi Hiranuka, Masahiro Kumagai, Yutaka Suzuki, Sumio Sugano, Atsushi Toyoda, Asao Makioka, Junichi Watanabe

Entamoeba histolytica is a protozoan parasite that predominantly infects humans and other primates and causes amebiasis. E. histolytica is estimated to infect about 50 million people worldwide, and amebiasis is estimated to cause 70,000 deaths per year. Full-Entamoeba, a database of full-length cDNAs from the human parasite E. histolytica and the reptilian parasite E. invadens, has been produced. The full-length cDNA libraries were produced using the oligo-capping method from trophozoites of each species cultivated axenically. A total of 5,000 5′-end one-pass sequences of cDNAs from the two species were compared with the non-redundant DDBJ/GenBank/EMBL database using the BLAST and TBLASTX programs. These clones are available for further analysis and experiments. The Full-Entamoeba database is available at http://fullent.hgc.jp/.

8. Analysis of sequence catalogs of the house dust mite, Dermatophagoides farinae

Shuichi Kawashima, Atsushi Toyoda, Junichi Watanabe, Sadao Nogami and Minoru Kanehisa

The house dust mite is a cosmopolitan guest in human habitation and one of the multicellular organisms most closely associated with our lives. It is now well established that dust mites are major allergens causing bronchial asthma, allergic rhinitis and atopic dermatitis. Dermatophagoides farinae (American house dust mite) and D. pteronyssinus (European house dust mite) are the two most common species in the temperate zone. We produced cDNA libraries from our D. farinae sample, containing young nymphs and adults, using the vector trapper method and sequenced both ends of 11,520 cDNA clones. Cleaning, clustering and assembling of the raw sequences produced 3,031 contigs and 4,281 singletons. Of the 7,312 unique sequences in total, 1,797 were assigned KEGG Orthology identifiers by the KAAS system. More than 30% of the sequences showed significant matches to the KEGG GENES database, which includes the well-characterized Der f group 1 allergens. We predicted 1,109 peptides longer than 20 amino acids from the 3,031 contigs. Some of the peptides are predicted by NetMHC 3.0 to contain 9-mer peptides with strong affinities to MHC class II. We expect that these in silico analyses will pave the way toward prediction of allergens from D. farinae.

9. HiGet and SSS: Search engines for the large-scale biological databases

Toshiaki Katayama, Shuichi Kawashima, Kazuhiro Ohi, Kenta Nakai and Minoru Kanehisa

In recent years, the number of entries in biological databases has been increasing exponentially. For example, there were 10,106,023 entries in the GenBank database in the year 2000, which has now grown to 98,868,465 (Release 169 + daily updates). To allow such a vast amount of data to be searched at high speed, we have developed a high-performance database entry retrieval system named HiGet. The system is built on HiRDB, a commercial ORDBMS (Object-oriented Relational Database Management System) developed by Hitachi, Ltd. HiGet can perform full-text search on various biological databases, including GenBank, RefSeq, UniProt, Prosite, OMIM and PDB. An additional advantage of the HiGet system is its field-specific search capability, which enables users to narrow down the number of results and is especially useful for collecting sequences that meet specific needs. We have also developed a sequence similarity search (SSS) service to find homologous sequences with various algorithms, including BLAST, FASTA, SSEARCH, TRANS and EXONERATE. This variety of options is unique among public services, and users can select an appropriate method for searching similar sequences according to their query. Because algorithms such as TRANS and EXONERATE are highly time consuming, the SSS service is backed by a distributed computing environment using the Sun Grid Engine in our supercomputer system. The HiGet and SSS services are available at http://higet.hgc.jp/ and http://sss.hgc.jp/, respectively.


10. Linear-time protein 3-D structure searching algorithm

Tetsuo Shibuya

Finding similar structures in 3-D structure databases of proteins (or other molecules) is becoming one of the most important issues in post-genomic molecular biology. To compare the 3-D structures of two molecules, biologists mostly use the RMSD (root mean square deviation) as the similarity measure. The RMSD is one of the most fundamental similarity measures for comparing two sets of coordinates and is used in various fields, such as computer vision and robotics. In this research, we propose new, theoretically and practically fast algorithms for the basic problem of finding all the substructures of structures in a structure database of chain molecules (such as proteins) whose RMSDs to the query are within a given constant threshold. The best-known worst-case time complexity for the problem is $O(N \log m)$, where $N$ is the database size and $m$ is the query size. The previous best-known expected time complexity for the problem is also $O(N \log m)$. In this research, we propose a new linear-expected-time algorithm. It is not only a theoretically significant improvement over previous algorithms but also a practically faster algorithm according to computational experiments. We also propose a series of preprocessing algorithms that enable faster queries, though no indexing algorithm has been known whose query time complexity is better than the above $O(N \log m)$ bound. One is an $O(N \log^2 N)$-time and $O(N \log N)$-space preprocessing algorithm with expected query time complexity of $O(m + N/m^{0.5})$. Another is an $O(N \log N)$-time and $O(N)$-space preprocessing algorithm with expected query time complexity of $O(N/m^{0.5} + m \log(N/m))$.

We also extend the above linear-time algorithm into an algorithm with expected query time complexity of $O(m + N/m^{1-\varepsilon})$, where $\varepsilon$ is an arbitrarily small constant such that $0 < \varepsilon < 1$. We furthermore extend the above linear-time algorithm so that it can deal with insertions and deletions.

We checked the performance of our linear-expected-time algorithm through computational experiments over the whole PDB database. The experiments show that our algorithm is much faster than the previous algorithms. For example, our algorithm is 3.6 to 28 times faster than previously known algorithms when searching for similar substructures whose RMSDs are within 1 Å of queries of ordinary lengths. The experiments also show that the theoretical results above are consistent with the experimental results. In other words, the actual computation time of our linear-expected-time algorithm is not influenced by differences in query length, in contrast to previous algorithms.
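For reference, the search problem can be written out explicitly. The notation below ($p_i$, $t_i$, $R$, $v$, $c$) is introduced here for illustration and is not taken from the report; it assumes the usual definition of the RMSD as a minimum over rigid motions (a rotation $R$ and a translation $v$).

$$
\mathrm{RMSD}(P, Q) = \min_{R,\, v} \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left\lVert p_i - (R\, q_i + v) \right\rVert^{2}}
$$

With a query $P = (p_1, \dots, p_m)$ and a database chain $T = (t_1, \dots, t_N)$, the task is to report every position $j$ such that

$$
\mathrm{RMSD}\bigl(P,\; (t_j, t_{j+1}, \dots, t_{j+m-1})\bigr) \le c
$$

for a given constant threshold $c$. Evaluating each of the roughly $N$ windows independently costs $O(m)$ per window, or $O(Nm)$ in total, which is the naive baseline that the $O(N \log m)$ and linear-expected-time bounds improve on.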

11. Fast hinge detection algorithm in protein structures

Tetsuo Shibuya

Analysis of conformational changes is one of the keys to understanding protein functions and interactions. For this analysis, we often compare two protein structures, taking flexible regions such as hinge domains into consideration. The RMSD (root mean square deviation) is the most popular measure for comparing two protein structures, but it applies only to rigid structures without hinge domains. In this research, we propose a new measure called RMSDh (root mean square deviation considering hinges) and its variant RMSDh(k) for comparing two flexible proteins with hinge domains. We also propose novel, efficient algorithms for computing them, which can detect the hinge positions at the same time. The RMSDh is suitable for cases where there is one small hinge domain in each of the two target structures. The new algorithm for computing the RMSDh runs in linear time, which is the same as the time complexity of computing the RMSD and is faster than any previous algorithm for hinge detection. The RMSDh(k) is designed for comparing structures with more than one hinge domain. The RMSDh(k) measure considers at most k small hinge domains, i.e., the RMSDh(k) value should be small if the two structures are similar except for at most k hinge domains. To compute the value, we propose an $O(kn^2)$-time and $O(n)$-space algorithm based on a new dynamic programming technique. We also test our measures against both flexible and non-flexible protein structures and show that the hinge positions can be correctly detected by our algorithms.

12. Fast flexible protein structure alignment

Kohichi Suematsu and Tetsuo Shibuya

The hinge detection algorithm described in section 11 considers only rigid hinge points, but hinges sometimes bend a little themselves, which can lead to inaccurate prediction of hinge positions. Thus we incorporated the notion of a 'bending hinge' to detect such hinge positions. We developed a very efficient heuristic algorithm for finding such bending hinges, as the exact algorithm for this problem requires exponential time. For the algorithm, we developed a detailed score matrix for comparing local structures based on naïve Bayes learning.

13. Protein function prediction based on 3-D structure motifs

Chia-Han Chu, Hiroki Sakai and Tetsuo Shibuya

Protein functions are said to be determined by their 3-D structures, but not all functions are known to be related to particular 3-D structure motifs. The geometric suffix tree, a data structure for indexing 3-D protein structures that we also developed, enables comprehensive enumeration of all possible structural motifs among a given set of proteins. We are developing a new algorithm based on the support vector machine that predicts a protein's function from its 3-D structure. This algorithm utilizes all the possible 3-D motifs found by using the geometric suffix tree.

14. Suffix array construction with a lazy scheme

Ben Hachimori and Tetsuo Shibuya

The suffix array is one of the most important indexing data structures for strings over an alphabet, including DNA sequences, RNA sequences, protein sequences, web pages, the Medline database and so on. But even the most sophisticated algorithms for constructing the suffix array require a lot of time. We developed a new, efficient lazy algorithm that computes the suffix array only after a query is received, so that only the necessary part of the suffix array has to be computed. Our lazy algorithm is based on the Schürmann-Stoye algorithm and is more efficient than both the Boyer-Moore algorithm and other suffix tree-based algorithms when the number of queries is limited.
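The lazy principle, sorting only the part of the suffix array that a query actually touches, can be sketched as follows. This is a deliberately simplified illustration and not the Schürmann-Stoye-based algorithm: the class name, the choice to bucket suffixes by their first character, and the plain comparison sort are all assumptions made for this example.

```python
class LazySuffixIndex:
    """Suffix index that sorts a bucket of suffixes only when a query needs it."""

    def __init__(self, text):
        self.text = text
        # Cheap linear pass: group suffix start positions by first character.
        self.unsorted = {}
        for pos, ch in enumerate(text):
            self.unsorted.setdefault(ch, []).append(pos)
        # Buckets that have actually been sorted so far.
        self.sorted_buckets = {}

    def _bucket(self, ch):
        # Sort the suffixes starting with ch on first use, then cache them.
        if ch not in self.sorted_buckets:
            positions = self.unsorted.get(ch, [])
            self.sorted_buckets[ch] = sorted(
                positions, key=lambda p: self.text[p:])
        return self.sorted_buckets[ch]

    def find(self, pattern):
        """Return all start positions of pattern in the text, in ascending order."""
        if not pattern:
            return list(range(len(self.text)))
        bucket = self._bucket(pattern[0])
        m = len(pattern)

        def first_index(strict):
            # Binary search over the sorted bucket on length-m prefixes.
            lo, hi = 0, len(bucket)
            while lo < hi:
                mid = (lo + hi) // 2
                prefix = self.text[bucket[mid]:bucket[mid] + m]
                if prefix < pattern or (strict and prefix == pattern):
                    lo = mid + 1
                else:
                    hi = mid
            return lo

        return sorted(bucket[first_index(False):first_index(True)])


index = LazySuffixIndex("GATTACAGATTACA")
hits_tta = index.find("TTA")
sorted_after_first_query = set(index.sorted_buckets)
hits_aca = index.find("ACA")
```

After the first query only the suffixes beginning with T have been sorted; the remaining buckets stay untouched until a query needs them, which is where the saving comes from when the number of queries is small.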

15. Color space-DNA sequence mapping/alignment algorithm

Ben Hachimori and Tetsuo Shibuya

Applied Biosystems's SOLiD system encodes the DNA sequence into a sequence of a data type called color space, in which one of 4 fluorescent colors is assigned to each of the 16 ordered patterns of two adjacent bases. However, no known algorithm aligns/maps a color-space sequence to a DNA sequence while taking into account the difference between experimental errors and actual mutations. We developed an alignment algorithm that distinguishes experimental errors from actual DNA mutations when aligning color-space data against ordinary DNA sequences. Moreover, we computed the optimal score table for the alignment based on actual E. coli data.
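Why errors and mutations look different in color space can be seen from a small sketch of the encoding. It assumes the standard two-base encoding, in which each base gets a 2-bit code and the color of a dinucleotide is the XOR of the two codes; the function names are ours, and the alignment algorithm and score table described above are not reproduced here.

```python
# 2-bit base codes; the color of two adjacent bases is the XOR of their codes.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = {v: k for k, v in CODE.items()}


def to_color_space(dna):
    """Encode a DNA string as a list of colors (0-3), one per adjacent base pair."""
    return [CODE[a] ^ CODE[b] for a, b in zip(dna, dna[1:])]


def decode(first_base, colors):
    """Decode colors back to DNA, given the known first base."""
    bases = [first_base]
    for color in colors:
        bases.append(BASE[CODE[bases[-1]] ^ color])
    return "".join(bases)


def differing_positions(xs, ys):
    """Indices at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(xs, ys)) if x != y]


reference = "ACGTAC"
ref_colors = to_color_space(reference)

# A true single-base mutation (G -> T at position 2) changes two adjacent colors.
snp_colors = to_color_space("ACTTAC")
snp_diff = differing_positions(ref_colors, snp_colors)

# A single miscalled color changes one color but corrupts every decoded base after it.
error_colors = [1, 3, 2, 3, 1]
error_decoded = decode("A", error_colors)
```

A substitution therefore shows up as two adjacent color changes, while an isolated color change is more likely a measurement error; decoding first and then aligning would instead propagate the error to the rest of the read, which is one reason to align in color space directly.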

16. Genotype clustering based on hidden Markov models

Ritsuko Onuki, Tetsuo Shibuya and Minoru Kanehisa

Haplotype clustering is important for gene mapping of human diseases. Despite its importance for the analysis, it is difficult to obtain haplotype data from present experiments because of their cost and error rate. Genotypes are much easier to obtain than haplotypes. In this work, we propose a new method for clustering genotypes. In the algorithm, we first infer multiple haplotype candidates from each genotype, and next calculate the distances between genotypes based on the results of the haplotype inference. Then we perform genotype clustering based on the distances. We evaluated our algorithm by applying it to several actual genotype data sets.
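The first step, inferring haplotype candidates from a genotype, rests on the fact that an unphased genotype with k heterozygous sites is compatible with 2^(k-1) haplotype pairs. The sketch below only enumerates those candidates; it does not implement the hidden Markov model or the distance used in this work. The genotype coding (0 and 2 for the two homozygous states, 1 for heterozygous) and the function name are assumptions for illustration.

```python
from itertools import product


def haplotype_pairs(genotype):
    """Enumerate the unordered haplotype pairs compatible with a genotype.

    genotype: list with 0 (homozygous 0/0), 1 (heterozygous), 2 (homozygous 1/1).
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    # Fix the phase of the first heterozygous site to avoid mirrored duplicates.
    free_sites = het_sites[1:]
    pairs = []
    for phases in product((0, 1), repeat=len(free_sites)):
        h1 = [g // 2 for g in genotype]  # 0 -> 0, 2 -> 1, het sites set below
        h2 = list(h1)
        if het_sites:
            h1[het_sites[0]], h2[het_sites[0]] = 0, 1
        for site, phase in zip(free_sites, phases):
            h1[site], h2[site] = phase, 1 - phase
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


no_het = haplotype_pairs([0, 2, 0])
two_het = haplotype_pairs([1, 1])
three_het = haplotype_pairs([1, 0, 1, 2, 1])
```

The number of candidates doubles with each additional heterozygous site, which is why the candidates need to be weighted by a probabilistic model rather than treated as equally likely.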

Publications

Kanehisa, M., Araki, M., Goto, S., Hattori, M., Hirakawa, M., Itoh, M., Katayama, T., Kawashima, S., Okuda, S., Tokimatsu, T., and Yamanishi, Y. KEGG for linking genomes to life and the environment. Nucleic Acids Research 36, D480-D484, 2008.

Kawashima, S., Pokarowski, P., Pokarowska, M., Kolinski, A., Katayama, T., Kanehisa, M. AAindex: amino acid index database, progress report 2008. Nucleic Acids Research 36, D202-D205, 2008.

Okuda, S., Yamada, T., Hamajima, M., Itoh, M., Katayama, T., Bork, P., Goto, S., and Kanehisa, M. KEGG Atlas mapping for global analysis of metabolic pathways. Nucleic Acids Research 36, W423-426, 2008.

Wakaguri, H., Suzuki, Y., Katayama, T., Kawashima, S., Kibukawa, E., Hiranuka, K., Sasaki, M., Sugano, S., and Watanabe, J. Full-Malaria/Parasites and Full-Arthropods: database of full-length cDNAs of parasites and arthropods, update 2009. Nucleic Acids Research 37, D520-D525, 2008.

Yamanishi, Y., Araki, M., Gutteridge, A., Honda, W., and Kanehisa, M. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24, i232-i240, 2008.

Takarabe, M., Okuda, S., Itoh, M., Tokimatsu, T., Goto, S., and Kanehisa, M. Network analysis of adverse drug interactions. Genome Informatics 20, 252-259, 2008.

Hashimoto, K., Yoshizawa, A.C., Okuda, S., Kuma, K., Goto, S., and Kanehisa, M. The repertoire of desaturases and elongases reveals fatty acid variations in 56 eukaryotic genomes. J. Lipid Res. 49, 183-191, 2008.

Shibuya, T. Fast Hinge Detection Algorithms for Flexible Protein Structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics, to appear.

Shibuya, T. Searching Protein 3-D Structures in Linear Time. Proc. 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2009), 2009, to appear.

Shibuya, T. Linear-Time Algorithm for Searching Protein 3-D Structures. IPSJ SIG Notes SIGAL 123-4, 2009, to appear.

Suematsu, K., Shibuya, T. Flexible Protein Alignment of 3D-Structures Allowing Dynamic Transformation. IPSJ SIG Notes SIGBIO 12-12, pp. 87-94, 2008.

Honda, W., Tanabe, M., Yano, A., Kanehisa, M. Bioinformatics: systems biology and KEGG (in Japanese). Seikagaku 80, 1094-1111, 2008.


Human Genome Center

Laboratory of DNA Information Analysis

Professor Satoru Miyano, Ph.D.
Associate Professor Seiya Imoto, Ph.D.
Assistant Professor Masao Nagasaki, Ph.D.
Project Lecturer Rui Yamaguchi, Ph.D.
Project Assistant Professor Yoshinori Tamada, Ph.D.

Recent advances in biomedical research have been producing large-scale, ultra-high-dimensional, ultra-heterogeneous data. In light of this progress in post-genomic research, our current mission is to create computational strategies for systems biology and medicine, toward translational bioinformatics. With this mission, we have been developing computational methods for understanding life as a system and applying them to practical issues in medicine and biology.

1. Computational Systems Biology

a. Systematic reconstruction of TRANSPATH data into Cell System Markup Language

Masao Nagasaki, Ayumu Saito, Chen Li, Euna Jeong, Satoru Miyano

Many biological repositories store information based on experimental study of the biological processes within a cell, such as protein-protein interactions, metabolic pathways, signal transduction pathways, or regulation by transcription factors and miRNA. Unfortunately, it is difficult to use such information directly when generating simulation-based models. Thus, modeling rules for encoding biological knowledge into system-dynamics-oriented standardized formats would be very useful for a full understanding of cellular dynamics at the system level. We selected the TRANSPATH database, a manually curated high-quality pathway database, which provides a rich source of cellular events in humans, mice and rats curated from over 31,500 papers. In this work, we defined 16 modeling rules based on the hybrid functional Petri net with extension (HFPNe), which is suitable for graphical representation and simulation of biological processes. In these modeling rules, each Petri net element is annotated with Cell System Ontology (CSO) terms to enable semantic interoperability of models. As a formal ontology for biological pathway modeling with dynamics, CSO also defines biological terminology and corresponding icons. By combining HFPNe with the CSO features, we devised a method for transforming TRANSPATH data into simulation-based, semantically valid models. The results are encoded in a biological pathway format, Cell System Markup Language (CSML), which eases the exchange and integration of biological data and models. By using the 16 modeling rules, 97% of the reactions in TRANSPATH are converted into simulation-based models represented in CSML. This reconstruction demonstrated that it is possible to use our rules to generate quantitative models from static pathway descriptions.

b. Finding optimal Bayesian network given a super-structure

Eric Perrier, Seiya Imoto, Satoru Miyano

Conventional approaches for learning Bayesian network structure from data have disadvantages in terms of complexity and lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm markedly improves accuracy and speed: it learns a skeleton with an independency test (IT) approach and constrains the directed acyclic graphs considered during the search-and-score phase. Subsequently, we defined the structural constraint by introducing the concept of a super-structure S, an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We developed a super-structure constrained optimal search (COS); its time complexity is upper bounded by $O(\gamma_m^n)$, where $\gamma_m < 2$ depends on the maximal degree $m$ of S. Empirically, complexity depends on the average degree $m'$, and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. In practice, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, making it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches.
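How a super-structure shrinks the search space can be illustrated with a minimal sketch: if the skeleton of the network must be a subgraph of S, then the parents of a node can only be chosen among its neighbours in S. The code below only enumerates these candidate parent sets and does not implement COS or any scoring; the function name and the dictionary representation of S are assumptions for illustration.

```python
from itertools import combinations


def candidate_parent_sets(super_structure):
    """List the parent sets allowed for each node by a super-structure S.

    super_structure: dict mapping each node to the set of its neighbours in
    the undirected graph S. A node's parents must be a subset of its neighbours.
    """
    allowed = {}
    for node, neighbours in super_structure.items():
        ordered = sorted(neighbours)
        allowed[node] = [frozenset(combo)
                         for size in range(len(ordered) + 1)
                         for combo in combinations(ordered, size)]
    return allowed


# Toy super-structure: a path A - B - C plus an isolated node D.
S = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
allowed = candidate_parent_sets(S)
counts = {node: len(sets) for node, sets in allowed.items()}
unconstrained_per_node = 2 ** (len(S) - 1)
```

In the toy example each node has at most 4 candidate parent sets, against 8 per node without the constraint; the gap widens exponentially with the number of nodes when S is sparse, in line with the dependence on the degree of S noted above.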

c. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models

Osamu Hirose, Ryo Yoshida¹, Seiya Imoto, Rui Yamaguchi, Tomoyuki Higuchi¹, D. Stephen Charnock-Jones², Cristin Print³, Satoru Miyano: ¹Institute of Statistical Mathematics, ²Cambridge University, ³University of Auckland

We developed a novel method based on the state space model to identify transcriptional modules and module-based gene networks simultaneously. The state space model has the potential to infer large-scale gene networks, e.g., of order $10^3$, from time-course gene expression profiles. In particular, we succeeded in identifying a cell cycle system by using gene expression profiles of Saccharomyces cerevisiae in which the length of the time course and the number of genes were 24 and 4,382, respectively. However, when analyzing shorter time-course data, e.g., of length 10 or less, parameter estimation for the state space model often fails due to overfitting. To extend the applicability of the state space model, we provided an approach that uses the technical replicates of gene expression profiles, which are often measured in duplicate or triplicate. The use of technical replicates is important for achieving highly efficient inference of gene networks from short time-course data. The potential of the proposed method was demonstrated through time-course analysis of the gene expression profiles of human umbilical vein endothelial cells undergoing growth factor deprivation-induced apoptosis.
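For orientation, a linear Gaussian state space model is commonly written as below. The report does not spell out the model, so the symbols and the exact form are assumptions: $y_t$ denotes the observed expression vector of all genes at time $t$, and $x_t$ a much lower-dimensional hidden state, which in a module-based reading plays the role of module activities.

$$
x_t = F\, x_{t-1} + v_t, \qquad v_t \sim N(0, Q)
$$

$$
y_t = H\, x_t + w_t, \qquad w_t \sim N(0, R)
$$

Under this reading, $H$ ties genes to modules and $F$ describes how modules influence one another over time. With a short time course there are few time points from which to estimate $F$, $H$, $Q$ and $R$, which is consistent with the overfitting noted above and with the use of technical replicates as additional observations.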

d. Predicting differences in gene regulatory systems by state space models

Rui Yamaguchi, Seiya Imoto, Mai Yamauchi, Masao Nagasaki, Ryo Yoshida¹, Teppei Shimamura, Yosuke Hatanaka, Kazuko Ueno, Tomoyuki Higuchi¹, Noriko Gotoh, Satoru Miyano

We developed a statistical method to predict genes that are differentially regulated between case and control samples from time-course gene expression data, by exploiting the unpredictability of expression patterns under the regulatory system inferred by a state space model. The proposed method can screen out genes that show different patterns but are generated by the same regulation in both samples, since these patterns can be predicted by the same model. Our strategy consists of three steps. First, a gene regulatory system is inferred from the control data by a state space model. Then, the obtained model of the underlying regulatory system of the control sample is used to predict the case data. Finally, by assessing the significance of the difference between the observed and predicted case time-course data for each gene, we are able to detect the unpredictable genes, which are candidates for the key differences between the regulatory systems of case and control cells. We illustrate the whole strategy with an actual example, in which gene regulatory systems of human small airway epithelial cells were inferred from novel time courses of gene expression following treatment with (case) or without (control) the drug gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase. In the gefitinib response data, we succeeded in finding unpredictable genes that are candidates for the specific targets of gefitinib. We also discussed differences in the regulatory systems for the unpredictable genes. The proposed method would be a promising tool for identifying biomarkers and drug target genes.

e. Bayesian learning of biological pathways on genomic data assimilation

Ryo Yoshida¹, Masao Nagasaki, Rui Yamaguchi, Seiya Imoto, Satoru Miyano, Tomoyuki Higuchi¹

Mathematical modeling and simulation based on biochemical rate equations provide us with a rigorous tool for unraveling the complex mechanisms of biological pathways. Before proceeding to simulation experiments, an essential first step is to find effective values of the model parameters, which are difficult to measure in in vivo and in vitro experiments. Furthermore, once a set of hypothetical models has been created, some statistical criterion is needed to test the ability of the constructed models and to proceed to model revision. We developed a new statistical technology for data-driven construction of in silico biological pathways. The method starts with knowledge-based modeling with the hybrid functional Petri net. It then proceeds to Bayesian learning of the model parameters for which experimental data are available. This process exploits quantitative measurements of evolving biochemical reactions, e.g., gene expression data. Another important issue that we consider is statistical evaluation and comparison of the constructed hypothetical pathways. For this purpose, we have developed a new Bayesian information-theoretic measure that assesses the predictability and the biological robustness of in silico pathways.

f. Modeling nonlinear gene regulatory networks from time series gene expression data

André Fujita, João Ricardo Sato5, Humberto Miguel Garay-Malpartida5, Mari Cleide Sogayar5, Carlos Eduardo Ferreira5, Satoru Miyano: 5University of São Paulo

In cells, molecular networks such as gene regulatory networks are the basis of biological complexity. Therefore, gene regulatory networks have become the core of research in systems biology. Understanding the processes underlying the several extracellular regulators, signal transduction, protein-protein interactions, and differential gene expression processes requires a detailed molecular description of the protein and gene networks involved. To better understand these complex molecular networks and to infer new regulatory associations, we developed a statistical method based on vector autoregressive models and Granger causality to estimate nonlinear gene regulatory networks from time series microarray data. Most of the models available in the literature assume linearity in the inference of gene connections; moreover, these models do not infer directionality in these connections. Thus, a priori biological knowledge is required. However, in pathological cases, no a priori biological information is available. To overcome these problems, we present the nonlinear vector autoregressive (NVAR) model. We have applied the NVAR model to estimate nonlinear gene regulatory networks based entirely on gene expression profiles obtained from DNA microarray experiments. We showed the results obtained by NVAR through several simulations and by the construction of three actual gene regulatory networks (p53, NF-κB, and c-Myc) for HeLa cells.
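The Granger causality idea underlying the method can be illustrated with a minimal sketch. It is deliberately simpler than NVAR: it is linear, uses a single lag, has no intercept, and performs no significance test. It only compares how well the past of y alone predicts y against how well the past of y and x together predict y. Function names and data are hypothetical.

```python
from typing import List


def rss_restricted(y: List[float]) -> float:
    """Residual sum of squares of y[t] = a * y[t-1]."""
    y0, y1 = y[:-1], y[1:]
    den = sum(u * u for u in y0)
    a = sum(u * v for u, v in zip(y0, y1)) / den if den else 0.0
    return sum((v - a * u) ** 2 for u, v in zip(y0, y1))


def rss_full(y: List[float], x: List[float]) -> float:
    """Residual sum of squares of y[t] = a * y[t-1] + b * x[t-1]."""
    y0, x0, y1 = y[:-1], x[:-1], y[1:]
    syy = sum(u * u for u in y0)
    sxx = sum(u * u for u in x0)
    sxy = sum(u * v for u, v in zip(y0, x0))
    sy1 = sum(u * v for u, v in zip(y0, y1))
    sx1 = sum(u * v for u, v in zip(x0, y1))
    det = syy * sxx - sxy * sxy
    if abs(det) < 1e-12:           # collinear regressors: x adds nothing
        return rss_restricted(y)
    a = (sy1 * sxx - sx1 * sxy) / det   # solve the 2x2 normal equations
    b = (sx1 * syy - sy1 * sxy) / det
    return sum((t - a * u - b * v) ** 2 for u, v, t in zip(y0, x0, y1))


def granger_gain(y: List[float], x: List[float]) -> float:
    """Reduction in squared error when the past of x is added to predict y."""
    return rss_restricted(y) - rss_full(y, x)


# Hypothetical series in which y simply copies x with a delay of one step.
x = [1.0, 0.0, 2.0, 0.0, 3.0, 0.0, 1.0, 2.0]
y = [0.0, 1.0, 0.0, 2.0, 0.0, 3.0, 0.0, 1.0]
gain_x_to_y = granger_gain(y, x)
```

A large gain suggests that x helps to predict y, which is read as a directed edge from x to y. A nonlinear variant would replace the lagged values with basis-expanded versions of them.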

g. Fast grid layout algorithm for biological networks with sweep calculation

Kaname Kojima, Masao Nagasaki, Satoru Miyano

Properly drawn biological networks are of great help in the comprehension of their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties, such as aligning vertices having the same attributes and complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for applications to pathway databases, due to their high computational cost. In addition, they do not consider edge directions, and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features in biological networks. We devised a new calculation method termed sweep calculation and reduced the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conducted practical experiments by using 95 pathway models of various sizes from TRANSPATH and showed that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduced a new component that penalizes undesirable edge directions to avoid the lack of traceability in pathways due to the differences in direction between in-edges and out-edges of each vertex.


h. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data

Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

Recently, a nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters due to the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation, and we need to strongly limit the size of the gene set. To address this limitation, we employed the L1 regularization technique to estimate NVAR. Under L1 regularization, direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than those from existing methods. Through a simulation study, we verified the effectiveness of the proposed method by comparing its limitation in the number of genes to that of the existing NVAR. The proposed method was also applied to time series microarray data of the human HeLa cell cycle.
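How L1 regularization selects a small set of parents can be seen in a generic lasso regression solved by coordinate descent. The sketch below is not the NVAR estimator itself. It assumes a plain design matrix X (in NVAR the columns would be basis-expanded lagged expression values of candidate parent genes), the objective (1/2n)||y - Xb||^2 + lam * ||b||_1, and hypothetical toy data.

```python
from typing import List


def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(X: List[List[float]], y: List[float], lam: float,
          n_iter: int = 200) -> List[float]:
    """Coordinate descent for (1/2n) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm else 0.0
    return beta


# Hypothetical data: the response depends strongly on feature 0, weakly on feature 1.
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
y = [2.0, 0.1, -2.0, -0.1]
beta = lasso(X, y, lam=0.2)
```

The weak coefficient is set exactly to zero while the strong one is kept (shrunk from 2.0 to 1.6), which is the behavior that allows direct parents to be selected even when the parameters outnumber the samples.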

i. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates

André Fujita, Luciana Rodrigues Gomes5, João Ricardo Sato6, Rui Yamaguchi, Carlos Eduardo Thomaz7, Mari Cleide Sogayar5, Satoru Miyano: 6Universidade Federal do ABC, 7Centro Universitário da FEI

Principal Component Analysis (PCA) combined with the Maximum-entropy Linear Discriminant Analysis (MLDA) was applied in order to identify genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely: (i) differences in gene expression levels between normal and tumoral conditions from a univariate point of view; (ii) in a multivariate fashion using MLDA; and (iii) with a dependence network approach. Our results show that malignant transformation in the prostatic tissue is more related to functional connectivity changes in their dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1, and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions and were also classified as the top seven most informative genes for the prostate cancer genesis process by our discriminant analysis. Moreover, among the identified genes we found classically known biomarkers and genes which are closely related to tumoral prostate, such as KLK3 and KLK2, and several other potential ones. We have demonstrated that changes in functional connectivity may be implicit in the biological process which renders some genes more informative to discriminate between normal and tumoral conditions. Using the proposed method, namely MLDA, to analyze the multivariate characteristic of genes, it was possible to capture the changes in dependence networks which are related to cell transformation.

j. Rule-based reasoning for system dynamics in cell systems

Euna Jeong, Masao Nagasaki, Satoru Miyano

A system-dynamics-centered ontology called the Cell System Ontology (CSO) has been developed for representation of diverse biological pathways. Many of the pathway data based on the ontology have been created from databases via data conversion or curated by expert biologists. It is essential to validate the pathway data, which may cause unexpected issues such as semantic inconsistency and incompleteness. This paper discusses three criteria for validating the pathway data based on CSO, as follows: (1) structurally correct models in terms of Petri nets, (2) biologically correct models to capture biological meaning, and (3) systematically correct models to reflect biological behaviors. Simultaneously, we have investigated how logic-based rules can be used for the ontology to extend its expressiveness and to complement the ontology by reasoning, which aims at qualifying pathway knowledge. Finally, we show how the proposed approach helps in exploring dynamic modeling and simulation tasks without prior knowledge.
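As an illustration of the first criterion, a rule for structural correctness in terms of Petri nets can be written as a small check. The sketch below is hypothetical and does not reproduce the CSO rule set: it assumes a pathway given as sets of place and transition names plus a list of directed arcs, and it checks only that arcs are bipartite and that no transition is isolated.

```python
from typing import List, Set, Tuple


def structural_issues(places: Set[str], transitions: Set[str],
                      arcs: List[Tuple[str, str]]) -> List[str]:
    """Return human-readable violations of two basic Petri net rules."""
    issues = []
    for src, dst in arcs:
        place_to_transition = src in places and dst in transitions
        transition_to_place = src in transitions and dst in places
        if not (place_to_transition or transition_to_place):
            issues.append(f"arc {src}->{dst} does not connect a place and a transition")
    connected = {node for arc in arcs for node in arc}
    for t in sorted(transitions):
        if t not in connected:
            issues.append(f"transition {t} has no arcs")
    return issues


# Hypothetical pathway fragment containing two deliberate mistakes.
places = {"p53_mRNA", "p53"}
transitions = {"translation", "degradation"}
arcs = [("p53_mRNA", "translation"), ("translation", "p53"), ("p53", "p53_mRNA")]
issues = structural_issues(places, transitions, arcs)
```

Checks for the biological and systematic criteria would need domain knowledge that a structural rule of this kind cannot express.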

k. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human

Yosuke Hatanaka, Masao Nagasaki, Rui Yamaguchi, Takeshi Obayashi, Kazuyuki Numata, André Fujita, Teppei Shimamura, Yoshinori Tamada, Seiya Imoto, Kengo Kinoshita, Kenta Nakai, Satoru Miyano

We reported various transcription factor binding sites (TFBSs) conserved among co-expressed genes in human promoter regions using expression and genomic data. Assuming that similar promoter structure induces similar transcriptional regulation, and hence a similar expression profile, we compared the promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start site (TSS) given from DBTSS, and promoter similarity searches were conducted among coexpressing gene data provided from the newly developed COXPRESdb. By combining Position Weight Matrix (PWM) motif prediction with the bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket method analysis to seek combinatorial activities of those conserved TFBSs.
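PWM motif prediction amounts to sliding a scoring matrix along a promoter sequence. The following minimal sketch assumes a count matrix with a pseudocount, a uniform background of 0.25 per base, and a made-up TATA-like motif. It does not include the bootstrap step, and the names and values are hypothetical rather than taken from the matrices used in the study.

```python
import math
from typing import Dict, List, Tuple


def log_odds_pwm(counts: List[Dict[str, int]], pseudo: float = 1.0,
                 background: float = 0.25) -> List[Dict[str, float]]:
    """Convert per-position base counts into log2 odds against the background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudo
        pwm.append({base: math.log2(((column.get(base, 0) + pseudo) / total) / background)
                    for base in "ACGT"})
    return pwm


def scan(sequence: str, pwm: List[Dict[str, float]]) -> List[Tuple[int, float]]:
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pwm)
    return [(i, sum(pwm[k][sequence[i + k]] for k in range(width)))
            for i in range(len(sequence) - width + 1)]


# Hypothetical motif built from eight aligned sites that all read TATA.
counts = [{"T": 8}, {"A": 8}, {"T": 8}, {"A": 8}]
pwm = log_odds_pwm(counts)
hits = scan("GGTATAGG", pwm)
best_start, best_score = max(hits, key=lambda hit: hit[1])
```

A score threshold, or a resampling procedure such as the bootstrap, is then needed to decide which hits count as significant.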

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou8, Yuto Ikegami8, Hiroshi Matsuno8, Satoru Miyano, Shin-ichi T. Inouye8: 8Yamaguchi University

Circadian rhythms of living organisms are 24 hr oscillations found in behavior, biochemistry, and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of a computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observation that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time-zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando9, Sadanori Konishi10, Seiya Imoto: 9Keio University, 10Kyushu University

The problem of constructing nonlinear regression models is investigated to analyze data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlapping basis functions and adopts the information of the input and response variables. By using the radial basis functions, we constructed nonlinear regression models with the help of the technique of regularization. Crucial issues in the model building process are the choices of a hyperparameter, the number of basis functions, and a smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification both for distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained by the use of the hyperparameter in the basis functions.
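The role of an overlap hyperparameter can be illustrated with a Gaussian basis function whose width is rescaled. The form used below, exp(-(x - c)^2 / (2 * nu * s^2)), is an assumption made for illustration, since the text above does not give the exact definition. The centers, widths, and function names are likewise hypothetical.

```python
import math
from typing import List


def gaussian_basis(x: float, center: float, width: float, nu: float) -> float:
    """Gaussian basis value; a larger nu widens the function and increases overlap."""
    return math.exp(-((x - center) ** 2) / (2.0 * nu * width ** 2))


def design_row(x: float, centers: List[float], width: float, nu: float) -> List[float]:
    """Basis expansion of one input value, i.e. one row of the design matrix."""
    return [gaussian_basis(x, c, width, nu) for c in centers]


centers = [0.0, 1.0, 2.0]
narrow = design_row(0.5, centers, width=0.5, nu=1.0)
wide = design_row(0.5, centers, width=0.5, nu=4.0)
```

With the larger value of nu, every basis function responds more strongly to the same input, so neighboring functions overlap more. The regression weights would then be estimated from such rows with a regularized fit.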

b. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome

Jin Hwan Do, Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for the characterization of fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most of the prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, the homology-based approach has an intrinsic limitation for unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profile of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile; that is, the DNA regions including secondary metabolic signature genes such as polyketide synthase, nonribosomal peptide synthase, and/or dimethylallyl tryptophan synthase consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by a transcription factor, LaeA. Our result could be used for identification of other fungal secondary metabolite gene clusters, especially for secondary metabolite gene clusters that are severely regulated by LaeA or other proteins with similar function to LaeA.
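The conserved pattern described above (window-averaged curvature below 0.18 over at least 20 kb) suggests a simple scan. The sketch below is a hypothetical illustration: it assumes one profile value per kb, uses made-up profile values, and does not reproduce the curvature calculation itself.

```python
from typing import List, Tuple


def window_average(values: List[float], window: int) -> List[float]:
    """Moving average over consecutive windows of the given size."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


def low_regions(profile: List[float], threshold: float,
                min_len: int) -> List[Tuple[int, int]]:
    """Half-open (start, end) runs where the profile stays below the threshold."""
    regions = []
    start = None
    for i, value in enumerate(profile):
        if value < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                regions.append((start, i))
            start = None
    if start is not None and len(profile) - start >= min_len:
        regions.append((start, len(profile)))
    return regions


# Hypothetical profile with one value per kb: a 25 kb stretch of low curvature.
profile = [0.3] * 5 + [0.1] * 25 + [0.3] * 5
candidates = low_regions(profile, threshold=0.18, min_len=20)
```

Candidate regions found this way would still need to be checked for signature genes such as polyketide synthase before being called secondary metabolite gene clusters.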

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida1, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step for finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than we can handle and analyze on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data to detect transcripts that have significantly different splicing patterns in two cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis for more than 300,000 transcripts in exon array data, which has the potential to reveal the aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require any installation of software except for internet browsers. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full dataset of exon array data within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.

Publications

1. Ando, T., Konishi, S., Imoto, S. Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference. 138 (11): 3616-3633, 2008.

2. Brazma, A., Miyano, S., Akutsu, T. Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008). Imperial College Press, 2008.

3. Do, J.H., Miyano, S. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome. Applied Microbiology and Biotechnology. 80 (5): 841-847, 2008.

4. Fujita, A., Gomes, L.R., Sato, J.R., Yamaguchi, R., Thomaz, C.E., Sogayar, M.C., Miyano, S. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Systems Biology. 2: 106, 2008.

5. Fujita, A., Sato, J.R., Garay-Malpartida, H.M., Sogayar, M.C., Ferreira, C.E., Miyano, S. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinformatics and Computational Biology. 6 (5): 961-979, 2008.

6. Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Fujita, A., Shimamura, T., Tamada, Y., Imoto, S., Kinoshita, K., Nakai, K., Miyano, S. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human. Genome Informatics. 20: 212-221, 2008.

7. Hirose, O., Yoshida, R., Imoto, S., Yamaguchi, R., Higuchi, T., Charnock-Jones, D.S., Print, C., Miyano, S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics. 24 (7): 932-942, 2008.

8. Hirose, O., Yoshida, R., Yamaguchi, R., Imoto, S., Higuchi, T., Miyano, S. Analyzing time course gene expression data with biological and technical replicates to estimate gene networks by state space models. Proc. 2nd Asia International Conference on Modelling & Simulation. 940-946, 2008. (AMS2008, Refereed conference)

9. Jeong, E., Nagasaki, M., Miyano, S. Rule-based reasoning for system dynamics in cell systems. Genome Informatics. 20: 25-36, 2008.

10. Kitakaze, H., Kanda, M., Nakatsuka, H., Ikeda, N., Matsuno, H., Miyano, S. Prediction of fragile points for robustness checking of cell systems. IEICE TRANSACTIONS on Information and Systems D. J91-D (9): 2404-2417, 2008.

11. Knapp, E.-W., Benson, G., Holzhutter, H.-G., Kanehisa, M., Miyano, S. (Eds.) Genome Informatics. 20, 2008.

12. Kojima, K., Fujita, A., Shimamura, T., Imoto, S., Miyano, S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics. 20: 37-51, 2008.

13. Kojima, K., Nagasaki, M., Miyano, S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics. 24 (12): 1426-1432, 2008.

14. Mito, N., Ikegami, Y., Matsuno, H., Miyano, S., Inouye, S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics. 21: 212-223, 2008.

15. Nagasaki, M., Saito, A., Chen, L., Jeong, E., Miyano, S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology. 2: 53, 2008.

16. Niida, A., Smith, A.D., Imoto, S., Tsutsumi, S., Aburatani, H., Zhang, M.Q., Akiyama, T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics. 9: 404, 2008.

17. Numata, K., Yoshida, R., Nagasaki, M., Saito, S., Imoto, S., Miyano, S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics. 9: 494, 2008.

18. Numata, K., Imoto, S., Miyano, S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering. IEEE Computer Society, 357-360, 2008. (BIBM 2008, Refereed conference)

19. Perrier, E., Imoto, S., Miyano, S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research. 9: 2251-2286, 2008.

20. Yamaguchi, R., Imoto, S., Yamauchi, M., Nagasaki, M., Yoshida, R., Shimamura, T., Hatanaka, Y., Ueno, K., Higuchi, T., Gotoh, N., Miyano, S. Predicting differences in gene regulatory systems by state space models. Genome Informatics. 21: 101-113, 2008.

21. Yoshida, R., Nagasaki, M., Yamaguchi, R., Imoto, S., Miyano, S., Higuchi, T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics. 24 (22): 2592-2601, 2008.


Human Genome Center

Laboratory of Molecular Medicine
Laboratory of Genome Technology
ゲノムシークエンス解析分野
シークエンス技術開発分野

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

教授 医学博士 中村祐輔
准教授 医学博士 片桐豊雅
准教授 医学博士 醍醐弥太郎
助教 理学博士 浜本隆二
助教 医学博士 松田浩一
助教 医学博士 前佛均

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P = 0.0053) and poorer prognosis of non-small cell lung cancer patients (P = 0.0045). It was also shown to be an independent prognostic factor (P = 0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data implied that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) by a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with NSCLC (P = 0.0004) as well as ESCC (P = 0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P = 0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The ECT2 cancer-testis antigen is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells; nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring in the cytokinesis process. The expression level of DTL/RAMP protein became highest at G(1)/S phases, whereas its phosphorylation level was enhanced during mitotic phase. Treatment of breast cancer cells, T47D and HBC4, with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrate the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in growth of RCC and that inhibition of TMEM22/RAB37 expression or of their interaction could be a novel therapeutic approach for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)


We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of the FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex as well as activation of the Rac1-JNK cascade in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of the RhoA activity. Our results have strongly implied that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can intracellularly function as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANTBP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and that it could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been indicated to play some roles in apoptosis. These findings implied that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage some prostate cancers eventually acquire a castration-resistant phenotype, where they progress aggressively and show very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it could be a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 were reported as up-regulated genes in previous microarray studies in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression levels in thyroid cancer cells significantly attenuated their viability, indicating the significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes, in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as up-regulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac (α2,6) Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated a higher frequency of serum apoC-III with loss of α2,6-linkage Neu5Ac residues in lung cancer patients than in controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylations, and imply that it may be applicable to studies on other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week, and of those who required 49 mg or more per week, to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
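The evaluation criterion described above (the share of patients whose predicted dose falls within 20% of the actual stable dose) can be sketched in a few lines of Python. This is only an illustration of the metric: the function name and the weekly doses below are hypothetical and are not taken from the Consortium's data or software.

```python
def fraction_within(predicted, actual, tolerance=0.20):
    """Fraction of patients whose predicted dose lies within
    `tolerance` (as a proportion) of the actual stable dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(abs(p - a) <= tolerance * a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Hypothetical weekly doses in mg, for illustration only.
actual_dose = [21.0, 35.0, 49.0, 28.0]
clinical_prediction = [30.0, 33.0, 38.0, 29.0]
pharmacogenetic_prediction = [24.0, 36.0, 45.0, 27.0]

clinical_share = fraction_within(clinical_prediction, actual_dose)
pharmacogenetic_share = fraction_within(pharmacogenetic_prediction, actual_dose)
print(f"clinical: {clinical_share:.0%}, pharmacogenetic: {pharmacogenetic_share:.0%}")
```

Applying the same function to subsets of patients (for example, those requiring 21 mg or less per week) gives the kind of stratified comparison reported in the text.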

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5 and Hitoshi Zembutsu5. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Department of Surgery, Tokushima Breast Care Clinic; 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgery for breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to depend on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and Docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4 and Yusuke Nakamura1,3. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important factor limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients: 11 with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia than those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
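The reported odds ratio and classification rates can be checked with a short worked calculation. The sketch below rests on assumptions: the 2x2 counts are not given in the text and are back-calculated here from the combined sample sizes (28 + 11 = 39 cases, 56 + 18 = 74 controls) and the reported 69.2% and 75.7%, and the Woolf (log-scale) interval is an assumed method that happens to reproduce the published interval.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    a: cases with score 1 or 2      b: cases with score 0
    c: controls with score 1 or 2   d: controls with score 0
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Back-calculated counts (an assumption, not stated in the report).
cases_high, cases_low = 27, 12        # 39 patients with grade 3/4 toxicity
controls_high, controls_low = 18, 56  # 74 patients with no toxicity

odds, lower, upper = odds_ratio_ci(cases_high, cases_low, controls_high, controls_low)
sensitivity = cases_high / (cases_high + cases_low)
specificity = controls_low / (controls_high + controls_low)

print(f"OR = {odds:.2f} (95% CI {lower:.2f}-{upper:.2f})")
print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
```

Under these assumed counts the odds ratio, interval, and both classification rates agree with the figures quoted in the abstract, which suggests the reported statistics are internally consistent.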

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2 and Yusuke Nakamura1. 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of 2Pathology and 3Medicine, Faculty of Medicine, and 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; 5Center for International Cooperation, Department of Medical Sciences, and 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. An HLA-B*3505 allele revealed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10^-6] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10^-5). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
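The quoted specificity is consistent with the carrier frequency among NVP-tolerant patients given above, assuming (this is not stated explicitly in the text) that specificity is taken as the fraction of NVP-tolerant patients in the combined data set who do not carry the allele:

```latex
\text{specificity}
  = 1 - \Pr(\text{HLA-B*3505 carrier} \mid \text{NVP-tolerant})
  = 1 - 0.011
  = 0.989
```

By the same reading, the 17.5% carrier frequency among patients with rash corresponds to the sensitivity of the marker, which is why the text describes it as useful for avoiding only a subset of NVP-induced rash.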

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4 and Koichi Matsuda1. 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; 4Center for Genomic Medicine, RIKEN; 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; 6Department of Hepatology, Toranomon Hospital; 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9 and the Pirfenidone Clinical Study Group4. 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; 5Fourth Department of Internal Medicine, Nippon Medical School; 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

To identify gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a p value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9 and Draga I. Toncheva5. 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry over the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many of these studies have produced controversial and inconclusive results. However, it has been determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia or schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that were previously implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010, odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu E T-K, Wang LD, J-LS Law, Tsao W and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility of type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53 Zembutsu H Suzuki Y Sasaki ATsunoda T Okazaki M Yoshimoto MHasegawa T Hirata K and Nakamura YPredicting response to Docetaxel neoadju-vant chemotherapy for advanced breast can-cers through genome-wide gene expressionprofiling Int J Oncol 34 361-370 2009

54 Nakamura Y DNA variations in humanand medical genetics 25 years of my experi-ence (review) J Hum Genet 54 1-8 2009

55 Ozaki K Sato H Inoue K Tsunoda TSakata Y Mizuno H Lin T-H Mi-yamoto Y Aoki A Onouchi Y Sheu S-H Ikegawa S Odashiro K NobuyoshiM Juo S-H H Hori M Nakamura Yand Tanaka TA functional variation inBRAP confers risk of myocardial infarctionin Asian populations Nat Genet in press2009

56 Kashiwaya K Hosokawa M Eguchi HOhigashi H Ishikawa O Shinomura YNakamura Y and Nakagawa H Identifica-tion of C2orf18 Termed ANT2BP (ANT2-binding protein) as one of key molecules in-volved in pancreatic carcinogenesis CancerSci 100 457-464 2009

57 Nagayama S Yamada E Kohno YAoyama T Fukukawa C Kubo HWatanabe G Katagiri T Nakamura YSakai Y and Toguchida J Inverse correla-tion of the upregulation of FZD10 expres-sion and the activation of β-catenin in syn-chronous colorectal tumors Cancer Sci inpress 2009

58 Ueda K Fukase Y Katagiri T IshikawaN Irie S Sato T Ito H Nakayama HMiyagi Y Tsuchiya E Kohno N ShiwaM Nakamura Y and Daigo Y Targeted

glycoproteomics for the discovery of lungcancer-associated glycosylation disorders us-ing lectin-coupled ProteinChip arrays Pro-teomocs in press 2009

59 The International Warfarin Pharmacogenet-ics Consortium Improved warfarin dosingwith a global pharmacogenetic algorithm NEngl J Med 360 753-764 2009

60 Betcheva ET Mushiroda T Takahashi AKubo M Karachanak SK Zaharieva ITVazharova RV Dimova II Milanova VK Tolev T Kirov G Owenm MJOrsquoDonovanm MC Kamatanim N Naka-mura Y and Toncheva DI Case-control as-sociation study of 59 candidate genes re-veals the DRD2 SNP rs6277 (C957T) as theonly susceptibility factor for schizophreniain Bulgarian population J Hum Genet 5498-107 2009

61 Fukukawa C Nagayama S Tsunoda TToguchida J Nakamura Y and Katagiri TActivation of non-canonical Dvl-Rac1-JNKpathway by Frizzled-homologue 10 (FZD10)in human synovial sarcoma Oncogene inpress 2009

62 Yosifova A Mushiroda T Stoianov DVazharova R Dimova I Karachanak SZaharieva I Milanova V Madjirova NGerdjikov I Tolev T Velkova S KirovG Owen MJ OrsquoDonovan MC TonchevaD and Nakamura Y Case-control associa-tion study of 65 candidate genes revealed apossible association of a SNP of HTR5A tobe a factor susceptible to bipolar disease inBulgarian population J Affective Disordersin press 2009

63 Kamatani Y Wattanapokayakit S OchiH Kawaguchi T Takahashi A HosonoN Kubo M Tsunoda T Kamatani NKumada H Puseenam A Sura T DaigoY Chayama K Chantratita W Naka-mura Y and Matsuda K Identification ofassociation of genetic variations in HLA-DPlocus with chronic hepatitis B in Asianpopulation through genome-wide associa-tion study Nat Genet in press 2009

64 Tamura K Furihata M Chung S Ue-mura M Yoshioka H Iiyama T AshidaS Nasu Y Fujioka T Shuin T Naka-mura Y and Nakagawa H Stanniocalcin 2( STC 2 ) over-expression in castration-resistant prostate cancer and aggressiveprostate cancer Cancer Sci in press 2009

65 Tsukada H Ochi H Maekawa T AbeH Fujimoto Y Tsuge M Takahashi HKumada H Kamatani N Nakamura Yand Chayama K Hiroshima Liver StudyGroup Toranomon Hospital A Polymor-phism in MAPKAPK3 affects response to in-

140

terferon therapy for chronic hepatitis C Gas-troenterology in press 2009

66 Dunleavy EM Roche D Tagami H La-coste N Ray-Gallet D Nakamura YDaigo Y Nakatani Y and Almouzni-

Pettinotti G HJURP a key CENP-A-partnerfor maintenance and deposition of CENP-Aat centromeres at late telophaseG1 Cell inpress 2009

141

Human Genome Center

Laboratory of Functional Genomics

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret the heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and we also attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, so their sample sizes must be in the thousands or more, which makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false-positive results. One method to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests that fit biological models by interpreting tables geometrically.
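The genomic control method mentioned in 1a can be illustrated with a minimal sketch. It shows only the standard correction for 1-degree-of-freedom chi-square statistics, in which the statistics are divided by an inflation factor estimated from their median. It is not the modified method for Fisher's exact test described above, and the input values are hypothetical.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (inflation factor, corrected statistics) for 1-df chi-square tests."""
    inflation = median(chi2_stats) / CHI2_1DF_MEDIAN
    # Factors below 1 are set to 1 so that statistics are never increased.
    inflation = max(inflation, 1.0)
    return inflation, [stat / inflation for stat in chi2_stats]


# Hypothetical statistics from five markers.
inflation, corrected = genomic_control([0.2, 0.5, 0.91, 1.5, 12.0])
```

With these hypothetical inputs the median is about twice its expected value, so every statistic is roughly halved before significance is assessed.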

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which uses the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is used for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, to support the various lines of research that build on genetic heterogeneity.
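For reference, the conventional pairwise measures that the paragraph above contrasts with multi-marker quantitation can be sketched as follows. This is a minimal illustration of the standard D, D' and r² statistics for two diallelic markers. It is not the new multi-marker method, and the function name and frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Pairwise LD between two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    Returns (D, D', r^2).
    """
    d = p_ab - p_a * p_b
    # D' scales D by the largest magnitude it can reach for these allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies: complete LD, independence, and an intermediate case.
complete = pairwise_ld(0.5, 0.5, 0.5)
independent = pairwise_ld(0.25, 0.5, 0.5)
partial = pairwise_ld(0.3, 0.4, 0.5)
```

Each call summarizes only one pair of markers, which is the limitation the paragraph points out.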

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We have opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp


Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation (DBTGR) was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi, obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles contain shared structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and their positioning relative to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it can distinguish muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are developing the model further, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2 and Kenta Nakai

While CpG islands are often linked to promoters in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
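The two quantities usually used to describe CpG islands, G+C content and the observed/expected CpG ratio, can be sketched as follows. This is a generic illustration under the common definition of the ratio (number of CpG × length / (number of C × number of G)). The window and step sizes are hypothetical parameters, not the criteria used in the study above.

```python
def cpg_stats(seq):
    """Return (G+C content, observed/expected CpG ratio) for one sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C * G) / length.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def sliding_cpg(seq, window, step):
    """Compute cpg_stats in windows, e.g. along a region around a TSS."""
    return [
        (start,) + cpg_stats(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence: a CpG-rich half followed by an AT-rich half.
profile = sliding_cpg("CGCGATAT", window=4, step=4)
```

Scanning such windows outward from a TSS is one simple way to see how narrow the G+C- and CpG-rich range is.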

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3 and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from telling us whether a given regulation is direct or indirect. In this study, we are computationally detecting the direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make up the building blocks of the body.
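The scanning step described above can be sketched in a few lines. The example assumes a matrix given as one dictionary of log-odds scores per position and takes the threshold as a plain parameter. How the threshold is chosen (here, by maximizing the ORI) is outside the sketch, and the matrix and sequence are hypothetical.

```python
def scan_promoter(seq, pwm, threshold):
    """Return (position, score) for every window scoring at or above threshold.

    pwm: list of dicts, one per motif position, mapping base -> log-odds score.
    """
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Bases missing from a column (e.g. N) make the window unscorable.
        score = sum(col.get(base, float("-inf")) for col, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical 3-bp matrix that favours the motif ACG.
toy_pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]
strict_hits = scan_promoter("TTACGTTACC", toy_pwm, threshold=2.0)
loose_hits = scan_promoter("TTACGTTACC", toy_pwm, threshold=1.0)
```

Lowering the threshold admits the weaker ACC site as well, which is why the choice of threshold determines how many putative direct targets are reported.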

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith4 and Kenta Nakai: 4CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it, according to the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
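The way a pseudocount enters a PWM, as described above, can be written out as a short sketch. Each count receives a share of the pseudocount proportional to the background frequency of the base, and the element is the log ratio of the resulting frequency to the background. The binding sites and the uniform background are hypothetical.

```python
import math


def pwm_from_sites(sites, background, pseudocount=0.8):
    """Build a log-likelihood-ratio PWM from aligned binding sites.

    sites: equal-length strings over the bases in `background`
    background: dict mapping base -> background frequency (summing to 1)
    """
    n_sites = len(sites)
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        column = {}
        for base, bg in background.items():
            count = sum(1 for site in sites if site[pos] == base)
            # The pseudocount is split among bases according to the background.
            freq = (count + pseudocount * bg) / (n_sites + pseudocount)
            column[base] = math.log2(freq / bg)
        pwm.append(column)
    return pwm


# Hypothetical example: four known sites and a uniform background.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
toy_pwm = pwm_from_sites(["ACG", "ACG", "ACG", "ATG"], uniform)
```

Without the pseudocount, bases never observed at a position would have a frequency of zero and an undefined log ratio. With it, every element stays finite.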

7. Definition and analysis of alternative promoters using a large amount of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1 and Kenta Nakai

In order to support transcriptional studies, we have constructed the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a large number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed a difference of 96 for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
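The definition of 1st and 2nd promoters by tag count can be sketched as below. The ranking follows the definition in the text. The discrepancy measure (the percentage of shared genes whose 1st/2nd assignment differs between two samples) is only an assumed reading of "discrepancy", and all gene and promoter names are hypothetical.

```python
from collections import defaultdict


def rank_promoters(tag_counts):
    """Map each gene to (1st promoter, 2nd promoter) by number of tags.

    tag_counts: iterable of (gene, promoter_id, n_tags) tuples.
    """
    by_gene = defaultdict(list)
    for gene, promoter, n_tags in tag_counts:
        by_gene[gene].append((n_tags, promoter))
    ranked = {}
    for gene, promoters in by_gene.items():
        promoters.sort(key=lambda item: item[0], reverse=True)
        first = promoters[0][1]
        second = promoters[1][1] if len(promoters) > 1 else None
        ranked[gene] = (first, second)
    return ranked


def discrepancy(ranked_a, ranked_b):
    """Percentage of shared genes whose 1st/2nd assignment differs (assumed measure)."""
    shared = ranked_a.keys() & ranked_b.keys()
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if ranked_a[gene] != ranked_b[gene])
    return 100.0 * differing / len(shared)


# Hypothetical tag counts from two cell types.
cell_a = rank_promoters([("g1", "p1", 50), ("g1", "p2", 10), ("g2", "p3", 5)])
cell_b = rank_promoters([("g1", "p1", 5), ("g1", "p2", 40), ("g2", "p3", 7)])
percent_different = discrepancy(cell_a, cell_b)
```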

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has previously been reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in their vicinity. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element, and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
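A Fourier analysis of dinucleotide periodicity can be illustrated with a small sketch. It converts a sequence into a 0/1 signal marking the positions of chosen dinucleotides and evaluates the spectral power at a given period. The choice of AA/TT/TA as the marked dinucleotides and the toy sequence are assumptions for illustration only.

```python
import cmath
import math


def dinucleotide_signal(seq, dinucs=("AA", "TT", "TA")):
    """Return 1.0 where one of the chosen dinucleotides starts, else 0.0."""
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] in dinucs else 0.0 for i in range(len(seq) - 1)]


def power_at_period(signal, period):
    """Spectral power of the mean-centred signal at the given period (in bp)."""
    n = len(signal)
    mean = sum(signal) / n
    total = sum(
        (value - mean) * cmath.exp(-2j * math.pi * i / period)
        for i, value in enumerate(signal)
    )
    return abs(total) ** 2 / n


# Hypothetical sequence with an AA dinucleotide every 10 bp.
toy_signal = dinucleotide_signal(("AA" + "C" * 8) * 20)
power_10 = power_at_period(toy_signal, 10)
power_7 = power_at_period(toy_signal, 7)
```

A peak at 10 bp relative to neighbouring periods is what such an analysis looks for. The same function can be evaluated at 84 bp or 167 bp on longer fragments.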

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has mostly been estimated by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate owing to the effects of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of conserved pathway maps and organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from external nutrients alone, as living cells do. The organism-specific pathway maps are then detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda5 and K. Kinoshita: 5National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us clues for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with existing descriptors, our indices elucidated different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi6, M. Saeki6, H. Ohta6 and K. Kinoshita: 6Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers clarify the functional and regulatory networks of genes in Arabidopsis.
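As a point of reference for what a coexpression measure does, the sketch below ranks genes by the Pearson correlation of their expression profiles with a query gene. Pearson correlation is used here only as a common baseline. It is not the new measure introduced in ATTED-II, which the text does not specify, and the gene names and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def coexpressed_genes(query, profiles, top=3):
    """Return the `top` genes most correlated with `query` as (gene, r) pairs."""
    scores = [
        (pearson(profiles[query], profile), gene)
        for gene, profile in profiles.items()
        if gene != query
    ]
    scores.sort(reverse=True)
    return [(gene, r) for r, gene in scores[:top]]


# Hypothetical expression profiles across four samples.
toy_profiles = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [4, 3, 2, 1],
    "geneD": [1, 3, 2, 4],
}
neighbours = coexpressed_genes("geneA", toy_profiles, top=2)
```

A coexpression network is then obtained by linking each gene to its top-ranked neighbours.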

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data now makes it possible to observe many different complexes of the same protein in the PDB. A single protein complex is therefore not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura7 and K. Kinoshita: 7Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting crystal structures can contain non-biological interactions owing to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarity of hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate, compared with the decrease in occurrence, as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structure without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines applied to the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configuration of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we showed experimentally that these proteins actually have PPIase activity, which had not previously been considered. In addition, we created a similar hole in barnase, a ribonuclease that has no PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.
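As an illustration of network type (i), the following minimal sketch ranks the genes most strongly coexpressed with a query gene. It assumes Pearson's correlation coefficient as the coexpression measure (the text above does not name the measure), and the gene names and expression values are made up for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return sxy / math.sqrt(sxx * syy)


def top_coexpressed(expression, query, k=2):
    """Return the k genes most positively correlated with the query gene."""
    scored = [(pearson(expression[query], profile), gene)
              for gene, profile in expression.items() if gene != query]
    scored.sort(reverse=True)
    return [(gene, r) for r, gene in scored[:k]]


# Hypothetical expression profiles over four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.1],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
print(top_coexpressed(expression, "geneA"))
```

Connecting each gene to its top-ranked partners in this way yields a coexpression network that can then be drawn or compared across species.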

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity such as lipid rafts and transportsome regions in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin’s hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba H Yamashita R Kinoshita K andNakai K Weak correlation between sequenceconservation in promoter regions and inprotein-coding regions of human-mouseorthologous gene pairs BMC Genomics 9 1522008

Genome Information Integration Project and H-invitational 2 Consortium The H-InvitationalDatabase (H-InvDB) a comprehensive annota-tion resource for human genes and tran-scripts Nucl Acids Res 36 D793-D799 2008

Hatada I Morita S Kimura M Horii TYamashita R and Nakai K Genome-widedemethylation during neural differentiation ofP19 embryonal carcinoma cells J HumanGenet 53 (2) 185-191 2008

Hatanaka Y Nagasaki M Yamaguchi RObayashi T Numata K Imoto S Shima-mura T Kinoshita K Nakai K and Miy-ano S A novel strategy to search concertedtranscription factor activities using gene ex-pression profile and genomic data Genome In-formatics 20 212-221 2008

Higurashi M Ishida T and Kinoshita KPiSite a database of protein interaction sitesusing multiple binding states in the PDB Nu-cleic Acids Res 37 D360-364 2009

Ikura T Kinoshita K and Ito N A cavity withan appropriate size is the basis of the PPIaseactivity Protein Eng Des Sel 21 83-89 2008

Ishida T and Kinoshita K Prediction of disor-dered protein regions based on meta-approach Bioinformatics 24 1344-1348 2008

Maeda M and Kinoshita K Development ofnew indices to evaluate protein-protein inter-faces Assembling space volume assembling

space distance and global shape descriptor JMol Graph Mod 27 706-711 2009

Miura K Toh H Hirakawa H Sugii M Mu-rata M Nakai K Tashiro K Kuhara SAzuma Y and Shirai M Genome-wideanalysis of Chlamydophila pneumoniae gene ex-pression at the late stage of infection DNARes 15 (2) 83-91 2008

Murakami K Imanishi T Gojobori T andNakai K Two different classes of co-occurring motif pairs found by a novel visu-alization method in human promoter regionsBMC Genomics 9 (1) 112 2008

Nishida K Frith M and Nakai K Pseudo-counts for transcription factor binding sitesNucl Acids Res 37 939-944 2009 publishedonline on December 23 2008

Obayashi T Hayashi S Shibaoka M SaekiM Ohta H and Kinoshita K COXPRESdb adatabase of coexpressed gene networks inmammals Nucleic Acids Res 36 D77-82 2008

Obayashi T Hayashi S Saeki M Ohta Hand Kinoshita K ATTED-II provides coex-pressed gene networks for Arabidopsis Nu-cleic Acids Res 37 D987-991 2009

Okamura K and Nakai K Retrotranspositionas a source of new promoters Mol Biol Evol 25 (6) 1231-1238 2008

Sierro N Makita Y de Hoon M and NakaiK DBTBS a database of transcriptional regu-lation in Bacillus subtilis containing upstreamintergenic conservation information Nucl Ac-ids Res 36 D93-D96 2008

Sierro N Li S Suzuki Y Yamashita R andNakai K Spatial and temporal preferences fortrans-splicing in Ciona intestinalis revealed by


EST-based gene expression analysis Gene430 44-49 2009 available online on October21 2008

Shirota M Ishida T and Kinoshita K Effectsof surface-to-volume ratio of proteins on hy-drophilic residues decrease in occurrence andincrease in buried fraction Protein Sci 171596-1602 2008

Tsuchihara K Suzuki Y Wakaguri H IrieT Tanimoto K Hashimoto S MatsushimaK Mizushima-Sugano J Yamashita RNakai K Bentley D Esumi H and SuganoS Massive transcriptional start site analysis ofhuman genes in hypoxia cells Nucl Acids Resin press

Tsuchiya Y Nakamura H and Kinoshita KDiscrimination between biological interfacesand crystal-packing contacts Compt Biol Chem 1 99-113 2008

Vandenbon A Miyamoto Y Takimoto NKusakabe T and Nakai K Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction DNARes 15 (1) 3-11 2008

Vandenbon A and Nakai K Using simplerules on presence and positioning of motifsfor promoter structure modeling and tissuespecific expression prediction Genome Infor-matics Edited by Arthur J and Ng S-K (Im-

perial College Press London) vol 21 pp 188-199 2008

Wakaguri H Yamashita R Suzuki YSugano S and Nakai K DBTSS DataBase ofTranscription Start Sites progress report 2008Nucl Acids Res 36 D97-D101 2008

Yamashita R Suzuki Y Takeuchi N Wak-aguri H Ueda T Sugano S and Nakai KComprehensive detection of human terminaloligo-pyrimidine (TOP) gene and analysis oftheir characteristics Nucl Acids Res 36 (11)3707-3715 2008

Kinoshita K Kono H and Yura K Predictionof molecular interactions from 3D-structuresfrom small ligands to large protein complexesEdited by Bujnicki J (Wiley and Sons USA)in printing 2009伊倉貞吉木下賢吾伊藤暢聡ペプチジルプロリルイソメラーゼの構造機能相関蛋白質核酸酵素54167―1722009木下賢吾立体構造からのタンパク質機能予測現状と展望遺伝子医学MOOK14号in press中井謙太ポールホートン第3章 3アミノ酸配列に基づくタンパク質の細胞内局在予測実験医学増刊 vol261106―11122008中井謙太タンパク質のシステム生物学猪飼伏見卜部上野川中村浜窪編タンパク質の事典朝倉書店575―5782008


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and “minority-centered” scientific communication. We have conducted comparative political studies on stem cell research and on homecare services for ALS in East Asia. We also supported the “BioBank Japan” project from ethical, legal and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study on research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After Hwang Woo-suk’s fabrication of stem cell cloning results and the unethical collection of human eggs, the bioethics law was revised, and the government is seeking stricter regulation of life science and healthcare. We found some correlations between political options on stem cell research and on genetic testing in terms of regulation across East Asia.

2. Establishment of Office of Research Ethics (ORE)

Following the Dean’s courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After surveying past ethical reviews and conducting a comparative study of research ethics review systems in the US, the UK and South Korea, we identified the current problems that tend to stall a smooth research review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for “bench consulting” regarding consent, research protocols, and pre-review on research ethics for all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy 公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

3. Ethical, legal and social support for the “BioBank Japan” project

To support the “BioBank Japan” project, led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that younger participants thought that their personal analyzed data should be disclosed. The consent process was well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants’ understanding of the overview of the research, and may be unethical rather than ethical. If we long for “personalized medicine”, we should think further about constructing a “personalized consent process”, and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate MCs’ attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, although MCs explain that this project has no plans to disclose personal genotype data to each participant, a certain number of participants responded that they now want to see their own genotype data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain, encourage and thank participants each time, and participants recall their will to contribute.

To acknowledge participants’ and MCs’ contributions to the project, we issued “BioBank newsletters” three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are accessible only to sighted people with good eyesight, we are making efforts toward personalized information security to accommodate participants’ disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called “SciArt Café”, twice in 2008. Our intent with “SciArt Café” is to promote communication between scientists and those who do not regularly engage with science but love art. The 1st session, called “Rhythm generated by network”, was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called “Doing science, doing art”, was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1 Ishiyama I Nagai A Muto K Tamakoshi AKokado M Mimura K Tanzawa T Yama-

gata Z Relationship between Public Atti-tudes toward Genomic Studies Related to


Medicine and Their Level of Genomic Liter-acy in Japan American Journal of MedicalGenetics 146A (13) 696-706 2008

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝

学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008



universe. Among the KEGG suite of databases, the GENES database contains more than 4.2 million genes from over 1,000 organisms as of February 2009. Sequence similarities among these genes are calculated by all-against-all SSEARCH comparison and stored in the SSDB database. Based on those databases, the ORTHOLOGY database has been manually constructed to store the relationships among genes sharing the same biological function. However, in this strategy only well-known functions can be used for annotation of newly added genes, so the number of annotated genes is limited. To overcome this situation, we developed a fully automated procedure to find candidate orthologous clusters, including those whose current functional annotation is unknown. The method is based on a graph analysis of the SSDB database, treating genes as nodes and Smith-Waterman sequence similarity scores as edge weights. Clusters are found by our heuristic method for finding quasi-cliques, but the SSDB graph is too large for quasi-clique finding to be performed all at once. Therefore, we introduce a hierarchy (evolutionary relationship) of organisms and treat the SSDB graph as a nested graph. The automatic decomposition of the SSDB graph into a set of quasi-cliques results in the KEGG OC (Ortholog Cluster) database. We built a system that performs automatic updates of the ortholog clusters, which can be run on a weekly basis. As a result, we obtained 669,043 clusters, including 403,752 singleton clusters, from 3,721,464 protein-coding genes. Among them, only 2,309 clusters were shared across kingdoms, and the other clusters were kingdom specific. The automatic classification of our ortholog clusters is largely consistent with the manually curated ORTHOLOGY database. A web interface to search and browse genes in clusters is available at http://oc.kegg.jp/
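The idea of decomposing a similarity graph into quasi-cliques can be sketched as follows. This is a simple greedy stand-in, not the heuristic used for KEGG OC, and it ignores the organism hierarchy; the score cutoff `min_score`, the density parameter `gamma`, and the gene names are all hypothetical.

```python
def is_quasi_clique(nodes, adjacency, gamma):
    """True if every node is linked to at least gamma * (n - 1) other members."""
    nodes = set(nodes)
    if len(nodes) < 2:
        return True
    needed = gamma * (len(nodes) - 1)
    return all(len(adjacency.get(v, set()) & nodes) >= needed for v in nodes)


def greedy_quasi_cliques(edges, min_score, gamma):
    """Partition genes into quasi-cliques by greedy growth from a seed."""
    adjacency, genes = {}, set()
    for u, v, score in edges:
        genes.update((u, v))
        if score >= min_score:  # keep only sufficiently similar pairs
            adjacency.setdefault(u, set()).add(v)
            adjacency.setdefault(v, set()).add(u)
    unassigned, clusters = set(genes), []
    while unassigned:
        # Seed with the unassigned gene that has the most unassigned neighbours.
        seed = max(sorted(unassigned),
                   key=lambda g: len(adjacency.get(g, set()) & unassigned))
        cluster, grown = {seed}, True
        while grown:
            grown = False
            for cand in sorted(unassigned - cluster):
                linked = adjacency.get(cand, set()) & cluster
                if linked and is_quasi_clique(cluster | {cand}, adjacency, gamma):
                    cluster.add(cand)
                    grown = True
        clusters.append(sorted(cluster))
        unassigned -= cluster
    return clusters


# Hypothetical Smith-Waterman scores between six genes.
edges = [("a", "b", 300), ("a", "c", 280), ("b", "c", 310),
         ("c", "d", 90), ("d", "e", 250), ("e", "f", 40)]
print(greedy_quasi_cliques(edges, min_score=100, gamma=0.6))
```

Genes with no sufficiently similar partner end up as singleton clusters, which mirrors the large share of singletons reported above.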

3. EGENES: A database for expressed sequence tag indices of plant species

Shuichi Kawashima, Yuki Moriya, Toshiaki Tokimatsu, Susumu Goto and Minoru Kanehisa

EGENES is a knowledge-based database for efficient analysis of plant expressed sequence tags (ESTs), which was recently added to KEGG PLANT. It links plant genomic information with higher-order functional information in KEGG. The genomic information in EGENES is a collection of EST contigs constructed from assembled plant ESTs by using EGassembler. The EST indices are automatically annotated with KEGG Orthology identifiers (K numbers) by the KEGG Automatic Annotation Server (KAAS). Currently, it contains 2,452,094 sequence catalogues (779,490 contigs and 1,672,604 singletons) in 62 plants. 25% of the sequences are assigned K numbers. EGENES is available at http://www.genome.jp/kegg/plant/pln_list.html

4. KEGG API: SOAP/WSDL interface for the KEGG system

Shuichi Kawashima, Toshiaki Katayama and Minoru Kanehisa

KEGG is a suite of databases and associated software integrating our current knowledge of molecular interaction/reaction pathways and other systemic functions (PATHWAY and BRITE databases), information about the genomic space (GENES database), and information about the chemical space (LIGAND and DRUG databases). To facilitate large-scale programmatic applications of the KEGG system, we have been developing and maintaining the KEGG API as a stable SOAP/WSDL-based web service. The KEGG API is available at http://www.genome.jp/kegg/soap/

5. KEGG DAS: Comprehensive repository for community genome annotation

Toshiaki Katayama, Mari Watanabe and Minoru Kanehisa

KEGG DAS is an advanced genome database system providing a DAS (Distributed Annotation System) service for all bacterial organisms in the GENOME database in KEGG. Currently, KEGG DAS contains over 8 million annotations assigned to the genome sequences of 817 organisms (increased from 615 organisms last year). The KEGG DAS server provides gene annotations linked to the KEGG PATHWAY and LIGAND databases. In addition to the coding genes, information on non-coding RNAs predicted using the Rfam database is also provided to fill in the annotation of the intergenic regions of the genomes. The KEGG DAS service is available at http://das.hgc.jp/

6. Full-Arthropods: Constructing full-length cDNA of pathogenetic arthropods

Toshiaki Katayama, Shuichi Kawashima, Miho Usui, Hiroyuki Wakaguri, Eri Kibukawa, Masahide Sasaki, Kazuhisa Hiranuka, Ryuichiro Maeda, Yutaka Suzuki, Sumio Sugano and Junichi Watanabe

The Anopheles mosquito, tsetse fly, tsutsugamushi mite and dust mite are arthropods known to be medically important because they either transmit various infectious diseases, including malaria and Japanese river fever, or cause allergies such as asthma and dermatitis. Because of the serious medical problems they cause, their genomes have been extensively analyzed recently. We have produced libraries of the four organisms and are constructing their databases for functional genome analysis. Full-Arthropods is available at http://fullarth.hgc.jp/

7. Full-Entamoeba: a database for the full-length cDNA library of Entamoeba

Toshiaki Katayama, Kazushi Hiranuka, Masahiro Kumagai, Yutaka Suzuki, Sumio Sugano, Atsushi Toyoda, Asao Makioka, Junichi Watanabe

Entamoeba histolytica is a protozoan parasite that predominantly infects humans and other primates and causes amebiasis. E. histolytica is estimated to infect about 50 million people worldwide, and amebiasis is estimated to cause 70,000 deaths per year. Full-Entamoeba, a database for full-length cDNAs from a human parasite, E. histolytica, and a reptilian parasite, E. invadens, has been produced. The full-length cDNA libraries were produced using the oligo-capping method from trophozoites of each species cultivated axenically. A total of 5,000 5’-end one-pass sequences of cDNAs from the two species were compared with the non-redundant database of DDBJ/GenBank/EMBL using the BLAST and TBLASTX programs. These clones are available for further analysis and experiments. The Full-Entamoeba database is available at http://fullent.hgc.jp/

8. Analysis of sequence catalogs of the house dust mite, Dermatophagoides farinae

Shuichi Kawashima, Atsushi Toyoda, Junichi Watanabe, Sadao Nogami and Minoru Kanehisa

The house dust mite is a cosmopolitan guest in human habitation and one of the multicellular organisms most closely associated with our life. It is now well established that dust mites are a major source of allergens causing bronchial asthma, allergic rhinitis and atopic dermatitis. Dermatophagoides farinae (American house dust mite) and D. pteronyssinus (European house dust mite) are the two most common species in the temperate zone. We produced cDNA libraries of our D. farinae sample, containing young nymphs and adults, using the vector trapper method and sequenced both ends of 11,520 cDNA clones. Cleaning, clustering and assembling of the raw sequences produced 3,031 contigs and 4,281 singletons. 1,797 of the total 7,312 unique sequences were assigned KEGG Orthology by the KAAS system. More than 30% of the sequences showed significant matches to the KEGG GENES database, which includes the well-characterized Der f group 1 allergens. We predicted 1,109 peptides longer than 20 amino acids from the 3,031 contigs. Some of the peptides are predicted by NetMHC 3.0 to contain 9-mer peptides with strong affinities to MHC class II. We expect that these in silico analyses will pave the way toward prediction of allergens from D. farinae.

9. HiGet and SSS: Search engines for large-scale biological databases

Toshiaki Katayama, Shuichi Kawashima, Kazuhiro Ohi, Kenta Nakai and Minoru Kanehisa

The number of entries in biological databases is increasing exponentially year by year. For example, there were 10,106,023 entries in the GenBank database in the year 2000, which has now grown to 98,868,465 (Release 169 + daily updates). To allow such a vast amount of data to be searched at high speed, we have developed a high-performance database entry retrieval system named HiGet. For this purpose, the system is built on HiRDB, a commercial ORDBMS (Object-oriented Relational Database Management System) developed by Hitachi, Ltd. HiGet can perform full-text search on various biological databases, including GenBank, RefSeq, UniProt, Prosite, OMIM and PDB. An additional advantage of the HiGet system is its capability for field-specific search, which enables users to narrow down the number of results and is especially useful for collecting sequences for their specific needs. We have also developed a sequence similarity search (SSS) service to find homologous sequences with various algorithms, including BLAST, FASTA, SSEARCH, TRANS and EXONERATE. This variety of options is unique among public services, and users can select an appropriate method to search for similar sequences according to their query. Because algorithms such as TRANS and EXONERATE are highly time-consuming, the SSS service is backed by a distributed computing environment using the Sun Grid Engine on our supercomputer system. The HiGet and SSS services are available at http://higet.hgc.jp/ and http://sss.hgc.jp/, respectively.


10. Linear-time protein 3-D structure searching algorithm

Tetsuo Shibuya

Finding similar structures in 3-D structure databases of proteins (or other molecules) is becoming one of the most important issues in post-genomic molecular biology. To compare the 3-D structures of two molecules, biologists mostly use the RMSD (root mean square deviation) as the similarity measure. The RMSD is one of the most fundamental similarity measures, used in various fields such as computer vision and robotics for comparing two sets of coordinates. In this research, we propose new theoretically and practically fast algorithms for the basic problem of finding all the substructures of structures in a structure database of chain molecules (such as proteins) whose RMSDs to the query are within a given constant threshold. The best-known worst-case time complexity for the problem is $O(N \log m)$, where $N$ is the database size and $m$ is the query size. The previous best-known expected time complexity for the problem is also $O(N \log m)$. In this research, we propose a new breakthrough linear-expected-time algorithm. It is not only a theoretically significant improvement over previous algorithms, but also a practically faster algorithm according to computational experiments. We also propose a series of preprocessing algorithms that enable faster queries, though no indexing algorithm had previously been known whose query time complexity is better than the above $O(N \log m)$ bound. One is an $O(N \log^2 N)$-time and $O(N \log N)$-space preprocessing algorithm with expected query time complexity of $O(m + N/m^{0.5})$. Another is an $O(N \log N)$-time and $O(N)$-space preprocessing algorithm with expected query time complexity of $O(N/m^{0.5} + m \log(N/m))$.
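For reference, the RMSD between a query $A = (\mathbf{a}_1, \dots, \mathbf{a}_m)$ and a candidate substructure $B = (\mathbf{b}_1, \dots, \mathbf{b}_m)$ of the same length is commonly defined as the minimum over all rigid motions. The symbols $\mathbf{a}_i$, $\mathbf{b}_i$, $R$ and $\mathbf{v}$ are our notation for this sketch and do not appear in the text above.

```latex
\mathrm{RMSD}(A, B) \;=\; \min_{R,\,\mathbf{v}}
\sqrt{\frac{1}{m} \sum_{i=1}^{m}
\bigl\lVert \mathbf{a}_i - \left( R\,\mathbf{b}_i + \mathbf{v} \right) \bigr\rVert^{2}}
```

Here $R$ ranges over rotation matrices and $\mathbf{v}$ over translation vectors, so a search must in principle evaluate this quantity for each of the roughly $N$ length-$m$ windows in the database.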

We also extend the above linear-time algorithm into an algorithm with expected query time complexity of $O(m + N/m^{1-\varepsilon})$, where $\varepsilon$ is an arbitrarily small constant such that $0 < \varepsilon < 1$. We furthermore extend the above linear-time algorithm so that it can deal with insertions and deletions.

We checked the performance of our linear-expected-time algorithm through computational experiments over the whole PDB database. The experiments show that our algorithm is much faster than the previous algorithms. For example, our algorithm is 3.6 to 28 times faster than previously known algorithms when searching for similar substructures whose RMSDs are within 1 Å of queries of ordinary lengths. The experiments also show that the above theoretical results are consistent with the experimental results. In other words, the actual computation time of our linear-expected-time algorithm is not influenced by differences in query length, in contrast to previous algorithms.

11. Fast hinge detection algorithm in protein structures

Tetsuo Shibuya

Analysis of conformational changes is one of the keys to understanding protein functions and interactions. For this analysis, we often compare two protein structures, taking flexible regions such as hinge domains into consideration. The RMSD (Root Mean Square Deviation) is the most popular measure for comparing two protein structures, but it applies only to rigid structures without hinge domains. In this research, we propose a new measure called RMSDh (Root Mean Square Deviation considering hinges) and its variant RMSDh(k) for comparing two flexible proteins with hinge domains. We also propose novel, efficient algorithms for computing them, which can detect the hinge positions at the same time. The RMSDh is suitable for cases where there is one small hinge domain in each of the two target structures. The new algorithm for computing the RMSDh runs in linear time, which is the same as the time complexity of computing the RMSD and is faster than any previous algorithm for hinge detection. The RMSDh(k) is designed for comparing structures with more than one hinge domain. The RMSDh(k) measure considers at most k small hinge domains, i.e., the RMSDh(k) value should be small if the two structures are similar except for at most k hinge domains. To compute the value, we propose an $O(kn^2)$-time and $O(n)$-space algorithm based on a new dynamic programming technique. We also test our measures against both flexible and non-flexible protein structures, and show that the hinge positions can be correctly detected by our algorithms.

12. Fast flexible protein structure alignment

Kohichi Suematsu and Tetsuo Shibuya

The hinge detection algorithm described in section 11 considers only rigid hinge points, but hinges sometimes bend slightly themselves, which can lead to inaccurate prediction of hinge positions. Thus, we incorporated the notion of a ‘bending hinge’ to detect such hinge positions. We developed a very efficient heuristic algorithm for finding such bending hinges, as the exact algorithm for this problem requires exponential time. For the algorithm, we developed a detailed score matrix for comparing local structures based on naïve Bayes learning.

13. Protein function prediction based on 3-D structure motifs

Chia-Han Chu, Hiroki Sakai and Tetsuo Shibuya

Protein functions are said to be determined by their 3-D structures, but not all functions are known to be related to particular 3-D structure motifs. The geometric suffix tree, a data structure for indexing 3-D protein structures that was also developed by us, enables comprehensive enumeration of all the possible structural motifs among a given set of proteins. We are developing a new algorithm based on the support vector machine that predicts a protein’s function from its 3-D structure. This algorithm utilizes all the possible 3-D motifs found by using the geometric suffix tree.

14. Suffix array construction with a lazy scheme

Ben Hachimori and Tetsuo Shibuya

The suffix array is one of the most important indexing data structures for strings over an alphabet, including DNA sequences, RNA sequences, protein sequences, web pages, the Medline database, and so on. But even the most sophisticated algorithm for constructing the suffix array requires a lot of time. We developed a new efficient lazy algorithm that computes the suffix array only after we get the query. By doing so, we have to compute only the necessary part of the suffix array. We developed a lazy algorithm based on the Schürmann-Stoye algorithm, which is more efficient than both the Boyer-Moore algorithm and other suffix tree-based algorithms when the number of queries is limited.
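For readers unfamiliar with the data structure, the sketch below builds a suffix array eagerly by naive sorting and answers a pattern query by binary search. It is a baseline illustration only, not the lazy scheme described above; the function names and the example text are ours.

```python
def build_suffix_array(text):
    """Start positions of all suffixes of text, in lexicographic order.

    Naive construction by sorting suffixes; fine for short examples only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, suffix_array, pattern):
    """Return the sorted start positions where pattern occurs in text."""
    m = len(pattern)

    def prefix(rank):
        start = suffix_array[rank]
        return text[start:start + m]

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo
    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(suffix_array[first:lo])


text = "GATTACATTAC"
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "TTAC"))
```

All occurrences of a pattern occupy one contiguous block of the array, which is why a query touches only a small part of it and why computing that part on demand can pay off when queries are few.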

15. Color space-DNA sequence mapping/alignment algorithm

Ben Hachimori and Tetsuo Shibuya

Applied Biosystems' SOLiD system encodes the DNA sequence into a sequence of a data type called the color space, in which one of 4 fluorescent colors is assigned to each of the 16 possible ordered pairs of adjacent bases. However, no algorithm has been known that aligns/maps a color-space sequence to a DNA sequence while distinguishing experimental errors from actual mutations. We developed an alignment algorithm that distinguishes experimental errors from actual DNA mutations when aligning color-space data against ordinary DNA sequences. Moreover, we computed the optimal score table for the alignment based on actual E. coli data.
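The difference between a mutation and a read error in color space can be seen in a short sketch. It assumes the commonly described two-base encoding in which each base is given a 2-bit code (A=0, C=1, G=2, T=3) and the color of a pair is the XOR of the two codes; the function names are hypothetical, and the sketch illustrates the encoding only, not the alignment algorithm itself.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode(dna):
    """Map each pair of adjacent bases to one of 4 colors (0-3)."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(dna, dna[1:])]


def decode(first_base, colors):
    """Recover the bases from the first base and the color sequence."""
    bases = [first_base]
    for color in colors:
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ color])
    return "".join(bases)


def differing_positions(a, b):
    """Indices at which two equal-length sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


reference = encode("ATGGC")   # colors of the reference sequence
mutated = encode("ATCGC")     # one substituted base (G -> C at index 2)
snp_signature = differing_positions(reference, mutated)

# A single miscalled color corrupts every base decoded after it.
miscalled = decode("A", [3, 2, 0, 3])
```

Under this encoding a single-base substitution changes two adjacent colors, whereas an isolated color change is more likely a measurement error; this is the kind of distinction a color-space-aware score table can exploit.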

16. Genotype clustering based on hidden Markov models

Ritsuko Onuki, Tetsuo Shibuya, and Minoru Kanehisa

Haplotype clustering is important for gene mapping of human disease. Despite its importance for the analysis, haplotype data are difficult to obtain with present experimental techniques because of cost and error rate. Genotypes are much easier to obtain than haplotypes. In this work, we propose a new method for clustering genotypes. In the algorithm, we first infer multiple haplotype candidates from each genotype, and next calculate the distances between genotypes based on the results of the haplotype inference. Then we perform genotype clustering based on the distances. We evaluated our algorithm by applying it to several actual genotype data sets.
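To make the first step concrete, the following sketch enumerates the haplotype pairs that are consistent with a genotype. It is a brute-force illustration, not the hidden Markov model inference used in this work. The 0/1/2 genotype coding (number of copies of one allele per site) and the helper names are assumptions made for the example.

```python
from itertools import product


def haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with a 0/1/2 genotype."""
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        # Homozygous sites are fixed: 0 -> allele 0 on both, 2 -> allele 1 on both.
        h1 = [g // 2 for g in genotype]
        h2 = list(h1)
        # Heterozygous sites carry one copy of each allele; the phase is unknown.
        for site, bit in zip(het_sites, bits):
            h1[site], h2[site] = bit, 1 - bit
        pairs.add(frozenset((tuple(h1), tuple(h2))))
    return pairs


def shared_haplotypes(genotype_a, genotype_b):
    """Haplotypes that appear among the candidates of both genotypes."""
    haps_a = set().union(*haplotype_pairs(genotype_a))
    haps_b = set().union(*haplotype_pairs(genotype_b))
    return haps_a & haps_b


candidates = haplotype_pairs((1, 1, 0))
common = shared_haplotypes((1, 1, 0), (0, 2, 0))
```

A genotype with k heterozygous sites has 2^(k-1) candidate pairs for k >= 1, which is why a probabilistic model is needed to rank candidates in realistic data. A distance between genotypes could then be derived from how much their candidate haplotypes overlap; the overlap function here is only one hypothetical choice.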

Publications

Kanehisa, M., Araki, M., Goto, S., Hattori, M., Hirakawa, M., Itoh, M., Katayama, T., Kawashima, S., Okuda, S., Tokimatsu, T., and Yamanishi, Y. KEGG for linking genomes to life and the environment. Nucleic Acids Research 36, D480-D484, 2008.

Kawashima, S., Pokarowski, P., Pokarowska, M., Kolinski, A., Katayama, T., Kanehisa, M. AAindex: amino acid index database, progress report 2008. Nucleic Acids Research 36, D202-D205, 2008.

Okuda, S., Yamada, T., Hamajima, M., Itoh, M., Katayama, T., Bork, P., Goto, S., and Kanehisa, M. KEGG Atlas mapping for global analysis of metabolic pathways. Nucleic Acids Research 36, W423-426, 2008.

Wakaguri, H., Suzuki, Y., Katayama, T., Kawashima, S., Kibukawa, E., Hiranuka, K., Sasaki, M., Sugano, S., and Watanabe, J. Full-Malaria/Parasites and Full-Arthropods: database of full-length cDNAs of parasites and arthropods, update 2009. Nucleic Acids Research 37, D520-D525, 2008.

Yamanishi, Y., Araki, M., Gutteridge, A., Honda, W., and Kanehisa, M. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24, i232-i240, 2008.

Takarabe, M., Okuda, S., Itoh, M., Tokimatsu, T., Goto, S., and Kanehisa, M. Network analysis of adverse drug interactions. Genome Informatics 20, 252-259, 2008.

Hashimoto, K., Yoshizawa, A.C., Okuda, S., Kuma, K., Goto, S., and Kanehisa, M. The repertoire of desaturases and elongases reveals fatty acid variations in 56 eukaryotic genomes. J. Lipid Res. 49, 183-191, 2008.

Shibuya, T. Fast Hinge Detection Algorithms for Flexible Protein Structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics, to appear.

Shibuya, T. Searching Protein 3-D Structures in Linear Time. Proc. 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2009), 2009, to appear.

Shibuya, T. Linear-Time Algorithm for Searching Protein 3-D Structures. IPSJ SIG Notes, SIGAL 123-4, 2009, to appear.

Suematsu, K., Shibuya, T. Flexible Protein Alignment of 3D-Structures Allowing Dynamic Transformation. IPSJ SIG Notes, SIGBIO 12-12, 2008, pp. 87-94.

Honda, W., Tanabe, M., Yano, A., Kanehisa, M. Bioinformatics, systems biology, and KEGG (in Japanese). Seikagaku 80, 1094-1111, 2008.


Human Genome Center

Laboratory of DNA Information Analysis

Professor Satoru Miyano, Ph.D.
Associate Professor Seiya Imoto, Ph.D.
Assistant Professor Masao Nagasaki, Ph.D.
Project Lecturer Rui Yamaguchi, Ph.D.
Project Assistant Professor Yoshinori Tamada, Ph.D.

Recent advances in biomedical research have been producing large-scale, ultra-high dimensional, ultra-heterogeneous data. Given this progress in post-genomic research, our current mission is to create computational strategies for systems biology and medicine towards translational bioinformatics. With this mission, we have been developing computational methods for understanding life as a system and applying them to practical issues in medicine and biology.

1. Computational Systems Biology

a. Systematic reconstruction of TRANSPATH data into Cell System Markup Language

Masao Nagasaki, Ayumu Saito, Chen Li, Euna Jeong, Satoru Miyano

Many biological repositories store information based on experimental study of the biological processes within a cell, such as protein-protein interactions, metabolic pathways, signal transduction pathways, or regulations of transcription factors and miRNA. Unfortunately, it is difficult to use such information directly when generating simulation-based models. Thus, modeling rules for encoding biological knowledge into system-dynamics-oriented standardized formats would be very useful for a full understanding of cellular dynamics at the system level. We selected the TRANSPATH database, a manually curated high-quality pathway database, which provides a rich source of cellular events in humans, mice, and rats curated from over 31,500 papers. In this work, we defined 16 modeling rules based on the hybrid functional Petri net with extension (HFPNe), which is suitable for graphical representation and simulation of biological processes. In these modeling rules, each Petri net element is associated with the Cell System Ontology (CSO) to enable semantic interoperability of models. As a formal ontology for biological pathway modeling with dynamics, CSO also defines biological terminology and corresponding icons. By combining HFPNe with the CSO features, we devised a method for transforming TRANSPATH data into simulation-based, semantically valid models. The results are encoded in a biological pathway format, Cell System Markup Language (CSML), which eases the exchange and integration of biological data and models. By using the 16 modeling rules, 97% of the reactions in TRANSPATH are converted into simulation-based models represented in CSML. This reconstruction demonstrated that it is possible to use our rules to generate quantitative models from static pathway descriptions.

b. Finding optimal Bayesian network given a super-structure

Eric Perrier, Seiya Imoto, Satoru Miyano

Conventional approaches for learning Bayesian network structure from data have disadvantages in terms of complexity and lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm considerably improves accuracy and speed: it learns a skeleton with an independency test (IT) approach and constrains the directed acyclic graphs considered during the search-and-score phase. Subsequently, we defined the structural constraint by introducing the concept of a super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We developed a super-structure constrained optimal search (COS): its time complexity is upper bounded by O(γ_m^n), where γ_m < 2 depends on the maximal degree m of S. Empirically, complexity depends on the average degree m', and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. Practically, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, enabling control of the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches.
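The effect of the constraint on the search space can be sketched in a few lines. The example below only enumerates the parent sets that a super-structure permits for each node; it does not implement the COS search or any scoring. The toy graph `S` and the function name are hypothetical.

```python
from itertools import combinations


def candidate_parent_sets(superstructure, node):
    """Return every parent set of `node` allowed by the super-structure."""
    neighbours = sorted(superstructure[node])
    return [
        frozenset(subset)
        for size in range(len(neighbours) + 1)
        for subset in combinations(neighbours, size)
    ]


# Toy super-structure: an undirected chain A - B - C - D (adjacency sets).
S = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}

constrained = {v: len(candidate_parent_sets(S, v)) for v in S}
# Without a constraint, any subset of the other n - 1 nodes may be a parent set.
unconstrained = 2 ** (len(S) - 1)
```

With a node of degree d in S, only 2^d parent sets need to be scored instead of 2^(n-1), which is why the sparseness of S governs the running time.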

c. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models

Osamu Hirose, Ryo Yoshida¹, Seiya Imoto, Rui Yamaguchi, Tomoyuki Higuchi¹, D. Stephen Charnock-Jones², Cristin Print³, Satoru Miyano: ¹Institute of Statistical Mathematics, ²Cambridge University, ³University of Auckland

We developed a novel method based on the state space model to identify transcriptional modules and module-based gene networks simultaneously. The state space model has the potential to infer large-scale gene networks, e.g., of order 10^3 genes, from time-course gene expression profiles. In particular, we succeeded in identifying a cell cycle system by using the gene expression profiles of Saccharomyces cerevisiae, in which the length of the time course and the number of genes were 24 and 4,382, respectively. However, when analyzing shorter time-course data, e.g., of length 10 or less, the parameter estimation of the state space model often fails due to overfitting. To extend the applicability of the state space model, we provided an approach that uses the technical replicates of gene expression profiles, which are often measured in duplicate or triplicate. The use of technical replicates is important for achieving highly efficient inference of gene networks with short time-course data. The potential of the proposed method was demonstrated through the time-course analysis of the gene expression profiles of human umbilical vein endothelial cells undergoing growth factor deprivation-induced apoptosis.

d. Predicting differences in gene regulatory systems by state space models

Rui Yamaguchi, Seiya Imoto, Mai Yamauchi, Masao Nagasaki, Ryo Yoshida¹, Teppei Shimamura, Yosuke Hatanaka, Kazuko Ueno, Tomoyuki Higuchi¹, Noriko Gotoh, Satoru Miyano

We developed a statistical method to predict differentially regulated genes of case and control samples from time-course gene expression data by leveraging the unpredictability of the expression patterns from the underlying regulatory system inferred by a state space model. The proposed method can screen out genes that show different patterns but are generated by the same regulation in both samples, since these patterns can be predicted by the same model. Our strategy consists of three steps. Firstly, a gene regulatory system is inferred from the control data by a state space model. Then, the obtained model for the underlying regulatory system of the control sample is used to predict the case data. Finally, by assessing the significance of the difference between the case and predicted-case time-course data of each gene, we are able to detect the unpredictable genes, which are candidates for the key differences between the regulatory systems of case and control cells. We illustrate the whole process of the strategy by an actual example, where human small airway epithelial cell gene regulatory systems were generated from novel time courses of gene expression following treatment with (case)/without (control) the drug gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase. In the gefitinib response data, we succeeded in finding unpredictable genes that are candidates for the specific targets of gefitinib. We also discussed differences in regulatory systems for the unpredictable genes. The proposed method would be a promising tool for identifying biomarkers and drug target genes.
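The three steps can be sketched with a deliberately simplified stand-in for the state space model. In the sketch below each gene is modelled independently by a first-order autoregression x[t] = a * x[t-1], which is an assumption made for brevity and not the model used in this work. The gene names and expression values are invented toy data.

```python
def fit_ar1(series):
    """Least-squares estimate of a in x[t] = a * x[t-1]."""
    numerator = sum(prev * cur for prev, cur in zip(series, series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator


def prediction_error(a, series):
    """Mean squared one-step-ahead prediction error under coefficient a."""
    errors = [(cur - a * prev) ** 2 for prev, cur in zip(series, series[1:])]
    return sum(errors) / len(errors)


# Toy time courses (hypothetical values).
control = {"geneA": [8.0, 4.0, 2.0, 1.0], "geneB": [9.0, 3.0, 1.0, 1.0 / 3.0]}
case = {"geneA": [4.0, 2.0, 1.0, 0.5], "geneB": [9.0, 9.0, 9.0, 9.0]}

# Step 1: infer the dynamics from the control data.
model = {gene: fit_ar1(series) for gene, series in control.items()}
# Steps 2 and 3: predict the case data with the control model and score the misfit.
unpredictability = {gene: prediction_error(model[gene], case[gene]) for gene in case}
```

In this toy data geneA has a different expression level in the case sample but follows the same dynamics, so the control model predicts it well; geneB does not decay in the case sample and is flagged as unpredictable. A significance test on the error would replace the raw comparison in practice.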

e. Bayesian learning of biological pathways on genomic data assimilation

Ryo Yoshida¹, Masao Nagasaki, Rui Yamaguchi, Seiya Imoto, Satoru Miyano, Tomoyuki Higuchi¹

Mathematical modeling and simulation based on biochemical rate equations provide us with a rigorous tool for unraveling complex mechanisms of biological pathways. To proceed to simulation experiments, an essential first step is to find effective values of model parameters, which are difficult to measure in in vivo and in vitro experiments. Furthermore, once a set of hypothetical models has been created, a statistical criterion is needed to test the ability of the constructed models and to proceed to model revision. We developed a new statistical technology for data-driven construction of in silico biological pathways. The method starts with knowledge-based modeling with the hybrid functional Petri net. It then proceeds to Bayesian learning of the model parameters for which experimental data are available. This process exploits quantitative measurements of evolving biochemical reactions, e.g., gene expression data. Another important issue that we consider is the statistical evaluation and comparison of the constructed hypothetical pathways. For this purpose, we have developed a new Bayesian information-theoretic measure that assesses the predictability and the biological robustness of in silico pathways.

f. Modeling nonlinear gene regulatory networks from time series gene expression data

André Fujita, João Ricardo Sato⁵, Humberto Miguel Garay-Malpartida⁵, Mari Cleide Sogayar⁵, Carlos Eduardo Ferreira⁵, Satoru Miyano: ⁵University of São Paulo

In cells, molecular networks such as gene regulatory networks are the basis of biological complexity. Therefore, gene regulatory networks have become the core of research in systems biology. Understanding the processes underlying the several extracellular regulators, signal transduction, protein-protein interactions, and differential gene expression processes requires a detailed molecular description of the protein and gene networks involved. To better understand these complex molecular networks and to infer new regulatory associations, we developed a statistical method based on vector autoregressive models and Granger causality to estimate nonlinear gene regulatory networks from time series microarray data. Most of the models available in the literature assume linearity in the inference of gene connections; moreover, these models do not infer directionality in these connections. Thus, a priori biological knowledge is required. However, in pathological cases, no a priori biological information is available. To overcome these problems, we present the nonlinear vector autoregressive (NVAR) model. We have applied the NVAR model to estimate nonlinear gene regulatory networks based entirely on gene expression profiles obtained from DNA microarray experiments. We showed the results obtained by NVAR through several simulations and by the construction of three actual gene regulatory networks (p53, NF-κB, and c-Myc) for HeLa cells.

g. Fast grid layout algorithm for biological networks with sweep calculation

Kaname Kojima, Masao Nagasaki, Satoru Miyano

Properly drawn biological networks are of great help in the comprehension of their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties, such as aligning vertices having the same attributes and complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for applications to pathway databases, due to their high computational cost. In addition, they do not consider edge directions, and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features in biological networks. We devised a new calculation method termed sweep calculation and reduced the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conducted practical experiments by using 95 pathway models of various sizes from TRANSPATH and showed that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduced a new component that penalizes undesirable edge directions to avoid the lack of traceability in pathways due to the differences in direction between in-edges and out-edges of each vertex.


h. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data

Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

Recently, the nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters due to the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation, and the size of the gene set must be strongly limited. To address this limitation, we employed the L1 regularization technique to estimate NVAR. Under L1 regularization, direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than with existing methods. Through a simulation study, we verified the effectiveness of the proposed method by comparing its limitation on the number of genes to that of the existing NVAR. The proposed method was also applied to time series microarray data of the human HeLa cell cycle.
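How L1 regularization selects a few parents by setting other coefficients exactly to zero can be shown with a minimal coordinate-descent sketch for an ordinary linear model. It is a generic lasso solver written for illustration, not the NVAR estimation procedure, and the design matrix and response below are toy values.

```python
def soft_threshold(value, lam):
    """Shrink `value` toward zero by `lam`; return 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=100):
    """Minimise (1/2n) * ||y - X b||^2 + lam * ||b||_1 by cyclic updates."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # squared norm of column j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Toy data: y depends strongly on column 0 and only weakly on column 1.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.1, 1.9, -1.9, -2.1]
beta = lasso_coordinate_descent(X, y, lam=0.5)
```

The weak coefficient is driven exactly to zero while the strong one is only shrunk, so the nonzero entries can be read directly as the selected regulators.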

i. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates

André Fujita, Luciana Rodrigues Gomes⁵, João Ricardo Sato⁶, Rui Yamaguchi, Carlos Eduardo Thomaz⁷, Mari Cleide Sogayar⁵, Satoru Miyano: ⁶Universidade Federal do ABC, ⁷Centro Universitário da FEI

Principal Component Analysis (PCA) combined with Maximum-entropy Linear Discriminant Analysis (MLDA) was applied in order to identify genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely: (i) differences in gene expression levels between normal and tumoral conditions from a univariate point of view; (ii) in a multivariate fashion using MLDA; and (iii) with a dependence network approach. Our results show that malignant transformation in the prostatic tissue is more related to functional connectivity changes in their dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1, and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions and were also classified as the top seven most informative genes for the prostate cancer genesis process by our discriminant analysis. Moreover, among the identified genes we found classically known biomarkers and genes that are closely related to tumoral prostate, such as KLK3 and KLK2, and several other potential ones. We have demonstrated that changes in functional connectivity may be implicit in the biological process that renders some genes more informative for discriminating between normal and tumoral conditions. Using the proposed method, namely MLDA, to analyze the multivariate characteristics of genes, it was possible to capture the changes in dependence networks that are related to cell transformation.

j. Rule-based reasoning for system dynamics in cell systems

Euna Jeong, Masao Nagasaki, Satoru Miyano

A system-dynamics-centered ontology called the Cell System Ontology (CSO) has been developed for the representation of diverse biological pathways. Many of the pathway data based on the ontology have been created from databases via data conversion or curated by expert biologists. It is essential to validate the pathway data, which may otherwise cause unexpected issues such as semantic inconsistency and incompleteness. This paper discusses three criteria for validating the pathway data based on CSO: (1) structurally correct models in terms of Petri nets, (2) biologically correct models that capture biological meaning, and (3) systematically correct models that reflect biological behaviors. At the same time, we have investigated how logic-based rules can be used with the ontology to extend its expressiveness and to complement the ontology by reasoning, which aims at qualifying pathway knowledge. Finally, we show how the proposed approach helps in exploring dynamic modeling and simulation tasks without prior knowledge.

k. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human

Yosuke Hatanaka, Masao Nagasaki, Rui Yamaguchi, Takeshi Obayashi, Kazuyuki Numata, André Fujita, Teppei Shimamura, Yoshinori Tamada, Seiya Imoto, Kengo Kinoshita, Kenta Nakai, Satoru Miyano

We reported various transcription factor binding sites (TFBSs) conserved among co-expressed genes in human promoter regions using expression and genomic data. Assuming that similar promoter structure induces similar transcriptional regulation and hence a similar expression profile, we compared the promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start site (TSS) given by DBTSS, and promoter similarity searches were conducted among the coexpressing gene data provided by the newly developed COXPRESdb. Using a combination of Position Weight Matrix (PWM) motif prediction and the bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket method analysis to seek combinatorial activities of those conserved TFBSs.
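PWM-based motif prediction can be illustrated with a minimal scanner. The sketch below builds log-odds scores from base counts and slides the matrix along a sequence; the counts, the uniform background of 0.25, the pseudocount, and the toy promoter sequence are all assumptions for the example, and no bootstrap significance test is included.

```python
import math


def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into log2-odds scores."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; return (start, score) pairs."""
    width = len(pwm)
    return [
        (start, sum(pwm[j][sequence[start + j]] for j in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


# Toy count matrix for a TATA-like motif observed in 8 aligned sites.
counts = [{"T": 8}, {"A": 8}, {"T": 8}, {"A": 8}]
pwm = log_odds_pwm(counts)
hits = scan("GGCTATAGGC", pwm)
best_start, best_score = max(hits, key=lambda hit: hit[1])
```

In practice, each window score would be compared against a threshold or a resampled null distribution before a site is called; here the best window is simply reported.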

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou⁸, Yuto Ikegami⁸, Hiroshi Matsuno⁸, Satoru Miyano, Shin-ichi T. Inouye⁸: ⁸Yamaguchi University

Circadian rhythms of living organisms are 24-hr oscillations found in behavior, biochemistry, and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of a computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observation that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando⁹, Sadanori Konishi¹⁰, Seiya Imoto: ⁹Keio University, ¹⁰Kyushu University

The problem of constructing nonlinear regression models is investigated to analyze data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlap between basis functions and incorporates the information of the input and response variables. By using the radial basis functions, we constructed nonlinear regression models with the help of the technique of regularization. Crucial issues in the model building process are the choices of the hyperparameter, the number of basis functions, and a smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification of both distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained by using the hyperparameter in the basis functions.

b. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome

Jin Hwan Do, Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for the characterization of fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most of the prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, the homology-based approach has an intrinsic limitation with respect to unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profile of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile; that is, the DNA regions including secondary metabolic signature genes, such as polyketide synthase, nonribosomal peptide synthase, and/or dimethylallyl tryptophan synthase, consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by the transcription factor LaeA. Our result could be used for the identification of other fungal secondary metabolite gene clusters, especially those that are severely regulated by LaeA or by other proteins with a function similar to LaeA.
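A window-averaged profile of the kind analyzed above can be sketched as follows for GC content. Computing DNA curvature requires a structural model of the helix and is not reproduced here; the window size, step, and toy sequence are assumptions chosen only to show the sliding-window calculation.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in a sequence."""
    return sum(1 for base in sequence.upper() if base in "GC") / len(sequence)


def windowed_gc(sequence, window, step):
    """GC fraction of each window; returns (start, fraction) pairs."""
    return [
        (start, gc_fraction(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]


def windows_below(profile, threshold):
    """Start positions of windows whose value falls below the threshold."""
    return [start for start, value in profile if value < threshold]


profile = windowed_gc("GGCCAATT", window=4, step=2)
low = windows_below(profile, threshold=0.3)
```

The same pattern, with a curvature value in place of the GC fraction, gives a window-averaged curvature profile in which long runs of consecutive low-value windows can be searched for.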

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida¹, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step toward finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than can be handled and analyzed on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data to detect transcripts that have significantly different splicing patterns in two cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis of more than 300,000 transcripts in exon array data, which has the potential to reveal the aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require installation of any software other than an internet browser. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full dataset of exon array data within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.

Publications

1. Ando, T., Konishi, S., Imoto, S. Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference 138 (11), 3616-3633, 2008.

2. Brazma, A., Miyano, S., Akutsu, T. Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008). Imperial College Press, 2008.

3. Do, J.H., Miyano, S. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome. Applied Microbiology and Biotechnology 80 (5), 841-847, 2008.

4. Fujita, A., Gomes, L.R., Sato, J.R., Yamaguchi, R., Thomaz, C.E., Sogayar, M.C., Miyano, S. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Systems Biology 2, 106, 2008.

5. Fujita, A., Sato, J.R., Garay-Malpartida, H.M., Sogayar, M.C., Ferreira, C.E., Miyano, S. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinformatics and Computational Biology 6 (5), 961-979, 2008.

6. Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Fujita, A., Shimamura, T., Tamada, Y., Imoto, S., Kinoshita, K., Nakai, K., Miyano, S. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human. Genome Informatics 20, 212-221, 2008.

7. Hirose, O., Yoshida, R., Imoto, S., Yamaguchi, R., Higuchi, T., Charnock-Jones, D.S., Print, C., Miyano, S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics 24 (7), 932-942, 2008.

8. Hirose, O., Yoshida, R., Yamaguchi, R., Imoto, S., Higuchi, T., Miyano, S. Analyzing time course gene expression data with biological and technical replicates to estimate gene networks by state space models. Proc. 2nd Asia International Conference on Modelling & Simulation, 940-946, 2008. (AMS 2008, Refereed conference)

9. Jeong, E., Nagasaki, M., Miyano, S. Rule-based reasoning for system dynamics in cell systems. Genome Informatics 20, 25-36, 2008.

10. Kitakaze, H., Kanda, M., Nakatsuka, H., Ikeda, N., Matsuno, H., Miyano, S. Prediction of fragile points for robustness checking of cell systems. IEICE Transactions on Information and Systems D, J91-D (9), 2404-2417, 2008.

11. Knapp, E.-W., Benson, G., Holzhutter, H.-G., Kanehisa, M., Miyano, S. (Eds.) Genome Informatics 20, 2008.

12. Kojima, K., Fujita, A., Shimamura, T., Imoto, S., Miyano, S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics 20, 37-51, 2008.

13. Kojima, K., Nagasaki, M., Miyano, S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics 24 (12), 1426-1432, 2008.

14. Mito, N., Ikegami, Y., Matsuno, H., Miyano, S., Inouye, S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics 21, 212-223, 2008.

15. Nagasaki, M., Saito, A., Chen, L., Jeong, E., Miyano, S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology 2, 53, 2008.

16. Niida, A., Smith, A.D., Imoto, S., Tsutsumi, S., Aburatani, H., Zhang, M.Q., Akiyama, T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics 9, 404, 2008.

17. Numata, K., Yoshida, R., Nagasaki, M., Saito, S., Imoto, S., Miyano, S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics 9, 494, 2008.

18. Numata, K., Imoto, S., Miyano, S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering, IEEE Computer Society, 357-360, 2008. (BIBM 2008, Refereed conference)

19. Perrier, E., Imoto, S., Miyano, S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research 9, 2251-2286, 2008.

20. Yamaguchi, R., Imoto, S., Yamauchi, M., Nagasaki, M., Yoshida, R., Shimamura, T., Hatanaka, Y., Ueno, K., Higuchi, T., Gotoh, N., Miyano, S. Predicting differences in gene regulatory systems by state space models. Genome Informatics 21, 101-113, 2008.

21. Yoshida, R., Nagasaki, M., Yamaguchi, R., Imoto, S., Miyano, S., Higuchi, T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics 24 (22), 2592-2601, 2008.


Human Genome Center

Laboratory of Molecular Medicine
Laboratory of Genome Technology

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami, and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P = 0.0053) and poorer prognosis of non-small cell lung cancer patients (P = 0.0045). It was also shown to be an independent prognostic factor (P = 0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data implied that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) by cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with non-small cell lung cancer (NSCLC) (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The cancer-testis antigen ECT2 is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells; nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein became highest at G1/S phases, whereas its phosphorylation level was enhanced during mitotic phase. Treatment of breast cancer cells, T47D and HBC4, with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G2/M cells, resulting in growth inhibition of cancer cells. We further demonstrate the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase, Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in growth of RCC and that inhibition of TMEM22/RAB37 expression or of their interaction should be a novel therapeutic strategy for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)


We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex as well as activation of the Rac1-JNK cascade in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analyses validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been implicated in apoptosis. These findings indicate that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage, some prostate cancers eventually acquire a castration-resistant phenotype in which they progress aggressively and show very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate or in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it should be a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 were reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression significantly attenuated the viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac(α2,6)Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated a higher frequency of serum apoC-III with loss of α2,6-linked Neu5Ac residues in lung cancer patients compared to controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylation and suggest its applicability to studies of other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
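The evaluation criterion described above, the share of patients whose predicted dose falls within 20% of the actual stable dose, can be sketched as follows. This is a minimal illustration only: the function name `fraction_within_tolerance` and the weekly doses are invented for the example and are not taken from the consortium's data or software.

```python
def fraction_within_tolerance(predicted, actual, tolerance=0.20):
    """Fraction of patients whose predicted dose lies within
    `tolerance` (as a proportion) of the actual stable dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)


# Hypothetical weekly doses in mg (illustrative values, not study data).
actual = [20.0, 35.0, 50.0, 28.0]
predicted = [23.0, 36.0, 38.0, 35.0]

share = fraction_within_tolerance(predicted, actual)
print(f"{share:.0%} of predictions are within 20% of the stable dose")
```

Applying the same function to the predictions of two different algorithms on one validation cohort would allow the kind of side-by-side comparison reported in the text.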

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Mitsunori Sasa², Yoshimi Bando³, Ikuko Sumitomo², Naoya Hosono⁴, Michiaki Kubo⁴, Yusuke Nakamura¹,⁵ and Hitoshi Zembutsu⁵: ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Department of Surgery, Tokushima Breast Care Clinic; ³Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgery for breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to be dependent on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Michiaki Kubo², Hitoshi Zembutsu³, Yuichi Sugiyama⁴ and Yusuke Nakamura¹,³: ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ⁴Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% confidence interval [CI], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
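The relation between the classification percentages and the odds ratio reported above can be checked with a short calculation. The sketch below rests on two assumptions that are not stated in the text: the 2x2 counts are reconstructed by rounding the reported percentages against the combined group sizes (28 + 11 cases, 56 + 18 controls), and the confidence interval uses the log-based (Woolf) approximation, which may differ from the method the authors applied. The function name `odds_ratio_with_ci` is hypothetical.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio of a 2x2 table with a log-based (Woolf) confidence interval."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Group sizes from the text: first set plus additional patients.
n_cases = 28 + 11      # grade 3/4 leukopenia/neutropenia
n_controls = 56 + 18   # no toxicity

# Assumption: counts recovered by rounding the reported percentages.
cases_score_1_or_2 = round(0.692 * n_cases)        # correctly classified cases
controls_score_0 = round(0.757 * n_controls)       # correctly classified controls
cases_score_0 = n_cases - cases_score_1_or_2
controls_score_1_or_2 = n_controls - controls_score_0

odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(
    cases_score_1_or_2, cases_score_0, controls_score_1_or_2, controls_score_0
)
print(f"OR = {odds_ratio:.2f}, 95% CI = {ci_lower:.2f}-{ci_upper:.2f}")
```

Under these assumptions the result can be compared directly with the reported odds ratio of 7.00 and its 95% confidence interval of 2.95-16.59.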

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu¹,², Taisei Mushiroda¹, Surakameth Mahasirimongkol⁵, Sasisopin Kiertiburanakul³, Somnuek Sungkanuparph³, Weerawat Manosuthi⁶, Woraphot Tantisiriwat⁷, Angkana Charoenyingwattana⁴, Thanyachai Sura³, Wasun Chantratita² and Yusuke Nakamura¹: ¹Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of ²Pathology and ³Medicine, Faculty of Medicine, and ⁴Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; ⁵Center for International Cooperation, Department of Medical Sciences, and ⁶Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; ⁷Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1 and HLA-DPB1 by a sequence-based HLA typing method in the first set of samples, which consisted of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele revealed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10^-6] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10^-5). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
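The specificity quoted above follows directly from the carrier frequency among NVP-tolerant patients. A minimal sketch of that arithmetic is given below; the function name `marker_performance` is hypothetical, and the inputs are the rounded percentages from the text rather than raw genotype counts, so the odds ratio it yields is only an approximation of the reported value.

```python
def marker_performance(carrier_freq_cases, carrier_freq_controls):
    """Sensitivity, specificity and odds ratio of a binary genetic marker,
    computed from carrier frequencies in cases and in tolerant controls."""
    sensitivity = carrier_freq_cases
    specificity = 1.0 - carrier_freq_controls
    odds_cases = carrier_freq_cases / (1.0 - carrier_freq_cases)
    odds_controls = carrier_freq_controls / (1.0 - carrier_freq_controls)
    return sensitivity, specificity, odds_cases / odds_controls


# Rounded carrier frequencies of the allele as given in the text:
# 17.5% of patients with rash, 1.1% of NVP-tolerant patients.
sensitivity, specificity, odds_ratio = marker_performance(0.175, 0.011)
print(f"sensitivity = {sensitivity:.1%}")
print(f"specificity = {specificity:.1%}")
print(f"approximate odds ratio = {odds_ratio:.1f}")
```

With these inputs the specificity comes out at 98.9%, as stated in the text. The odds ratio obtained from rounded percentages is close to, but not identical with, the reported 18.96, which was presumably computed from the underlying counts. The low sensitivity illustrates why the marker can identify only a subset of rash cases.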

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani¹,², Sukanya Wattanapokayakit³, Hidenori Ochi⁴,⁵, Takahisa Kawaguchi⁴, Atsushi Takahashi⁴, Naoya Hosono⁴, Michiaki Kubo⁴, Tatsuhiko Tsunoda⁴, Naoyuki Kamatani⁴, Hiromitsu Kumada⁶, Aekkachai Puseenam⁷, Thanyachai Sura⁷, Yataro Daigo², Kazuaki Chayama⁴,⁵, Wasun Chantratita⁸, Yusuke Nakamura¹,⁴ and Koichi Matsuda¹: ¹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ²Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; ³Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; ⁴Center for Genomic Medicine, RIKEN; ⁵Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; ⁶Department of Hepatology, Toranomon Hospital; ⁷Department of Medicine, Faculty of Medicine, and ⁸Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda¹, Sukanya Wattanapokayakit², Atsushi Takahashi³, Toshihiro Nukiwa⁴, Shoji Kudoh⁵, Takashi Ogura⁶, Hiroyuki Taniguchi⁷, Michiaki Kubo⁸, Naoyuki Kamatani³, Yusuke Nakamura¹,⁹ and the Pirfenidone Clinical Study Group⁴: ¹Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); ⁴Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; ⁵Fourth Department of Internal Medicine, Nippon Medical School; ⁶Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; ⁷Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; ⁸Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); ⁹Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a P value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva¹, Taisei Mushiroda², Atsushi Takahashi³, Michiaki Kubo⁴, Sena K. Karachanak⁵, Irina T. Zaharieva⁶, Radoslava V. Vazharova⁵, Ivanka I. Dimova⁵, Vihra K. Milanova⁶, Todor Tolev⁷, George Kirov⁸, Michael J. Owen⁸, Michael C. O'Donovan⁸, Naoyuki Kamatani³, Yusuke Nakamura⁹ and Draga I. Toncheva⁵: ¹Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; ⁶Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; ⁷Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; ⁸Department of Psychological Medicine, Cardiff University, School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; ⁹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many of these studies have yielded controversial and inconclusive results. However, it has been determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that were previously implicated in disease susceptibility, and we genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010; odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono, N., Kubo, M., Tsuchiya, Y., Sato, H., Kitamoto, T., Saito, S., Ohnishi, Y. and Nakamura, Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum. Mutation, 29: 182-189, 2008.

2. Onouchi, Y., Gunji, T., Burns, J.C., Shimizu, C., Newburger, J.W., Yashiro, M., Nakamura, Yo., Yanagawa, H., Wakui, K., Fukushima, Y., Kishi, F., Hamamoto, K., Terai, M., Sato, Y., Ouchi, K., Saji, T., Nariai, A., Kaburagi, Y., Yoshikawa, T., Suzuki, K., Tanaka, T., Nagai, T., Cho, H., Fujino, A., Sekine, A., Nakamichi, R., Tsunoda, T., Kawasaki, T., Nakamura, Yu. and Hata, A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat. Genet., 40: 35-42, 2008.

3. Silva, F.P., Hamamoto, R., Kunizaki, M., Tsuge, M., Nakamura, Y. and Furukawa, Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene, 27: 2686-2692, 2008.

4 Obama K Satoh S Hamamoto R Sakai

Y Nakamura Y and Furukawa Y En-hanced expression of RAD51AP1 is involvedin the growth of intrahepatic cholangiocarci-noma cells Clin Cancer Res 14 1333-13392008

5 M Kato F Miya Y Kanemura T TanakaY Nakamura and T Tsunoda Recombina-tion rates of genes expressed in human tis-sues Hum Mol Genet 17 577-586 2008

6 Leung AAC Wong VCL Yang LCChan PL Daigo Y Nakamura Y Qi RZ Miller L Liu E T-K Wang LD J-LS Law Tsao W and Lung ML Frequentdecreased expression of candidate tumorsuppressor gene DEC1 and its anchorage-independent growth properties and impacton global gene expression in esophageal car-cinoma Int J Cancer 122 587-594 2008

7 Shimo A Tanikawa C Nishidate T Mat-suda K Lin M-L Park J-H Ohta THirata K Fukuda M Nakamura Y andKatagiri T Involvement of KIF2CMCAKoverexpression in mammary carcinogenesisCancer Sci 99 62-70 2008

8 Uemura M Tamura K Chung S HonmaS Okuyama A Nakamura Y and Naka-gawa HA novel 5-steroid reductase (SRD5A3 type-3) is overexpressed in hormone-


refractory prostate cancer Cancer Sci 99 81-86 2008

9 Kamatani Y Matsuda K Ohishi T Oht-subo S Yamazaki K Iida A Hosono NKubo M Yumura W Nitta K KatagiriT Kawaguchi Y Kamatani N and Naka-mura Y Identification of a significant asso-ciation of an SNP in TNXB with SLE inJapanese population J Hum Genet 53 64-73 2008

10 Fukukawa C Hanaoka H Nagayama STsunoda T Toguchida J Endo K Naka-mura Y and Katagiri T Radioimmunother-apy of human synovial sarcoma using amonoclonal antibody against FZD10 CancerSci 99 432-440 2008

11 Brunet J Pfaff AW Abidi A Unoki MNakamura Y Guinard M Klein J-PCandolfi E and Mousli M Toxoplasmagondii exploits UHRF1 and induces host cellcycle arrest at G2 to enable its proliferationCell Microbiol 10 908-920 2008

12 Kato N Miyata T Tabara Y Katsuya TYanai K Hanada H Kamide K NakuraJ Kohara K Takeuchi F Mano H Yasu-nami M Kimura A Kita Y Ueshima HNakayama T Soma M Hata A FujiokaA Kawano Y Nakao K Sekine AYoshida T Nakamura Y Saruta T Ogi-hara T Sugano S Miki T and TomoikeH High-Density Association Study andNomination of Susceptibility Genes for Hy-pertension in the Japanese National ProjectHum Mol Genet 17 617-627 2008

13 Oishi T Iida A Otsubo S Kamatani YUsami M Takei T Uchida K TsuchiyaK Saito S Ohnishi Y Tokunaga KNitta K Kawaguchi Y Kamatani N Ko-chi Y Shimane K Yamamoto K Naka-mura Y Yumura W and Matsuda KAfunctional SNP in the NKX25-binding siteof ITPR3 promoter is associated with sus-ceptibility to Systemic Lupus Erythematosusin Japanese population J Hum Genet 53151-162 2008

14 Daigo Y and Nakamura Y From cancergenomics to thoracic oncology discovery ofnew biomarkers and therapeutic targets forlung and esophageal carcinoma (ReviewArticle) General Thoracic and Cardiovascu-lar Surgery 56 43-53 2008

15 Kiyotani K Mushiroda T Kubo M Zem-butsu H Sugiyama Y and Nakamura YAssociation of genetic polymorphisms inSLCO1B3 and ABCC2 with docetaxel-induced leukopenia Cancer Sci 99 967-9722008

16 Kiyotani K Mushiroda T Sasa M BandoY Sumitomo I Hosono N Kubo M

Nakamura Y and Zembutsu H Impact ofCYP2D610 on recurrence-free survival inbreast cancer patients receiving adjuvant ta-moxifen therapy Cancer Sci 99 995-9992008

17 Kato T Sato N Takano A MiyamotoM Nishimura H Tsuchiya E Kondo SNakamura Y and Daigo Y Activation ofPlacenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcomein Primary Lung Cancer Patients Clin Can-cer Res 14 2363-2370 2008

18 Tenesa A Farrington SM Prendergast JG Porteous ME Walker M Haq N Bar-netson RA Theodoratou E CetnarskyjR Cartwright N Semple C Clark AJReid FJ Smith LA Kavoussanakis KKoessler T Pharoah PD Buch S Schaf-mayer C Tepel J Schreiber S Voumllzke HSchmidt CO Hampe J Chang-Claude JHoffmeister M Brenner H Wilkening SCanzian F Capella G Moreno V DearyIJ Starr JM Tomlinson IP Kemp ZHowarth K Carvajal-Carmona L WebbE Broderick P Vijayakrishnan J Houl-ston RS Rennert G Ballinger D RozekL Gruber SB Matsuda K Kidokoro TNakamura Y Zanke BW Greenwood CM Rangrej J Kustra R Montpetit AHudson TJ Gallinger S Campbell H andDunlop MG Genome-wide association scanidentifies a colorectal cancer susceptibilitylocus on 11q23 and replicates risk loci at 8q24 and 18q21 Nat Genet 40 631-637 2008

19 Mototani H Iida A Nakajima M Fu-ruichi T Miyamoto Y Tsunoda T SudoA Kotani A Uchida K Ozaki KTanaka Y Nakamura Y Tanaka T No-toya K and Ikegawa SA functional SNP inEDG2 increases susceptibility to knee os-teoarthritis in Japanese Hum Mol Genet17 1790-1797 2008

20 Mizukami Y Kono K Daigo Y TakanoA Tsunoda T Kawaguchi Y NakamuraY and Fujii H Detection of novel Cancer-Testis antigen-specific T-cell responses inTIL regional lymph nodes and PBL in pa-tients with esophageal squamous cell carci-noma Cancer Sci 99 1448-1454 2008

21 Mushiroda T Wattanapokayakit S Taka-hashi A Nukiwa T Kudoh S Ogura TTaniguchi H Pirfenidone Clinical StudyGroup Kubo M Kamatani N and Naka-mura YA genome-wide association studyidentifies an association of a common vari-ant in TERT with susceptibility to idiopathicpulmonary fibrosis J Med Genet 45 654-656 2008

22 Hosokawa M Kashiwaya K Furihara M


Eguchi H Ohigashi H Ishikawa O Shi-nomura Y Imai K Nakamura Y andNakagawa H Overexpression of cysteineproteinase inhibitor cystatin 6 promotes pan-creatic cancer growth Cancer Sci 99 1626-1632 2008

23 Study Group of Millennium Genome Projectfor Cancer Sakamoto H Yoshimura KSaeki N Katai H Shimoda T MatsunoY Saito D Sugimura H Tanioka FKato S Matsukura N Matsuda N Naka-mura T Hyodo I Nishina T Yasui WHirose H Hayashi M Toshiro EOhnami S Sekine A Sato Y Totsuka HAndo M Takemura R Takahashi Y Oh-daira M Aoki K Honmyo I Chiku SAoyagi K Sasaki H Ohnami S Yanagi-hara K Yoon KA Kook MC Lee YSPark SR Kim CG Choi IJ Yoshida TNakamura Y and Hirohashi S Geneticvariation in PSCA is associated with suscep-tibility to diffuse-type gastric cancer NatGenet 40 730-740 2008

24 Ueki T Nishidate T Park JH Lin MLShimo A Hirata K Nakamura Y andKatagiri T Involvement of elevated expres-sion of multiple cell-cycle regulator DTLRAMP (denticlelessRA-regulated nuclearmatrix associated protein) in the growth ofbreast cancer cells Oncogene 27 5672-56832008

25 Miyamoto Y Shi D Nakajima M OzakiK Sudo A Kotani A Uchida A TanakaT Fukui N Tsunoda T Takahashi ANakamura Y Jiang Q and Ikegawa SCommon variants in DVWA on chromo-some 3p243 are associated with susceptibil-ity to knee osteoarthritis Nat Genet 40 994-998 2008

26 Unoki H Takahashi A Kawaguchi THara K Horikoshi M Andersen G NgDP Holmkvist J Borch-Johnsen KJorgensen T Sandbaek A Lauritzen THansen T Nurbaya S Tsunoda T KuboM Babazono T Hirose H Hayashi MIwamoto Y Kashiwagi A Kaku KKawamori R Tai ES Pedersen O Ka-matani N Kadowaki T Kikkawa RNakamura Y and Maeda S SNPs inKCNQ1 are associated with susceptibility totype 2 diabetes in East Asian and Europeanpopulations Nat Genet 40 1098-1102 2008

27 Harao M Hirata S Irie A Senju SNakatsura T Komori H Ikuta Y Yok-omine K Imai K Inoue M Harada KMori T Tsunoda T Nakatsuru S DaigoY Nomori H Nakamura Y Baba H andNishimura Y HLA-A2-restricted CTL epi-topes of a novel lung cancer-associated can-

cer testis antigen cell division cycle associ-ated 1 can induce tumor-reactive CTL IntJ Cancer 123 2616-2625 2008

28 Imai K Hirata S Irie A Senju S IkutaY Yokomine K Harao M Inoue MTsunoda T Nakatsuru S Nakagawa HNakamura Y Baba H and Nishimura YIdentification of a novel tumor-associatedantigen cadherin 3P-cadherin as a possibletarget for immunotherapy of pancreatic gas-tric and colorectal cancers Clin Cancer Res14 6487-6495 2008

29 Nikolova DN Zembutsu H Sechanov TVidinov K Kee LS Ivanova R BechevaE Kocova M Toncheva D and Naka-mura Y Identification of molecular targetsfor treatment of thyroid carcinoma OncolRep 20 105-121 2008

30 Nakamura Y Pharmacogenomics and drugtoxicity (Editorial) New Eng J Med 359856-858 2008

31 Arita K Ariyoshi M Tochio H Naka-mura Y and Shirakawa M Hemi-methylated DNA recognition by the SRAprotein Np95 via a base flipping mecha-nism Nature 455 818-821 2008

32 Inoue H Iga M Nabeta H Yokoo TSuehiro Y Okano S Inoue M Kinoh HKatagiri T Takayama K Yonemitsu YHasegawa M Nakamura Y Nakanishi Yand Tani K Non-transmissible SeV encod-ing GM-CSF is a novel and potent vectorsystem to produce autologous tumor vac-cines Cancer Sci 99 2315-2326 2008

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34 Hotta K Nakata Y Matsuo T KamoharaS Kotani K Komatsu R Itoh N MineoI Wada J Masuzaki H Yoneda MNakajima A Miyazaki S Tokunaga KKawamoto M Funahashi T HamaguchiK Yamada K Hanafusa T Oikawa SYoshimatsu H Nakao K Sakata T Mat-suzawa Y Tanaka K Kamatani N andNakamura Y Variations in the FTO gene areassociated with severe obesity in the Japa-nese J Hum Genet 53 546-553 2008

35 Kato M Nakamura Y and Tsunoda T Analgorithm for inferring complex haplotypesin a region of copy-number variation Am JHum Genet 83 157-169 2008

36 Kato M Nakamura Y and Tsunoda TMOCSphaser a haplotype inference tool


from a mixture of copy number variationand single nucleotide polymorphism dataBioinformatics 24 1645-1646 2008

37 Yasuda K Miyake K Horikawa Y HaraK Osawa H Furuta H Hirota Y MoriH Jonsson A Sato Y Yamagata K Hi-nokio Y Wang HY Tanahashi T Naka-mura N Oka Y Iwasaki N Iwamoto YYamada Y Seino Y Maegawa H Kashi-wagi A Takeda J Maeda E Shin HDCho YM Park KS Lee HK Ng MCMa RC So WY Chan JC Lyssenko VTuomi T Nilsson P Groop L KamataniN Sekine A Nakamura Y Yamamoto KYoshida T Tokunaga K Itakura M Mak-ino H Nanjo K Kadowaki T and KasugaM Variants in KCNQ1 are associated withsusceptibility to type 2 diabetes mellitusNat Genet 40 1092-1097 2008

38 Yamaguchi-Kabata Y Nakazono K Taka-hashi A Saito S Hosono N Kubo MNakamura Y and Kamatani N Japanesepopulation structure based on SNP geno-types from 7003 individuals compared toother ethnic groups Effects on population-based association studies Am J HumGenet 83 445-456 2008

39 Okada Y Mori M Yamada R Suzuki AKobayashi K Kubo M Nakamura Y andYamamoto K SLC22A4 polymorphism andrheumatoid arthritis susceptibility A replica-tion study in a Japanese population and ametaanalysis J Rheumatol 35 1723-17282008

40 Omori S Tanaka Y Takahashi A HiroseH Kashiwagi A Kaku K Kawamori RNakamura Y and Maeda S Association ofCDKAL1 IGF2BP2 CDKN2AB HHEXSLC30A8 and KCNJ11 with susceptibility oftype 2 diabetes in a Japanese populationDiabetes 57 791-795 2008

41 Misawa K Fujii S Yamazaki T Taka-hashi A Takasaki J Yanagisawa M Oh-nishi Y Nakamura Y and Kamatani NNew correction algorithms for multiple com-parisons in case-control multilocus associa-tion studies based on haplotypes and diplo-type configurations J Hum Genet 53 789-801 2008

42 Chantarangsu S Mushiroda T Mahasiri-mongkol S Kiertiburanakul S Sungkanu-parph S Manosuthi W Tantisiriwat WCharoenyingwattana A Sura T Chan-tratita W and Nakamura Y HLA-B 3505allele is a strong predictor for nevirapine-induced skin adverse drug reactions in ThaiHIV-infected patients Pharmacogenet Genomics 19 139-146 2009

43 Suzuki A Yamada R Kochi Y Sawada

T Okada Y Matsuda K Kamatani YMori M Shimane K Hirabayashi YTakahashi A Tsunoda T Miyatake AKubo M Kamatani N Nakamura Y andYamamoto K Functional SNPs in CD244 in-crease the risk of rheumatoid arthritis in aJapanese population Nat Genet 40 1224-1229 2008

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45 Nikolova DN Doganov N Dimitrov RAngelov K Kee LS Dimova I TonchevaD Nakamura Y and Zembutsu HGenome-wide gene expression profiles ofovarian carcinoma identification of molecu-lar targets for treatment of ovarian carci-noma Mol Med Rep in press 2008

46 Hotta K Nakamura M Nakata Y Mat-suo T Kamohara S Kotani K KomatsuR Itoh N Mineo I Wada J MasuzakiH Yoneda M Nakajima A Miyazaki STokunaga K Kawamoto M Funahashi THamaguchi K Yamada K Hanafusa TOikawa S Yoshimatsu H Nakao KSakata T Matsuzawa Y Tanaka K Ka-matani N and Nakamura Y INSIG2 geners7566605 polymorphism is associated withsevere obesity in Japanese J Hum Genet53 857-862 2008

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48 Hirota T Harada M Sakashita M DoiS Miyatake A Fujita K Enomoto TEbisawa M Yoshihara S Noguchi ESaito H Nakamura Y and Tamari M Ge-netic polymorphism regulating ORM1-like 3(Saccharomyces cerevisiae) expression is as-sociated with childhood atopic asthma in aJapanese population J Allergy Clin Immu-nol 121 769-770 2008

49 Harada M Hirota T Jodo AI Doi SKameda M Fujita K Miyatake A Eno-moto T Noguchi E Yoshihara SEbisawa M Saito H Matsumoto KNakamura Y Ziegler SF and Tamari MFunctional analysis of the Thymic StromalLymphopoietin Variants in Human Bron-chial Epithelial Cells Am J Respir Cell


Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51 Hirata D Yamabuki T Miki D Ito TTsuchiya E Fujita M Hosokawa MChayama K Nakamura Y and Daigo YInvolvement of epithelial cell transformingsequence-2 oncoantigen in lung and esopha-geal cancer progression Clin Cancer Res15 256-266 2009

52 Dobashi S Katagiri T Hirota E AshidaS Daigo Y Shuin T Fujioka T Miki Tand Nakamura Y Involvement of TMEM22overexpression in the growth of renal cellcarcinoma cells Oncol Rep 21 305-3122009

53 Zembutsu H Suzuki Y Sasaki ATsunoda T Okazaki M Yoshimoto MHasegawa T Hirata K and Nakamura YPredicting response to Docetaxel neoadju-vant chemotherapy for advanced breast can-cers through genome-wide gene expressionprofiling Int J Oncol 34 361-370 2009

54 Nakamura Y DNA variations in humanand medical genetics 25 years of my experi-ence (review) J Hum Genet 54 1-8 2009

55 Ozaki K Sato H Inoue K Tsunoda TSakata Y Mizuno H Lin T-H Mi-yamoto Y Aoki A Onouchi Y Sheu S-H Ikegawa S Odashiro K NobuyoshiM Juo S-H H Hori M Nakamura Yand Tanaka TA functional variation inBRAP confers risk of myocardial infarctionin Asian populations Nat Genet in press2009

56 Kashiwaya K Hosokawa M Eguchi HOhigashi H Ishikawa O Shinomura YNakamura Y and Nakagawa H Identifica-tion of C2orf18 Termed ANT2BP (ANT2-binding protein) as one of key molecules in-volved in pancreatic carcinogenesis CancerSci 100 457-464 2009

57 Nagayama S Yamada E Kohno YAoyama T Fukukawa C Kubo HWatanabe G Katagiri T Nakamura YSakai Y and Toguchida J Inverse correla-tion of the upregulation of FZD10 expres-sion and the activation of β-catenin in syn-chronous colorectal tumors Cancer Sci inpress 2009

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59 The International Warfarin Pharmacogenet-ics Consortium Improved warfarin dosingwith a global pharmacogenetic algorithm NEngl J Med 360 753-764 2009

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61 Fukukawa C Nagayama S Tsunoda TToguchida J Nakamura Y and Katagiri TActivation of non-canonical Dvl-Rac1-JNKpathway by Frizzled-homologue 10 (FZD10)in human synovial sarcoma Oncogene inpress 2009

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63 Kamatani Y Wattanapokayakit S OchiH Kawaguchi T Takahashi A HosonoN Kubo M Tsunoda T Kamatani NKumada H Puseenam A Sura T DaigoY Chayama K Chantratita W Naka-mura Y and Matsuda K Identification ofassociation of genetic variations in HLA-DPlocus with chronic hepatitis B in Asianpopulation through genome-wide associa-tion study Nat Genet in press 2009

64 Tamura K Furihata M Chung S Ue-mura M Yoshioka H Iiyama T AshidaS Nasu Y Fujioka T Shuin T Naka-mura Y and Nakagawa H Stanniocalcin 2( STC 2 ) over-expression in castration-resistant prostate cancer and aggressiveprostate cancer Cancer Sci in press 2009

65 Tsukada H Ochi H Maekawa T AbeH Fujimoto Y Tsuge M Takahashi HKumada H Kamatani N Nakamura Yand Chayama K Hiroshima Liver StudyGroup Toranomon Hospital A Polymor-phism in MAPKAPK3 affects response to in-


terferon therapy for chronic hepatitis C Gas-troenterology in press 2009

66 Dunleavy EM Roche D Tagami H La-coste N Ray-Gallet D Nakamura YDaigo Y Nakatani Y and Almouzni-

Pettinotti G HJURP a key CENP-A-partnerfor maintenance and deposition of CENP-Aat centromeres at late telophaseG1 Cell inpress 2009


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution for it individually, and we also attempt to improve the interpretation of GWA study data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, so their sample sizes must be in the thousands or more, which in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One method to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
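As a point of reference, the standard genomic control correction for chi-square statistics can be sketched as follows. This is a minimal illustration of the conventional method only, not the modification for Fisher's exact test described above; the function name and the example statistics are hypothetical.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for 1-df chi-square values.

    lambda is the ratio of the observed median statistic to the median
    expected under the null hypothesis; each statistic is divided by it.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # by convention, statistics are never inflated
    return lam, [x / lam for x in chi2_stats]

# Hypothetical per-SNP statistics from a structured sample.
stats = [0.2, 0.5, 0.9, 1.4, 12.0]
lam, corrected = genomic_control(stats)
print(f"inflation factor: {lam:.3f}")
```

Dividing by the inflation factor shrinks every statistic by the same proportion, which is why the method assumes that population structure inflates all markers roughly uniformly.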

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリーマークラスロップ
准教授 医学博士 山 田 亮

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests fit for biological models by interpreting the tables geometrically.
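For readers unfamiliar with the 2×3 genotype table, the asymptotic (Pearson chi-square) version of the test can be sketched as below. This is only the textbook test with 2 degrees of freedom, not the exact test or the geometric formulation studied here; the function name and counts are hypothetical, and every genotype column is assumed to be non-empty.

```python
import math

def chi2_2x3(table):
    """Pearson chi-square test for a 2x3 table [[cases], [controls]].

    Columns are the three genotype counts (e.g. AA, Aa, aa).
    Returns (statistic, asymptotic p-value with 2 degrees of freedom).
    """
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_tot[i] * col_tot[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Hypothetical genotype counts: cases in the first row, controls in the second.
stat, p = chi2_2x3([[30, 50, 20], [20, 50, 30]])
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

The asymptotic p-value relies on large expected counts; with sparse cells it can diverge from the exact test, which is the discrepancy examined in item b.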

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which exploits the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is used to quantify genetic heterogeneity and in genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, for use in various studies based on genetic heterogeneity.
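The conventional pairwise measures mentioned above can be sketched as follows. The code computes the standard D, |D'| and r² for two diallelic markers; it is not the multi-marker method under development, the function name is hypothetical, and both markers are assumed to be polymorphic.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Standard pairwise LD measures for two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    Returns (D, |D'|, r^2).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Hypothetical example: alleles A and B always occur on the same haplotype.
d, d_prime, r2 = pairwise_ld(p_ab=0.5, p_a=0.5, p_b=0.5)
print(d, d_prime, r2)
```

Both measures summarize only two markers at a time, which is the limitation that motivates a quantity defined over a whole set of markers.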

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.


Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita, and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita, and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing all whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices as well as regulatory

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教 授 理学博士 中 井 謙 太准教授 理学博士 木 下 賢 吾


elements and binding sites reported in the literature are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and positioning with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2, and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
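G+C- and CpG-richness are conventionally quantified by the G+C fraction and the observed/expected CpG ratio of a sequence window. The sketch below uses those conventional definitions as an assumption; the report does not state which measures or thresholds were used, and the function name and example sequence are hypothetical.

```python
def gc_and_cpg(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA sequence.

    The expected number of CpG dinucleotides is (#C * #G) / length,
    so the ratio is (#CpG * length) / (#C * #G).
    Assumes a non-empty sequence of A, C, G and T.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    cpg_ratio = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, cpg_ratio

# Hypothetical window around a TSS.
gc_fraction, cpg_ratio = gc_and_cpg("ACGCGTTACGCGGATCCG")
print(f"G+C = {gc_fraction:.2f}, CpG o/e = {cpg_ratio:.2f}")
```

Computing both values in sliding windows along a promoter shows how far from the TSS the enrichment extends.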

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3, and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from telling us whether regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we use position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that lays out the body plan but also the peripheral components that actually make the building blocks of the body.
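The scanning step described above can be illustrated with a minimal sketch. It scores every window of a promoter against a weight matrix and reports those at or above a threshold. The threshold is passed in as a plain parameter (the ORI-based optimization is not reproduced), only the forward strand is scanned, and the matrix, sequence, and function name are hypothetical.

```python
def scan_promoter(seq, pwm, threshold):
    """Return (position, score) for each window scoring >= threshold.

    pwm: list with one dict per motif position, mapping base -> weight.
    Assumes seq contains only the bases present in the matrix.
    """
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Hypothetical two-column matrix that favours the dinucleotide "AC".
toy_pwm = [
    {"A": 1, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 1, "G": -1, "T": -1},
]
print(scan_promoter("GACTAC", toy_pwm, threshold=2))
```

Lowering the threshold increases the number of predicted sites in both target and background promoters, which is why the choice of threshold matters for target prediction.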

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated for each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
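The way a pseudocount enters a PWM, as described above, can be illustrated with a short sketch. It assumes the common formulation in which the pseudocount is split among the four bases in proportion to the background composition; the function name `log_odds_pwm` and the example sites are hypothetical and are not taken from the study.

```python
import math

BASES = "ACGT"


def log_odds_pwm(sites, background, pseudocount=0.8):
    """Build a log-likelihood-ratio matrix from aligned binding sites.

    At each position the pseudocount is divided among the bases according to
    the background composition and added to the observed counts.
    """
    n_sites = len(sites)
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        column = {}
        for base in BASES:
            count = sum(1 for site in sites if site[pos] == base)
            freq = (count + pseudocount * background[base]) / (n_sites + pseudocount)
            column[base] = math.log2(freq / background[base])
        pwm.append(column)
    return pwm


# Four made-up binding sites and a uniform background.
example_sites = ["ACG", "ACG", "ACT", "ACG"]
uniform = {base: 0.25 for base in BASES}
example_pwm = log_odds_pwm(example_sites, uniform)
```

Without a pseudocount, a base never observed at a position would receive a score of minus infinity; with it, every element stays finite even for small samples.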

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a large number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second-largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell type under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
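The ranking of promoters by tag count can be sketched as below. The data layout, the function names and, in particular, the way a "discrepancy" between two cell types is counted are assumptions made for illustration; the text does not specify how the percentage was computed.

```python
def first_and_second_promoters(tag_counts):
    """Return the IDs of the promoters with the largest and second-largest tag counts."""
    ranked = sorted(tag_counts, key=lambda pid: (-tag_counts[pid], pid))
    first = ranked[0] if ranked else None
    second = ranked[1] if len(ranked) > 1 else None
    return first, second


def discrepancy_rate(cell_a, cell_b):
    """Fraction of shared genes whose (1st, 2nd) promoter pair differs (assumed definition)."""
    shared = [gene for gene in cell_a if gene in cell_b]
    if not shared:
        return 0.0
    differing = sum(
        1
        for gene in shared
        if first_and_second_promoters(cell_a[gene])
        != first_and_second_promoters(cell_b[gene])
    )
    return differing / len(shared)


# Hypothetical tag counts: gene -> {promoter ID: number of 5'-end tags}.
cell_a = {"geneX": {"p1": 120, "p2": 30}, "geneY": {"p1": 8, "p2": 15}}
cell_b = {"geneX": {"p1": 90, "p2": 40}, "geneY": {"p1": 20, "p2": 5}}
rate = discrepancy_rate(cell_a, cell_b)
```

Here geneX keeps the same promoter order in both cell types while geneY swaps its 1st and 2nd promoters, so half of the shared genes are counted as discrepant.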

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
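A Fourier analysis of dinucleotide periodicity, of the kind mentioned above, can be sketched with a single-frequency discrete Fourier transform. The function name, the choice of the "AA" dinucleotide and the synthetic fragment are illustrative assumptions; the actual analysis pipeline is not described in the text.

```python
import cmath
import math


def periodicity_power(sequence, dinucleotide, period):
    """Power of the dinucleotide occurrence signal at a given period (in bp)."""
    # 1 where the dinucleotide starts, 0 elsewhere.
    signal = [
        1.0 if sequence[i:i + 2] == dinucleotide else 0.0
        for i in range(len(sequence) - 1)
    ]
    if not signal:
        return 0.0
    mean = sum(signal) / len(signal)
    # Single-frequency discrete Fourier transform of the mean-centred signal.
    total = sum(
        (value - mean) * cmath.exp(-2j * math.pi * n / period)
        for n, value in enumerate(signal)
    )
    return abs(total) ** 2 / len(signal)


# Synthetic fragment with "AA" every 10 bp (illustrative only).
fragment = "AACGCGCGCG" * 20
strong = periodicity_power(fragment, "AA", 10)
weak = periodicity_power(fragment, "AA", 7)
```

On this synthetic fragment the power at a period of 10 bp is far larger than at an unrelated period such as 7 bp; scanning a range of periods on real genome fragments would reveal peaks such as those reported at 84 bp and 167 bp.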

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has mostly been estimated by reducing the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells are. The organism-specific pathway maps are then detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method to discriminate between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately, when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that has ribonuclease activity but no PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19 777 and 21 036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or in tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. On the other hand, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K. and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R. and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K. and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T. and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K. and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y. and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T. and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M. and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H. and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H. and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M. and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R. and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T. and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H. and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H. and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T. and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, Edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S. and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H. and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉,木下賢吾,伊藤暢聡.ペプチジルプロリルイソメラーゼの構造機能相関.蛋白質核酸酵素 54: 167―172, 2009.

木下賢吾.立体構造からのタンパク質機能予測:現状と展望.遺伝子医学MOOK 14号,in press.

中井謙太,ポール・ホートン.第3章 3.アミノ酸配列に基づくタンパク質の細胞内局在予測.実験医学増刊 vol. 26: 1106―1112, 2008.

中井謙太.タンパク質のシステム生物学.猪飼・伏見・卜部・上野川・中村・浜窪編.タンパク質の事典.朝倉書店,575―578, 2008.


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After the revelation of Hwang Woo-suk's fabricated stem cell cloning research and unethical human egg collection, the bioethics law was revised, and the government seeks stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing in terms of regulations in East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we examined the current problems that tend to stall a smooth research review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy 公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project, led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, although MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain the project to, encourage and thank participants each time, and participants recall their will to contribute.

To express our appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to sighted people with good eyesight, we are making efforts toward personalized information security to accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim at the sharing of public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


known as medically important because they either transmit various infectious diseases, including malaria and Japanese river fever, or cause allergies such as asthma and dermatitis. Because of the serious medical problems they cause, their genomes have recently been extensively analyzed. We have produced libraries of the four organisms and are constructing their databases for functional genome analysis. Full-Arthropods is available at http://fullarth.hgc.jp/.

7. Full-Entamoeba: a database for the full-length cDNA library of Entamoeba

Toshiaki Katayama, Kazushi Hiranuka, Masahiro Kumagai, Yutaka Suzuki, Sumio Sugano, Atsushi Toyoda, Asao Makioka, Junichi Watanabe

Entamoeba histolytica is a protozoan parasite that predominantly infects humans and other primates and causes amebiasis. E. histolytica is estimated to infect about 50 million people worldwide, and amebiasis is estimated to cause 70,000 deaths per year. Full-Entamoeba, a database of full-length cDNAs from a human parasite, E. histolytica, and a reptilian parasite, E. invadens, has been produced. The full-length cDNA libraries were produced using the oligo-capping method from trophozoites of each species cultivated axenically. A total of 5,000 5'-end one-pass sequences of cDNAs from the two species were compared with the non-redundant database of DDBJ/GenBank/EMBL using the BLAST and TBLASTX programs. These clones are available for further analysis and experiments. The Full-Entamoeba database is available at http://fullent.hgc.jp/.

8. Analysis of sequence catalogs of the house dust mite, Dermatophagoides farinae

Shuichi Kawashima, Atsushi Toyoda, Junichi Watanabe, Sadao Nogami and Minoru Kanehisa

The house dust mite is a cosmopolitan guest in human habitation and one of the multicellular organisms most closely associated with our life. It is now well established that dust mites are major allergens causing bronchial asthma, allergic rhinitis and atopic dermatitis. Dermatophagoides farinae (American house dust mite) and D. pteronyssinus (European house dust mite) are the two most common species in the temperate zone. We produced cDNA libraries from our D. farinae sample, containing young nymphs and adults, using the vector trapper method, and sequenced both ends of 11,520 cDNA clones. Cleaning, clustering and assembling of the raw sequences produced 3,031 contigs and 4,281 singletons. Of the total of 7,312 unique sequences, 1,797 were assigned KEGG Orthology by the KAAS system. More than 30% of the sequences showed significant matches to the KEGG GENES database, which includes the well-characterized Der f group 1 allergens. We predicted 1,109 peptides longer than 20 amino acids from the 3,031 contigs. Some of the peptides are predicted by NetMHC 3.0 to contain 9-mer peptides with strong affinities to MHC class II. We expect that these in silico analyses will pave the way toward the prediction of allergens from D. farinae.

9. HiGet and SSS: Search engines for large-scale biological databases

Toshiaki Katayama, Shuichi Kawashima, Kazuhiro Ohi, Kenta Nakai and Minoru Kanehisa

Recently, the number of entries in biological databases has been increasing exponentially year by year. For example, there were 10,106,023 entries in the GenBank database in the year 2000, which has now grown to 98,868,465 (Release 169 + daily updates). In order for such a vast amount of data to be searched at high speed, we have developed a high-performance database entry retrieval system named HiGet. For this purpose, the system is built on HiRDB, a commercial ORDBMS (Object-oriented Relational Database Management System) developed by Hitachi, Ltd. HiGet can perform full-text search on various biological databases, including GenBank, RefSeq, UniProt, Prosite, OMIM, and PDB. An additional advantage of the HiGet system is its field-specific search capability, which enables users to narrow down the number of results and is especially useful for collecting sequences that meet their specific needs. We have also developed a sequence similarity search (SSS) service to find homologous sequences with various algorithms, including BLAST, FASTA, SSEARCH, TRANS, and EXONERATE. This variety of options is unique among public services, and users can select an appropriate method to search for similar sequences according to their query. Because algorithms such as TRANS and EXONERATE are highly time consuming, the SSS service is backed by a distributed computing environment using the Sun Grid Engine on our supercomputer system. The HiGet and SSS services are available at http://higet.hgc.jp/ and http://sss.hgc.jp/, respectively.


10. Linear-time protein 3-D structure searching algorithm

Tetsuo Shibuya

Finding similar structures in 3-D structure databases of proteins (or other molecules) is becoming one of the most important issues in post-genomic molecular biology. To compare the 3-D structures of two molecules, biologists mostly use the RMSD (root mean square deviation) as the similarity measure. The RMSD is one of the most fundamental similarity measures, used in various fields such as computer vision and robotics for comparing two sets of coordinates. In this research, we propose new theoretically and practically fast algorithms for the basic problem of finding all the substructures of structures in a structure database of chain molecules (such as proteins) whose RMSDs to the query are within a given constant threshold. The best-known worst-case time complexity for the problem is O(N log m), where N is the database size and m is the query size. The previous best-known expected time complexity for the problem is also O(N log m). In this research, we propose a new linear-expected-time algorithm. It is not only a theoretically significant improvement over previous algorithms, but also a practically faster algorithm according to computational experiments. We also propose a series of preprocessing algorithms that enable faster queries, although no indexing algorithm has been known whose query time complexity is better than the above O(N log m) bound. One is an O(N log^2 N)-time and O(N log N)-space preprocessing algorithm with an expected query time complexity of O(m + N/m^0.5). Another is an O(N log N)-time and O(N)-space preprocessing algorithm with an expected query time complexity of O(N/m^0.5 + m log(N/m)).

We also extend the above linear-time algorithm into an algorithm with an expected query time complexity of O(m + N/m^(1-ε)), where ε is an arbitrarily small constant such that 0 < ε < 1. We furthermore extend the above linear-time algorithm so that it can deal with insertions and deletions.

We checked the performance of our linear-expected-time algorithm through computational experiments over the whole PDB database. The experiments show that our algorithm is much faster than the previous algorithms. For example, our algorithm is 3.6 to 28 times faster than previously known algorithms when searching for similar substructures whose RMSDs are within 1 Å of queries of ordinary lengths. The experiments also show that the above theoretical results are consistent with the experimental results. In other words, the actual computation time of our linear-expected-time algorithm is not influenced by differences in query length, in contrast to previous algorithms.
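To make the search problem in this section concrete, the following minimal Python sketch implements the naive O(Nm) baseline: it slides the query along a chain and computes the superposition-minimized RMSD of every window. It is not the linear-expected-time algorithm described above. The function names (`rmsd`, `search`), the toy coordinates, and the use of Horn's quaternion formulation with a shifted power iteration (which approximates the largest eigenvalue) are assumptions made for illustration only.

```python
import math

def _centered(points):
    """Translate a list of 3-D points so that their centroid is the origin."""
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    return [(p[0] - c[0], p[1] - c[1], p[2] - c[2]) for p in points]

def rmsd(a, b, iters=500):
    """Approximate minimum RMSD of two equal-length point lists over rigid motions."""
    p, q = _centered(a), _centered(b)
    # 3x3 cross-covariance: s[i][j] = sum over atoms of p_i * q_j.
    s = [[sum(u[i] * v[j] for u, v in zip(p, q)) for j in range(3)] for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric 4x4 matrix; its largest eigenvalue is the best achievable overlap.
    n4 = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shifting by the Frobenius norm makes every eigenvalue non-negative,
    # so plain power iteration converges towards the largest one.
    shift = math.sqrt(sum(x * x for row in n4 for x in row))
    v = [1.0, 0.5, 0.25, 0.125]
    lam = 0.0
    for _ in range(iters):
        w = [sum(n4[i][j] * v[j] for j in range(4)) + shift * v[i] for i in range(4)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        # Rayleigh quotient of the unshifted matrix.
        lam = sum(w[i] * v[i] for i in range(4)) / sum(x * x for x in v) - shift
        v = [x / norm for x in w]
    energy = sum(x * x for pt in p for x in pt) + sum(x * x for pt in q for x in pt)
    return math.sqrt(max(energy - 2.0 * lam, 0.0) / len(a))

def search(chain, query, threshold):
    """Return (start, rmsd) for every window of the chain within the threshold."""
    m = len(query)
    hits = []
    for i in range(len(chain) - m + 1):
        d = rmsd(query, chain[i:i + m])
        if d <= threshold:
            hits.append((i, d))
    return hits

# Toy chain (made-up coordinates) and a query that is a rotated, shifted copy of chain[2:6].
chain = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0), (2.1, 1.6, 1.5),
         (3.5, 2.0, 1.7), (3.6, 3.4, 2.5), (5.0, 3.5, 2.0), (5.2, 4.9, 2.6)]
query = [(-y + 10.0, x - 3.0, z + 2.0) for (x, y, z) in chain[2:6]]
hits = search(chain, query, 0.01)
```

Each window costs O(m) here, so a database of total size N costs O(Nm); the algorithms in this section improve on that baseline.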

11. Fast hinge detection algorithm in protein structures

Tetsuo Shibuya

Analysis of conformational changes is one of the keys to understanding protein functions and interactions. For the analysis, we often compare two protein structures, taking flexible regions such as hinge domains into consideration. The RMSD (Root Mean Square Deviation) is the most popular measure for comparing two protein structures, but it is only suitable for rigid structures without hinge domains. In this research, we propose a new measure called RMSDh (Root Mean Square Deviation considering hinges) and its variant RMSDh(k) for comparing two flexible proteins with hinge domains. We also propose novel efficient algorithms for computing them, which can detect the hinge positions at the same time. The RMSDh is suitable for cases where there is one small hinge domain in each of the two target structures. The new algorithm for computing the RMSDh runs in linear time, which is the same as the time complexity for computing the RMSD and is faster than any previous algorithm for hinge detection. The RMSDh(k) is designed for comparing structures with more than one hinge domain. The RMSDh(k) measure considers at most k small hinge domains, i.e., the RMSDh(k) value should be small if the two structures are similar except for at most k hinge domains. To compute the value, we propose an O(kn^2)-time and O(n)-space algorithm based on a new dynamic programming technique. We also test our measures against both flexible and non-flexible protein structures, and show that the hinge positions can be correctly detected by our algorithms.

12. Fast flexible protein structure alignment

Kohichi Suematsu and Tetsuo Shibuya

The hinge detection algorithm described in section 11 considers only rigid hinge points, but hinges sometimes bend slightly themselves, which sometimes leads to inaccurate prediction of hinge positions. Thus we incorporated the notion of a 'bending hinge' to detect such hinge positions. We developed a very efficient heuristic algorithm for finding such bending hinges, as the exact algorithm for this problem requires exponential time. For the algorithm, we developed a detailed score matrix for comparing local structures, based on naïve Bayes learning.

13. Protein function prediction based on 3-D structure motifs

Chia-Han Chu, Hiroki Sakai and Tetsuo Shibuya

Protein functions are said to be determined by their 3-D structures, but not all functions are known to be related to particular 3-D structure motifs. The geometric suffix tree, a data structure for indexing 3-D protein structures that we also developed, enables comprehensive enumeration of all possible structural motifs among a given set of proteins. We are developing a new algorithm, based on the support vector machine, that predicts a protein's function from its 3-D structure. This algorithm utilizes all the possible 3-D motifs found by using the geometric suffix tree.

14. Suffix array construction with a lazy scheme

Ben Hachimori and Tetsuo Shibuya

The suffix array is one of the most important indexing data structures for alphabet strings, including DNA sequences, RNA sequences, protein sequences, web pages, the Medline database, and so on. But even the most sophisticated algorithm for constructing the suffix array requires a lot of time. We developed a new efficient lazy algorithm that computes the suffix array only after we receive the query. By doing so, we have to compute only the necessary part of the suffix array. Our lazy algorithm is based on the Schürmann-Stoye algorithm, and it is more efficient than both the Boyer-Moore algorithm and other suffix tree-based algorithms when the number of queries is limited.
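The lazy idea can be illustrated with a deliberately simple Python sketch: suffix positions are bucketed by first character up front, and a bucket is sorted only when a query actually needs it. This is a toy illustration of laziness, not the Schürmann-Stoye-based algorithm mentioned above; the class name `LazySuffixIndex` and its methods are hypothetical.

```python
class LazySuffixIndex:
    """Toy lazy suffix index: sort only the parts of the suffix array a query touches."""

    def __init__(self, text):
        self.text = text
        self.buckets = {}           # first character -> list of suffix start positions
        self.sorted_chars = set()   # characters whose bucket has already been sorted
        for i, ch in enumerate(text):
            self.buckets.setdefault(ch, []).append(i)

    def _bucket(self, ch):
        """Return the bucket for ch, sorting it on first use."""
        bucket = self.buckets.get(ch)
        if bucket is None:
            return []
        if ch not in self.sorted_chars:
            bucket.sort(key=lambda i: self.text[i:])  # naive comparison-based sort
            self.sorted_chars.add(ch)
        return bucket

    def find(self, pattern):
        """Return the sorted start positions of all occurrences of a non-empty pattern."""
        if not pattern:
            raise ValueError("pattern must be non-empty")
        bucket = self._bucket(pattern[0])
        m = len(pattern)
        prefix = lambda k: self.text[bucket[k]:bucket[k] + m]
        # Lower bound: first suffix whose m-character prefix is >= pattern.
        lo, hi = 0, len(bucket)
        while lo < hi:
            mid = (lo + hi) // 2
            if prefix(mid) < pattern:
                lo = mid + 1
            else:
                hi = mid
        start = lo
        # Upper bound: first suffix whose m-character prefix is > pattern.
        hi = len(bucket)
        while lo < hi:
            mid = (lo + hi) // 2
            if prefix(mid) <= pattern:
                lo = mid + 1
            else:
                hi = mid
        return sorted(bucket[start:lo])

index = LazySuffixIndex("GATTACAGATTACA")
tta = index.find("TTA")
```

After the single query above, only the bucket for `T` has been sorted; the buckets for `A`, `C`, and `G` remain untouched, which is the saving a lazy scheme aims for when the number of queries is small.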

15. Color space-DNA sequence mapping/alignment algorithm

Ben Hachimori and Tetsuo Shibuya

Applied Biosystems's SOLiD system encodes the DNA sequence into a sequence of a data type called the color space, where one of 4 fluorescent colors is assigned to each of the 16 possible ordered pairs of adjacent bases. However, no algorithm has been known that aligns/maps a color-space sequence to a DNA sequence while taking into account the difference between experimental errors and actual mutations. We developed an alignment algorithm that distinguishes experimental errors from actual DNA mutations when aligning color-space data against ordinary DNA sequences. Moreover, we computed the optimal score table for the alignment based on actual E. coli data.
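The following short Python sketch shows why color space helps separate measurement errors from mutations. It uses the standard two-base encoding, in which the color of a dinucleotide is the XOR of 2-bit base codes (A=0, C=1, G=2, T=3). The classification rule and the function names are a simplified illustration assumed here, not the alignment algorithm or score table described above.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def to_color_space(dna):
    """Encode a DNA string as colors 0-3, one per pair of adjacent bases."""
    return [CODE[a] ^ CODE[b] for a, b in zip(dna, dna[1:])]

def classify_mismatches(ref, read_colors):
    """Label color mismatches against a reference of the same length.

    A single substituted base changes two adjacent colors while preserving
    their XOR; an isolated color mismatch is more likely a measurement error.
    Returns ("snp", base_index) or ("error", color_index) tuples.
    """
    ref_colors = to_color_space(ref)
    diffs = [i for i, (r, c) in enumerate(zip(ref_colors, read_colors)) if r != c]
    labels = []
    k = 0
    while k < len(diffs):
        i = diffs[k]
        adjacent = k + 1 < len(diffs) and diffs[k + 1] == i + 1
        if adjacent and (ref_colors[i] ^ ref_colors[i + 1]) == (read_colors[i] ^ read_colors[i + 1]):
            labels.append(("snp", i + 1))   # the base shared by colors i and i+1
            k += 2
        else:
            labels.append(("error", i))
            k += 1
    return labels

reference = "ACGGTCA"
snp_read = to_color_space("ACGATCA")     # base 3 changed from G to A
noisy_read = [1, 3, 0, 2, 2, 1]          # one color misread at index 3
```

Here `classify_mismatches(reference, snp_read)` reports a candidate SNP, while `classify_mismatches(reference, noisy_read)` reports an isolated color error.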

16. Genotype clustering based on hidden Markov models

Ritsuko Onuki, Tetsuo Shibuya and Minoru Kanehisa

Haplotype clustering is important for gene mapping of human disease. Despite its importance for the analysis, it is difficult to obtain haplotype data from present experiments because of their cost and error rate. Genotypes are much easier to obtain than haplotypes. In this work, we propose a new method for clustering genotypes. In the algorithm, we first infer multiple haplotype candidates from each genotype, and next we calculate the distances between the genotypes based on the results of the haplotype inference. Then we perform genotype clustering based on the distances. We evaluated our algorithm by applying it to several actual genotype data sets.
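As a rough illustration of the first two steps, the Python sketch below enumerates the haplotype pairs compatible with a genotype and derives a simple distance between two genotypes from them. The encoding (0 and 2 for the two homozygous states, 1 for heterozygous), the exhaustive enumeration, and the minimum-Hamming distance are all assumptions for illustration; the method above infers haplotype candidates with hidden Markov models, and its distance is not specified here.

```python
from itertools import product

def haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs compatible with a genotype.

    Each site is 0 (homozygous reference), 2 (homozygous alternative),
    or 1 (heterozygous, phase unknown).
    """
    het = [i for i, g in enumerate(genotype) if g == 1]
    base = [g // 2 for g in genotype]   # 0 -> allele 0, 2 -> allele 1, het sites filled below
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = list(base), list(base)
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)

def hamming(a, b):
    """Number of sites at which two haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))

def genotype_distance(g1, g2):
    """Smallest total Hamming distance over compatible haplotype pairings."""
    best = None
    for a1, a2 in haplotype_pairs(g1):
        for b1, b2 in haplotype_pairs(g2):
            d = min(hamming(a1, b1) + hamming(a2, b2),
                    hamming(a1, b2) + hamming(a2, b1))
            best = d if best is None else min(best, d)
    return best

candidates = haplotype_pairs((1, 1))
```

The number of candidate pairs doubles with every additional heterozygous site, which is one reason a probabilistic inference step is preferable to exhaustive enumeration on real data. A distance matrix built this way could then be passed to any standard clustering procedure.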

Publications

Kanehisa, M., Araki, M., Goto, S., Hattori, M., Hirakawa, M., Itoh, M., Katayama, T., Kawashima, S., Okuda, S., Tokimatsu, T., and Yamanishi, Y. KEGG for linking genomes to life and the environment. Nucleic Acids Research 36: D480-D484, 2008.

Kawashima, S., Pokarowski, P., Pokarowska, M., Kolinski, A., Katayama, T., Kanehisa, M. AAindex: amino acid index database, progress report 2008. Nucleic Acids Research 36: D202-D205, 2008.

Okuda, S., Yamada, T., Hamajima, M., Itoh, M., Katayama, T., Bork, P., Goto, S., and Kanehisa, M. KEGG Atlas mapping for global analysis of metabolic pathways. Nucleic Acids Research 36: W423-426, 2008.

Wakaguri, H., Suzuki, Y., Katayama, T., Kawashima, S., Kibukawa, E., Hiranuka, K., Sasaki, M., Sugano, S., and Watanabe, J. Full-Malaria/Parasites and Full-Arthropods: database of full-length cDNAs of parasites and arthropods, update 2009. Nucleic Acids Research 37: D520-D525, 2008.

Yamanishi, Y., Araki, M., Gutteridge, A., Honda, W., and Kanehisa, M. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24: i232-i240, 2008.

Takarabe, M., Okuda, S., Itoh, M., Tokimatsu, T., Goto, S., and Kanehisa, M. Network analysis of adverse drug interactions. Genome Informatics 20: 252-259, 2008.

Hashimoto, K., Yoshizawa, A.C., Okuda, S., Kuma, K., Goto, S., and Kanehisa, M. The repertoire of desaturases and elongases reveals fatty acid variations in 56 eukaryotic genomes. J. Lipid Res. 49: 183-191, 2008.

Shibuya, T. Fast Hinge Detection Algorithms for Flexible Protein Structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics, to appear.

Shibuya, T. Searching Protein 3-D Structures in Linear Time. Proc. 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2009), 2009, to appear.

Shibuya, T. Linear-Time Algorithm for Searching Protein 3-D Structures. IPSJ SIG Notes, SIGAL 123-4, 2009, to appear.

Suematsu, K., Shibuya, T. Flexible Protein Alignment of 3D-Structures Allowing Dynamic Transformation. IPSJ SIG Notes, SIG-BIO 12-12, 2008, pp. 87-94.

Honda, W., Tanabe, M., Yano, A., Kanehisa, M. Bioinformatics: systems biology and KEGG (in Japanese). Seikagaku 80: 1094-1111, 2008.


Recent advances in biomedical research have been producing large-scale, ultra-high dimensional, ultra-heterogeneous data. Given this progress in post-genomic research, our current mission is to create computational strategies for systems biology and medicine, towards translational bioinformatics. With this mission, we have been developing computational methods for understanding life as a system and applying them to practical issues in medicine and biology.

1. Computational Systems Biology

a. Systematic reconstruction of TRANSPATH data into Cell System Markup Language

Masao Nagasaki, Ayumu Saito, Chen Li, Euna Jeong, Satoru Miyano

Many biological repositories store information based on experimental study of the biological processes within a cell, such as protein-protein interactions, metabolic pathways, signal transduction pathways, or regulation by transcription factors and miRNA. Unfortunately, it is difficult to use such information directly when generating simulation-based models. Thus, modeling rules for encoding biological knowledge into system-dynamics-oriented standardized formats would be very useful for a full understanding of cellular dynamics at the system level. We selected the TRANSPATH database, a manually curated high-quality pathway database, which provides a rich source of cellular events in humans, mice, and rats, curated from over 31,500 papers. In this work, we defined 16 modeling rules based on the hybrid functional Petri net with extension (HFPNe), which is suitable for graphical representation and simulation of biological processes. In these modeling rules, each Petri net element is annotated with the Cell System Ontology (CSO) to enable semantic interoperability of models. As a formal ontology for biological pathway modeling with dynamics, CSO also defines biological terminology and corresponding icons. By combining HFPNe with the CSO features, we devised a method for transforming TRANSPATH data into simulation-based, semantically valid models. The results are encoded in a biological pathway format, Cell System Markup Language (CSML), which eases the exchange and integration of biological data and models. By using the 16 modeling rules, 97% of the reactions in TRANSPATH were converted into simulation-based models represented in CSML. This reconstruction demonstrated that it is possible to use our rules to generate quantitative models from static pathway descriptions.

b. Finding optimal Bayesian network given a super-structure

Human Genome Center

Laboratory of DNA Information Analysis (DNA情報解析分野)

Professor Satoru Miyano, Ph.D.
Associate Professor Seiya Imoto, Ph.D.
Assistant Professor Masao Nagasaki, Ph.D.
Project Lecturer Rui Yamaguchi, Ph.D.
Project Assistant Professor Yoshinori Tamada, Ph.D.


Eric Perrier, Seiya Imoto, Satoru Miyano

Conventional approaches for learning Bayesian network structure from data have disadvantages in terms of complexity and lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm markedly improves accuracy and speed: it learns a skeleton with an independency test (IT) approach and uses it to constrain the directed acyclic graphs considered during the search-and-score phase. Subsequently, we defined the structural constraint by introducing the concept of a super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We developed a super-structure constrained optimal search (COS): its time complexity is upper bounded by O(γ_m^n), where γ_m < 2 depends on the maximal degree m of S. Empirically, complexity depends on the average degree m', and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. In practice, S can be approximated by IT approaches: the significance level of the tests controls its sparseness, which makes it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches.
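The effect of the super-structure constraint on the search space can be sketched in a few lines of Python. The snippet only shows how S limits the candidate parent sets of a node; it is not the COS algorithm itself, and the graph, node names, and function name are made up for illustration.

```python
from itertools import combinations

def candidate_parent_sets(super_structure, node):
    """List every parent set of a node allowed by the super-structure.

    super_structure maps each node to the set of its neighbours in the
    undirected graph S; parents may only be drawn from those neighbours.
    """
    neighbours = sorted(super_structure[node])
    return [frozenset(c)
            for k in range(len(neighbours) + 1)
            for c in combinations(neighbours, k)]

# Hypothetical super-structure over four variables: a path A - B - C plus an isolated D.
S = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}

constrained = {v: len(candidate_parent_sets(S, v)) for v in S}
unconstrained = 2 ** (len(S) - 1)   # any subset of the other nodes could be a parent set
```

In this toy graph, node B keeps 4 candidate parent sets and node D only the empty set, compared with 8 per node without the constraint; the gap grows exponentially with the number of variables when S is sparse.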

c. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models

Osamu Hirose, Ryo Yoshida^1, Seiya Imoto, Rui Yamaguchi, Tomoyuki Higuchi^1, D. Stephen Charnock-Jones^2, Cristin Print^3, Satoru Miyano: ^1 Institute of Statistical Mathematics, ^2 Cambridge University, ^3 University of Auckland

We developed a novel method based on the state space model to identify transcriptional modules and module-based gene networks simultaneously. The state space model has the potential to infer large-scale gene networks, e.g., of order 10^3 genes, from time-course gene expression profiles. In particular, we succeeded in identifying a cell cycle system by using the gene expression profiles of Saccharomyces cerevisiae, in which the length of the time course and the number of genes were 24 and 4,382, respectively. However, when analyzing shorter time-course data, e.g., of length 10 or less, the parameter estimation of the state space model often fails due to overfitting. To extend the applicability of the state space model, we provided an approach that uses the technical replicates of gene expression profiles, which are often measured in duplicate or triplicate. The use of technical replicates is important for achieving highly efficient inference of gene networks with short time-course data. The potential of the proposed method was demonstrated through the time-course analysis of the gene expression profiles of human umbilical vein endothelial cells undergoing growth factor deprivation-induced apoptosis.
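For orientation, a linear Gaussian state space model of the kind commonly used for this purpose can be written as below. The notation is an assumption for illustration, not taken from the report: $y_n$ is the vector of expression values of the $p$ genes at time point $n$, $x_n$ is a $k$-dimensional hidden state (one component per module, with $k \ll p$), and $r$ indexes technical replicates.

```latex
\begin{align}
  x_n       &= F\,x_{n-1} + v_n,      & v_n       &\sim \mathcal{N}(0, Q), \\
  y_n^{(r)} &= H\,x_n + w_n^{(r)},    & w_n^{(r)} &\sim \mathcal{N}(0, R), \qquad r = 1, \dots, R_n .
\end{align}
```

Under this reading, the rows of $H$ relate genes to modules and $F$ describes how modules influence one another over time. Each replicate $y_n^{(r)}$ is an additional noisy observation of the same hidden state $x_n$, which is why duplicates or triplicates can help constrain the parameters when the time course is short.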

d. Predicting differences in gene regulatory systems by state space models

Rui Yamaguchi, Seiya Imoto, Mai Yamauchi, Masao Nagasaki, Ryo Yoshida^1, Teppei Shimamura, Yosuke Hatanaka, Kazuko Ueno, Tomoyuki Higuchi^1, Noriko Gotoh, Satoru Miyano

We developed a statistical method to predict differentially regulated genes between case and control samples from time-course gene expression data, by leveraging the unpredictability of expression patterns under the regulatory system inferred by a state space model. The proposed method can screen out genes that show different patterns but are generated by the same regulation in both samples, since these patterns can be predicted by the same model. Our strategy consists of three steps. First, a gene regulatory system is inferred from the control data by a state space model. Then, the obtained model of the underlying regulatory system of the control sample is used to predict the case data. Finally, by assessing the significance of the difference between the observed and predicted case time-course data for each gene, we are able to detect the unpredictable genes, which are candidates for the key differences between the regulatory systems of case and control cells. We illustrate the whole process of the strategy with an actual example, in which gene regulatory systems of human small airway epithelial cells were inferred from novel time courses of gene expression following treatment with (case) or without (control) the drug gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase. In the gefitinib response data, we succeeded in finding unpredictable genes that are candidates for specific targets of gefitinib. We also discussed differences in regulatory systems for the unpredictable genes. The proposed method would be a promising tool for identifying biomarkers and drug target genes.

e. Bayesian learning of biological pathways on genomic data assimilation


Ryo Yoshida^1, Masao Nagasaki, Rui Yamaguchi, Seiya Imoto, Satoru Miyano, Tomoyuki Higuchi^1

Mathematical modeling and simulation based on biochemical rate equations provide us with a rigorous tool for unraveling complex mechanisms of biological pathways. To proceed to simulation experiments, an essential first step is to find effective values of model parameters, which are difficult to measure in in vivo and in vitro experiments. Furthermore, once a set of hypothetical models has been created, a statistical criterion is needed to test the ability of the constructed models and to proceed to model revision. We developed a new statistical technology for data-driven construction of in silico biological pathways. The method starts with knowledge-based modeling with the hybrid functional Petri net. It then proceeds to Bayesian learning of the model parameters for which experimental data are available. This process exploits quantitative measurements of evolving biochemical reactions, e.g., gene expression data. Another important issue that we consider is the statistical evaluation and comparison of the constructed hypothetical pathways. For this purpose, we have developed a new Bayesian information-theoretic measure that assesses the predictability and the biological robustness of in silico pathways.

f. Modeling nonlinear gene regulatory networks from time series gene expression data

André Fujita, João Ricardo Sato^5, Humberto Miguel Garay-Malpartida^5, Mari Cleide Sogayar^5, Carlos Eduardo Ferreira^5, Satoru Miyano: ^5 University of São Paulo

In cells, molecular networks such as gene regulatory networks are the basis of biological complexity. Therefore, gene regulatory networks have become the core of research in systems biology. Understanding the processes underlying the various extracellular regulators, signal transduction, protein-protein interactions, and differential gene expression requires a detailed molecular description of the protein and gene networks involved. To better understand these complex molecular networks and to infer new regulatory associations, we developed a statistical method based on vector autoregressive models and Granger causality to estimate nonlinear gene regulatory networks from time series microarray data. Most of the models available in the literature assume linearity in the inference of gene connections; moreover, these models do not infer directionality in these connections. Thus, a priori biological knowledge is required. However, in pathological cases, no a priori biological information is available. To overcome these problems, we present the nonlinear vector autoregressive (NVAR) model. We have applied the NVAR model to estimate nonlinear gene regulatory networks based entirely on gene expression profiles obtained from DNA microarray experiments. We showed the results obtained by NVAR through several simulations and by the construction of three actual gene regulatory networks (p53, NF-κB, and c-Myc) for HeLa cells.

g. Fast grid layout algorithm for biological networks with sweep calculation

Kaname Kojima, Masao Nagasaki, Satoru Miyano

Properly drawn biological networks are of great help in the comprehension of their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties, such as aligning vertices having the same attributes and complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for applications to pathway databases, due to their high computational cost. In addition, they do not consider edge directions, and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features in biological networks. We devised a new calculation method, termed sweep calculation, and reduced the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conducted practical experiments using 95 pathway models of various sizes from TRANSPATH and showed that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduced a new component that penalizes undesirable edge directions, to avoid the lack of traceability in pathways caused by differences in direction between the in-edges and out-edges of each vertex.


h. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data

Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

Recently, the nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters due to the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation, and we need to limit the size of the gene set strongly. To address this limitation, we employed the L1 regularization technique to estimate NVAR. Under L1 regularization, the direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than with existing methods. Through a simulation study, we verified the effectiveness of the proposed method by comparing its limit on the number of genes to that of the existing NVAR. The proposed method was also applied to time series microarray data of the human HeLa cell cycle.
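How an L1 penalty selects the parents of a gene can be sketched with a small coordinate-descent lasso in plain Python. This is a simplified, linear, lag-one stand-in assumed for illustration: the NVAR model above would replace each raw expression value with its basis-expanded features, and the data, penalty value, and function names here are made up.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso(X, y, lam, sweeps=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0    # correlation of feature j with the partial residual
            norm = 0.0   # squared norm of feature j
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return beta

def lagged_design(series, target):
    """Rows: expression of all genes at time t-1; response: target gene at time t."""
    X = [list(row) for row in series[:-1]]
    y = [row[target] for row in series[1:]]
    return X, y

# Made-up example: the response depends strongly on feature 0 and only weakly on feature 1.
X = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
y = [2.1, -1.9, 1.9, -2.1]
beta = lasso(X, y, lam=0.5)
parents = [j for j, b in enumerate(beta) if b != 0.0]
```

The penalty sets the weak and the irrelevant coefficients exactly to zero, so only feature 0 is retained as a parent. This exact sparsity is what allows parent selection to remain well defined even when there are more parameters than time points.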

i. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates

André Fujita, Luciana Rodrigues Gomes^5, João Ricardo Sato^6, Rui Yamaguchi, Carlos Eduardo Thomaz^7, Mari Cleide Sogayar^5, Satoru Miyano: ^6 Universidade Federal do ABC, ^7 Centro Universitário da FEI

Principal Component Analysis (PCA) combined with Maximum-entropy Linear Discriminant Analysis (MLDA) was applied in order to identify genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely (i) differences in gene expression levels between normal and tumoral conditions from a univariate point of view, (ii) in a multivariate fashion using MLDA, and (iii) with a dependence network approach. Our results show that malignant transformation in the prostatic tissue is more closely related to functional connectivity changes in the dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1, and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions, and were also classified by our discriminant analysis as the top seven most informative genes for the prostate cancer genesis process. Moreover, among the identified genes we found classically known biomarkers and genes closely related to tumoral prostate, such as KLK3 and KLK2, as well as several other potential ones. We have demonstrated that changes in functional connectivity may be implicit in the biological process that renders some genes more informative for discriminating between normal and tumoral conditions. Using the proposed method, namely MLDA, to analyze the multivariate characteristics of genes, it was possible to capture the changes in dependence networks that are related to cell transformation.

j. Rule-based reasoning for system dynamics in cell systems

Euna Jeong, Masao Nagasaki, Satoru Miyano

A system-dynamics-centered ontology, called the Cell System Ontology (CSO), has been developed for the representation of diverse biological pathways. Many of the pathway data based on the ontology have been created from databases via data conversion or curated by expert biologists. It is essential to validate the pathway data, which may otherwise cause unexpected issues such as semantic inconsistency and incompleteness. This paper discusses three criteria for validating pathway data based on CSO: (1) structurally correct models in terms of Petri nets, (2) biologically correct models that capture biological meaning, and (3) systematically correct models that reflect biological behaviors. At the same time, we have investigated how logic-based rules can be used with the ontology to extend its expressiveness and to complement the ontology by reasoning, which aims at qualifying pathway knowledge. Finally, we show how the proposed approach helps in exploring dynamic modeling and simulation tasks without prior knowledge.

k. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human

Yosuke Hatanaka, Masao Nagasaki, Rui Yamaguchi, Takeshi Obayashi, Kazuyuki Numata, André Fujita, Teppei Shimamura, Yoshinori Tamada, Seiya Imoto, Kengo Kinoshita, Kenta Nakai, Satoru Miyano

We reported various transcription factor binding sites (TFBSs) conserved among co-expressed genes in human promoter regions, using expression and genomic data. Assuming that similar promoter structure induces similar transcriptional regulation, and hence similar expression profiles, we compared the promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start sites (TSS) given by DBTSS, and promoter similarity searches were conducted among the coexpressed gene data provided by the newly developed COXPRESdb. Using a combination of Position Weight Matrix (PWM) motif prediction and a bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket analysis to search for combinatorial activities of those conserved TFBSs.
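The PWM prediction step can be illustrated with a minimal Python sketch that converts base counts into a log-odds matrix and scans a promoter sequence with it. The count matrix, pseudocount, uniform background, and threshold are made-up values for illustration, and the bootstrap significance test and conservation analysis described above are not shown.

```python
import math

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds scores against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in "ACGT") + 4 * pseudocount
        pwm.append({b: math.log2(((column[b] + pseudocount) / total) / background)
                    for b in "ACGT"})
    return pwm

def scan(pwm, sequence, threshold):
    """Return (position, score) for every window scoring at least the threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        score = sum(pwm[k][sequence[i + k]] for k in range(width))
        if score >= threshold:
            hits.append((i, score))
    return hits

# Hypothetical count matrix for a 4-bp motif with consensus TATA (8 aligned sites).
counts = [
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 8, "C": 0, "G": 0, "T": 0},
]
pwm = log_odds_pwm(counts)
hits = scan(pwm, "GGTATAGG", threshold=6.0)
```

Only the window starting at position 2 reaches the threshold in this example. In a genome-wide setting, the threshold would normally be calibrated per matrix, for instance against shuffled or background sequences.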

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou^8, Yuto Ikegami^8, Hiroshi Matsuno^8, Satoru Miyano, Shin-ichi T. Inouye^8: ^8 Yamaguchi University

Circadian rhythms of living organisms are 24 hr oscillations found in behavior, biochemistry, and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observation that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando^9, Sadanori Konishi^10, Seiya Imoto: ^9 Keio University, ^10 Kyushu University

The problem of constructing nonlinear regression models is investigated in order to analyze data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlap between basis functions and incorporates the information of the input and response variables. By using the radial basis functions, we constructed nonlinear regression models with the help of the regularization technique. Crucial issues in the model building process are the choices of the hyperparameter, the number of basis functions, and the smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification, both for distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained from the use of the hyperparameter in the basis functions.

b. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome

Jin Hwan Do, Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for the characterization of fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most of the prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, the homology-based approach has an intrinsic limitation for unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profiles of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile; that is, the DNA regions including secondary metabolic signature genes such as polyketide synthase, nonribosomal peptide synthase, and/or dimethylallyl tryptophan synthase consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by a transcription factor, LaeA. Our result could be used for identification of other fungal secondary metabolite gene clusters, especially for secondary metabolite gene clusters that are severely regulated by LaeA or other proteins with similar function to LaeA.
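The profile scan described above can be sketched as a sliding-window average followed by a search for long runs below a cutoff. This is only an illustration: the function names are hypothetical, the per-base curvature values are assumed to come from some external curvature model that is not implemented here, and the window size is left as a parameter because the text does not state it. The cutoff (0.18) and minimum length (20 kb) would be passed as `threshold` and `min_len`.

```python
def window_average(values, window):
    """Average of each consecutive window of `window` values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    out = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        out.append(total / window)
    return out

def gc_profile(sequence, window):
    """Window-averaged GC fraction of a DNA sequence."""
    flags = [1 if base in "GCgc" else 0 for base in sequence]
    return window_average(flags, window)

def low_regions(profile, threshold, min_len):
    """Half-open (start, end) index ranges where the profile stays
    below `threshold` for at least `min_len` consecutive positions."""
    regions, start = [], None
    for i, value in enumerate(profile):
        if value < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                regions.append((start, i))
            start = None
    if start is not None and len(profile) - start >= min_len:
        regions.append((start, len(profile)))
    return regions
```

For a real genome one would call, for example, `low_regions(window_average(curvature, w), 0.18, 20000)`, where `curvature` and `w` are placeholders for the per-base curvature values and the chosen window size.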

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida1, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step toward finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than can be handled and analyzed on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data to detect transcripts that have significantly different splicing patterns in two cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis for more than 300,000 transcripts in exon array data, which has the potential to reveal the aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require installation of any software other than an internet browser. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full dataset of exon array data within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.

Publications

1. Ando, T., Konishi, S., Imoto, S. Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference, 138 (11): 3616-3633, 2008.

2. Brazma, A., Miyano, S., Akutsu, T. Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008). Imperial College Press, 2008.

3. Do, J.H., Miyano, S. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome. Applied Microbiology and Biotechnology, 80 (5): 841-847, 2008.

4. Fujita, A., Gomes, L.R., Sato, J.R., Yamaguchi, R., Thomaz, C.E., Sogayar, M.C., Miyano, S. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Systems Biology, 2: 106, 2008.

5. Fujita, A., Sato, J.R., Garay-Malpartida, H.M., Sogayar, M.C., Ferreira, C.E., Miyano, S. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinformatics and Computational Biology, 6 (5): 961-979, 2008.

6. Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Fujita, A., Shimamura, T., Tamada, Y., Imoto, S., Kinoshita, K., Nakai, K., Miyano, S. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human. Genome Informatics, 20: 212-221, 2008.

7. Hirose, O., Yoshida, R., Imoto, S., Yamaguchi, R., Higuchi, T., Charnock-Jones, D.S., Print, C., Miyano, S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics, 24 (7): 932-942, 2008.

8. Hirose, O., Yoshida, R., Yamaguchi, R., Imoto, S., Higuchi, T., Miyano, S. Analyzing time course gene expression data with biological and technical replicates to estimate gene networks by state space models. Proc. 2nd Asia International Conference on Modelling & Simulation, 940-946, 2008 (AMS 2008, Refereed conference).

9. Jeong, E., Nagasaki, M., Miyano, S. Rule-based reasoning for system dynamics in cell systems. Genome Informatics, 20: 25-36, 2008.

10. Kitakaze, H., Kanda, M., Nakatsuka, H., Ikeda, N., Matsuno, H., Miyano, S. Prediction of fragile points for robustness checking of cell systems. IEICE TRANSACTIONS on Information and Systems D, J91-D (9): 2404-2417, 2008.

11. Knapp, E.-W., Benson, G., Holzhutter, H.-G., Kanehisa, M., Miyano, S. (Eds.) Genome Informatics 20, 2008.

12. Kojima, K., Fujita, A., Shimamura, T., Imoto, S., Miyano, S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics, 20: 37-51, 2008.

13. Kojima, K., Nagasaki, M., Miyano, S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics, 24 (12): 1426-1432, 2008.

14. Mito, N., Ikegami, Y., Matsuno, H., Miyano, S., Inouye, S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics, 21: 212-223, 2008.

15. Nagasaki, M., Saito, A., Chen, L., Jeong, E., Miyano, S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology, 2: 53, 2008.

16. Niida, A., Smith, A.D., Imoto, S., Tsutsumi, S., Aburatani, H., Zhang, M.Q., Akiyama, T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics, 9: 404, 2008.

17. Numata, K., Yoshida, R., Nagasaki, M., Saito, S., Imoto, S., Miyano, S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics, 9: 494, 2008.

18. Numata, K., Imoto, S., Miyano, S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering, IEEE Computer Society, 357-360, 2008 (BIBM 2008, Refereed conference).

19. Perrier, E., Imoto, S., Miyano, S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research, 9: 2251-2286, 2008.

20. Yamaguchi, R., Imoto, S., Yamauchi, M., Nagasaki, M., Yoshida, R., Shimamura, T., Hatanaka, Y., Ueno, K., Higuchi, T., Gotoh, N., Miyano, S. Predicting differences in gene regulatory systems by state space models. Genome Informatics, 21: 101-113, 2008.

21. Yoshida, R., Nagasaki, M., Yamaguchi, R., Imoto, S., Miyano, S., Higuchi, T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics, 24 (22): 2592-2601, 2008.


Human Genome Center

Laboratory of Molecular Medicine
Laboratory of Genome Technology
ゲノムシークエンス解析分野
シークエンス技術開発分野

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami, and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P=0.0053) and poorer prognosis of non-small cell lung cancer patients (P=0.0045). It was also shown to be an independent prognostic factor (P=0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data implied that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in the tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) by cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with NSCLC (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The ECT2 cancer-testis antigen is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells; nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring in the cytokinesis process. The expression level of DTL/RAMP protein became highest at G(1)/S phases, whereas its phosphorylation level was enhanced during mitotic phase. Treatment of breast cancer cells, T47D and HBC4, with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrate the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase, Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis, and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in growth of RCC and that inhibition of TMEM22/RAB37 expression or their interaction could be a novel therapeutic approach for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)


We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of the FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex as well as activation of the Rac1-JNK cascade in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of the RhoA activity. Our results have strongly implied that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can intracellularly function as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and that it could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been indicated to play some roles in apoptosis. These findings implied that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage, some prostate cancers eventually acquire a castration-resistant phenotype, where they progress aggressively and show very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it should be a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 were reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma though not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression levels in thyroid cancer cells significantly attenuated viability of thyroid cancer cells, indicating the significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and apply such information for biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in the peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac(α2,6)Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated that there is a higher frequency of serum apoC-III with loss of α2,6-linked Neu5Ac residues in lung cancer patients compared to controls. Our results have demonstrated that lectin-coupled ProteinChip technology allows the high-throughput and specific recognition of cancer-associated aberrant glycosylations, and implied a possibility of its applicability to studies on other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week, and of those who required 49 mg or more per week, to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
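The evaluation criterion used in the validation cohort (the share of patients whose predicted dose falls within 20% of the actual stable dose) and the two dose strata (21 mg or less, 49 mg or more per week) are simple to compute. The sketch below only illustrates that metric; the function names and the toy doses are hypothetical, and it does not reproduce the consortium's dosing algorithm.

```python
def fraction_within(predicted, actual, tolerance=0.20):
    """Share of patients whose predicted weekly dose lies within
    `tolerance` (as a fraction) of the actual stable dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(1 for p, a in zip(predicted, actual)
               if abs(p - a) <= tolerance * a)
    return hits / len(actual)

def dose_group(weekly_dose_mg):
    """Stratum of a weekly dose, using the cut-offs given in the text."""
    if weekly_dose_mg <= 21:
        return "low"
    if weekly_dose_mg >= 49:
        return "high"
    return "intermediate"

# Toy example (made-up doses in mg per week, for illustration only).
predicted = [20, 30, 50, 35]
actual = [21, 40, 49, 35]
share = fraction_within(predicted, actual)
```

Comparing `fraction_within` for two sets of predictions on the same patients, overall and within each `dose_group`, corresponds to the kind of comparison reported between the pharmacogenetic and clinical algorithms.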

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5, and Hitoshi Zembutsu5. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Department of Surgery, Tokushima Breast Care Clinic; 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgical operation for breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele revealed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12), compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to be dependent on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.
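Odds ratios and hazard ratios are usually estimated on the log scale, so a reported point estimate should sit close to the geometric mean of its 95% confidence limits. The short check below applies that rule of thumb to the two estimates in this abstract, read as 16.63 (1.75-158.12) and 10.04 (1.17-86.27). It is a plausibility check under the assumption of a symmetric log-scale interval, not a re-analysis; the function names and the 2% tolerance are arbitrary choices.

```python
import math

def implied_estimate(ci_low, ci_high):
    """Geometric mean of the confidence limits, which equals the point
    estimate when the interval is symmetric on the log scale."""
    return math.sqrt(ci_low * ci_high)

def is_consistent(estimate, ci_low, ci_high, rel_tol=0.02):
    """True if the reported estimate is within rel_tol of the value
    implied by its confidence limits."""
    return abs(implied_estimate(ci_low, ci_high) - estimate) <= rel_tol * estimate

reported = [
    ("odds ratio, recurrence within 10 years", 16.63, 1.75, 158.12),
    ("adjusted hazard ratio, recurrence-free survival", 10.04, 1.17, 86.27),
]
checks = {label: is_consistent(est, lo, hi) for label, est, lo, hi in reported}
```

The very wide intervals are what one would expect from a study of 67 patients with few events in some genotype groups.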

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4, and Yusuke Nakamura1,3. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant for limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P<0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by the scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia as compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
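The summary statistics for the two-SNP score can be cross-checked against each other. The abstract does not give the 2x2 table, so the counts below are back-calculated assumptions: 39 cases (28 + 11) and 74 controls (56 + 18), with 69.2% of cases scoring 1 or 2 (27 of 39) and 75.7% of controls scoring 0 (56 of 74). With those inferred counts, the odds ratio and a Woolf (log-scale) 95% confidence interval come out close to the reported 7.00 (2.95-16.59). Whether the authors used the Woolf method is not stated; it is assumed here.

```python
import math

def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Odds ratio with Woolf's log-scale confidence interval.
    a: cases with score 1-2, b: cases with score 0,
    c: controls with score 1-2, d: controls with score 0."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# Inferred counts (assumption, not stated in the text).
cases_high, cases_zero = 27, 12        # 39 cases in total
controls_high, controls_zero = 18, 56  # 74 controls in total

sensitivity = cases_high / (cases_high + cases_zero)
specificity = controls_zero / (controls_high + controls_zero)
odds_ratio, ci_low, ci_high = odds_ratio_woolf(
    cases_high, cases_zero, controls_high, controls_zero)
```

That the percentages, the group sizes, and the odds ratio with its interval all agree under one table suggests the reported figures are internally consistent.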

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2, and Yusuke Nakamura1. 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of 2Pathology and 3Medicine, Faculty of Medicine, and 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; 5Center for International Cooperation, Department of Medical Sciences; 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in the first set of samples, which consisted of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. An HLA-B*3505 allele revealed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% observed in NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10] and 0.7% in the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4, and Koichi Matsuda1: 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; 4Center for Genomic Medicine, RIKEN; 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; 6Department of Hepatology, Toranomon Hospital; 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38, OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301, OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401, OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9, and the Pirfenidone Clinical Study Group4: 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; 5Fourth Department of Internal Medicine, Nippon Medical School; 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

To identify genes conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase. The combination of the two data sets gave a p value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports that rare mutations of TERT are found in patients with familial IPF, we suggest that common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9, and Draga I. Toncheva5: 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many of these studies have produced controversial and inconclusive results; however, it has been determined that each of the implicated candidates would independently have only a minor effect on susceptibility to the disease. Here we report results from our replication association study using 255 Bulgarian patients with schizophrenia or schizoaffective disorder and 556 healthy Bulgarian controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes previously implicated in disease susceptibility and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010, odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.



Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret the heterogeneity and to map the disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among the individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution that addresses them all together. We therefore decompose the difficulty into parts, namely the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and then attempt to improve the interpretation of GWA study data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, so their sample sizes must run to thousands or more, which makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One method to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can also correct Fisher's exact test statistics.
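For orientation, the conventional genomic control correction for 1-degree-of-freedom chi-square statistics can be sketched as follows. This shows only the standard chi-square form that the paragraph above takes as its starting point, not the laboratory's modification for Fisher's exact test, which is not described here. The function name and the choice to floor the inflation factor at 1 are assumptions of this sketch.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.454936


def genomic_control(chi2_stats):
    """Estimate the inflation factor lambda and deflate the statistics.

    chi2_stats: 1-df chi-square statistics from many (mostly null) markers.
    Returns (lambda, corrected statistics). Lambda is floored at 1 so that
    statistics are never inflated by the correction (an assumption here).
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)
    return lam, [stat / lam for stat in chi2_stats]


# Toy input: the median statistic is twice the null median, so lambda is 2.
lam, corrected = genomic_control([0.1, 0.909872, 5.0])
print(lam, corrected)
```

The correction relies on most markers being unassociated with the trait, so that the median statistic reflects population structure rather than true signal.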

Human Genome Center

Laboratory of Functional Genomics

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
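The discrepancy in question can be seen by computing both p-values for the same genotype table. Below is a minimal sketch, assuming the Freeman-Halton form of the exact test (summing the probabilities of all tables with the observed margins that are no more probable than the observed one) and the Pearson chi-square test with 2 degrees of freedom as the asymptotic counterpart. These choices and all function names are assumptions of the sketch, not the laboratory's implementation.

```python
from math import comb, exp


def table_prob(case_row, col_totals, n_cases, n_total):
    """Multivariate hypergeometric probability of the case row given margins."""
    ways = 1
    for count, col_total in zip(case_row, col_totals):
        ways *= comb(col_total, count)
    return ways / comb(n_total, n_cases)


def exact_2x3(cases, controls):
    """Exact p-value for a 2x3 table (rows: cases/controls, cols: genotypes)."""
    cols = [a + b for a, b in zip(cases, controls)]
    n_cases, n_total = sum(cases), sum(cols)
    p_obs = table_prob(cases, cols, n_cases, n_total)
    p_value = 0.0
    for x0 in range(min(cols[0], n_cases) + 1):
        for x1 in range(min(cols[1], n_cases - x0) + 1):
            x2 = n_cases - x0 - x1
            if x2 > cols[2]:
                continue
            p = table_prob((x0, x1, x2), cols, n_cases, n_total)
            if p <= p_obs * (1 + 1e-9):  # tolerance for floating-point ties
                p_value += p
    return p_value


def pearson_2x3(cases, controls):
    """Pearson chi-square and its asymptotic p-value (2 df).

    Assumes every genotype column has a non-zero total.
    """
    cols = [a + b for a, b in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls
    chi2 = 0.0
    for a, b, col_total in zip(cases, controls, cols):
        for observed, row_total in ((a, n_cases), (b, n_controls)):
            expected = row_total * col_total / n_total
            chi2 += (observed - expected) ** 2 / expected
    # For 2 df the chi-square survival function is exp(-x / 2).
    return chi2, exp(-chi2 / 2)


# Hypothetical genotype counts (AA, Aa, aa), for illustration only.
cases, controls = (12, 6, 2), (5, 9, 6)
print(exact_2x3(cases, controls), pearson_2x3(cases, controls)[1])
```

With small or sparse genotype counts, for example a rare homozygote column, the two p-values can differ noticeably, which is the regime in which the exact test matters most.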

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry, characterize the various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which exploits the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only the pair-wise dependence is used for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in support of research based on genetic heterogeneity.
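The conventional pair-wise measures referred to above can be sketched for two diallelic markers as follows. The code computes the standard D, D', and r² from haplotype and allele frequencies. It does not represent the multi-marker method under development, whose definition is not given here, and the function and parameter names are assumptions of the sketch.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Pair-wise LD between two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first marker
    p_b:  frequency of allele B at the second marker
    Returns (D, D', r^2).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Complete LD: the A and B alleles always occur on the same haplotype.
print(pairwise_ld(0.5, 0.5, 0.5))
# Independence: the haplotype frequency equals the product of allele frequencies.
print(pairwise_ld(0.25, 0.5, 0.5))
```

Because each value summarizes only one pair of markers, a region with many SNPs yields a matrix of such values rather than a single measure of dependence among the whole set.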

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We treat haplotype data from the standpoint of combinatorial theory and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT on the interpretation of genetic epidemiology data with conventional statistical methods. These groups include Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber the groups working on genetic statistics, both worldwide and in Japan. We have opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp



The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kind of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to the understanding of its cellular role represented by the networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism to analyze chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 2.53 in "gonad" to 19.53 in "endostyle", and during development they increased from 1.68 at the "egg" stage to 7.55 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation (DBTGR) was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole genome alignments between Ciona intestinalis and Ciona savignyi. The description of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, is now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi, obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles contain some shared structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and positioning with regard to the transcription start site, and also between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them to make a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that our model is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe² and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method and massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental result with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
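
As an illustration of how "G+C-rich" and "CpG-rich" can be quantified for a sequence window, the sketch below computes the G+C fraction and the widely used observed/expected CpG ratio. The report does not state which measure was actually applied, so the formula choice, the function names, and the toy sequence are assumptions made for this example.

```python
def gc_content(seq):
    """Fraction of bases in the window that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


# Hypothetical 10-bp window (uppercase A/C/G/T only), not real ascidian data.
window = "ACGCGTCGAT"
gc = gc_content(window)
oe = cpg_obs_exp(window)
```

In practice such values would be computed in sliding windows around each TSS, so that the width of the enriched region can be compared between species.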

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent the blueprint from giving further information on whether regulation is direct or indirect. In this study, we are computationally detecting direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we are predicting the binding sites by examining the orthologues. The regulation network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that not only the kernel components of the regulatory network that establishes the body plan but also the peripheral components that actually make up the building blocks of the body are included.
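
The scanning step described above can be sketched as follows: every window of a promoter sequence is scored against a position weight matrix, and windows scoring at or above a threshold are reported as candidate binding sites. The matrix values, the sequence, the threshold, and the function name are all hypothetical, and the ORI-based threshold optimization is not reproduced here.

```python
# Toy log-odds matrix for a 4-bp site; the values are illustrative only.
PWM = [
    {"A": 1.2, "C": -1.0, "G": -1.0, "T": -0.5},
    {"A": -1.0, "C": 1.5, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.5, "T": -1.0},
    {"A": -0.5, "C": -1.0, "G": -1.0, "T": 1.2},
]


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, round(score, 2)))
    return hits


# Hypothetical promoter fragment (uppercase A/C/G/T only).
hits = scan("TTACGTGGACGTAA", PWM, threshold=5.0)
```

A lower threshold yields more candidate targets at the cost of more false positives, which is why the choice of threshold has to be optimized for each matrix.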

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated for each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
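
A minimal sketch of how a pseudocount enters the construction of a PWM is given below. It assumes a uniform background and a small set of made-up aligned sites; the function name and the data are hypothetical, and the default of 0.8 is used only as the practical value suggested in this study.

```python
import math


def pwm_from_sites(sites, pseudocount=0.8, background=None):
    """Build a log2-odds PWM from aligned sites, adding a pseudocount per position."""
    background = background or {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    n = len(sites)
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        scores = {}
        for base, bg in background.items():
            # The pseudocount is split among the bases according to the background.
            freq = (column.count(base) + pseudocount * bg) / (n + pseudocount)
            scores[base] = math.log2(freq / bg)
        pwm.append(scores)
    return pwm


# Hypothetical aligned binding sites, not taken from any database.
sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
pwm = pwm_from_sites(sites)
```

Without the pseudocount, a base never observed at a position would receive a score of minus infinity; with it, unseen bases get a finite penalty whose size shrinks as the pseudocount grows.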

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹ and Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a 9.6% difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
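
The ranking of promoters by tag count can be sketched as below. The gene and promoter names and the tag counts are invented, and the `discrepancy` function uses one possible definition (the fraction of shared multi-promoter genes whose 1st/2nd promoter pair differs between two samples), which is an assumption rather than the measure used in the study.

```python
from collections import defaultdict


def rank_promoters(tag_counts):
    """Map each gene to its promoters, sorted by tag count (highest first)."""
    by_gene = defaultdict(list)
    for (gene, promoter), tags in tag_counts.items():
        by_gene[gene].append((promoter, tags))
    return {
        gene: [p for p, _ in sorted(proms, key=lambda item: -item[1])]
        for gene, proms in by_gene.items()
    }


def discrepancy(ranks_a, ranks_b):
    """Fraction of shared multi-promoter genes whose top two promoters differ."""
    shared = [g for g in ranks_a
              if g in ranks_b and len(ranks_a[g]) > 1 and len(ranks_b[g]) > 1]
    differing = sum(ranks_a[g][:2] != ranks_b[g][:2] for g in shared)
    return differing / len(shared) if shared else 0.0


# Hypothetical tag counts keyed by (gene, promoter) for two cell types.
cell_a = {("geneX", "p1"): 120, ("geneX", "p2"): 30,
          ("geneY", "p1"): 10, ("geneY", "p2"): 50}
cell_b = {("geneX", "p1"): 80, ("geneX", "p2"): 40,
          ("geneY", "p1"): 70, ("geneY", "p2"): 20}

ranks_a = rank_promoters(cell_a)
ranks_b = rank_promoters(cell_b)
fraction = discrepancy(ranks_a, ranks_b)
```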

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analyses of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element, and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
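
To illustrate how a dinucleotide periodicity can be detected by Fourier analysis, the sketch below evaluates the power of a dinucleotide indicator signal at a chosen period. The synthetic sequence, the choice of "AA", and the function name are assumptions for this example; the actual analysis pipeline is not described in this report.

```python
import cmath


def periodicity_power(sequence, dinucleotide, period):
    """Power of the dinucleotide indicator signal at the given period (in bp)."""
    n = len(sequence) - 1  # number of dinucleotide positions
    total = 0j
    for k in range(n):
        if sequence[k:k + 2] == dinucleotide:
            total += cmath.exp(-2j * cmath.pi * k / period)
    return abs(total) ** 2 / n


# Synthetic sequence with "AA" placed every 10 bp; not real genomic data.
seq = "AACGTCGTCG" * 30
power_10 = periodicity_power(seq, "AA", 10)
power_7 = periodicity_power(seq, "AA", 7)
```

Occurrences spaced at the tested period add up in phase and give a large power, whereas at other periods the contributions largely cancel; scanning over a range of periods would reveal peaks such as those reported at 84 bp and 167 bp.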

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by the reduction of the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular function instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, like living cells. Then, the organism-specific pathway maps are detected as those that can synthesize compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step to realize complex biological functions; therefore, understanding of the protein-protein interfaces will give us a clue to predict the protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, that is, assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indexes enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indexes could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify its interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer working with typical web browsers, and the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, which could contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method to discriminate between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult to bury.
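
As a rough geometric illustration of why SVR falls as proteins get larger, consider an idealized sphere of radius r. This is an assumption made purely for scale: real proteins are not spherical, and the study used accessible surface areas and volumes computed from structures.

```latex
\mathrm{SVR} = \frac{S}{V} = \frac{4\pi r^{2}}{\tfrac{4}{3}\pi r^{3}} = \frac{3}{r},
\qquad r = 10\ \text{\AA} \;\Rightarrow\; \mathrm{SVR} = 0.3\ \text{\AA}^{-1},
\qquad r = 20\ \text{\AA} \;\Rightarrow\; \mathrm{SVR} = 0.15\ \text{\AA}^{-1}
```

Doubling the radius halves the ratio, so surface-exposed positions become relatively scarcer in larger bodies.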

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important to obtain high-resolution structural information and to understand the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita and Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activities, we carried out a similarity search of atomic configurations of the active site of PPIase against the known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activities, which had not been considered at all. In addition, we created a similar hole in barnase, which is an enzyme with ribonuclease activity and without PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, in GO networks or in tissue networks, we used Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.
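
A list of "highly coexpressed genes for every gene" can be sketched with a simple correlation ranking, as below. Pearson correlation is used here only as a familiar stand-in; the coexpression measure actually used by COXPRESdb is not specified in this report, and the gene names and expression values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def top_coexpressed(expression, gene, k=2):
    """Return the k genes whose profiles correlate best with the query gene."""
    scores = {other: pearson(expression[gene], profile)
              for other, profile in expression.items() if other != gene}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical expression profiles across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
partners = top_coexpressed(expression, "geneA")
```

Running the same ranking for every gene and linking each gene to its top partners gives a coexpression network of the kind listed under (i).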

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared those trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the whole membrane undulation in addition to decreasing local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba H, Yamashita R, Kinoshita K, and Nakai K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl Acids Res 36: D793-D799, 2008.

Hatada I, Morita S, Kimura M, Horii T, Yamashita R, and Nakai K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J Human Genet 53 (2): 185-191, 2008.

Hatanaka Y, Nagasaki M, Yamaguchi R, Obayashi T, Numata K, Imoto S, Shimamura T, Kinoshita K, Nakai K, and Miyano S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi M, Ishida T, and Kinoshita K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res 37: D360-364, 2009.

Ikura T, Kinoshita K, and Ito N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng Des Sel 21: 83-89, 2008.

Ishida T and Kinoshita K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda M and Kinoshita K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J Mol Graph Mod 27: 706-711, 2009.

Miura K, Toh H, Hirakawa H, Sugii M, Murata M, Nakai K, Tashiro K, Kuhara S, Azuma Y, and Shirai M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res 15 (2): 83-91, 2008.

Murakami K, Imanishi T, Gojobori T, and Nakai K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida K, Frith M, and Nakai K. Pseudocounts for transcription factor binding sites. Nucl Acids Res 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi T, Hayashi S, Shibaoka M, Saeki M, Ohta H, and Kinoshita K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res 36: D77-82, 2008.

Obayashi T, Hayashi S, Saeki M, Ohta H, and Kinoshita K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-991, 2009.

Okamura K and Nakai K. Retrotransposition as a source of new promoters. Mol Biol Evol 25 (6): 1231-1238, 2008.

Sierro N, Makita Y, de Hoon M, and Nakai K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl Acids Res 36: D93-D96, 2008.

Sierro N, Li S, Suzuki Y, Yamashita R, and Nakai K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota M, Ishida T, and Kinoshita K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci 17: 1596-1602, 2008.

Tsuchihara K, Suzuki Y, Wakaguri H, Irie T, Tanimoto K, Hashimoto S, Matsushima K, Mizushima-Sugano J, Yamashita R, Nakai K, Bentley D, Esumi H, and Sugano S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl Acids Res, in press.

Tsuchiya Y, Nakamura H, and Kinoshita K. Discrimination between biological interfaces and crystal-packing contacts. Compt Biol Chem 1: 99-113, 2008.

Vandenbon A, Miyamoto Y, Takimoto N, Kusakabe T, and Nakai K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res 15 (1): 3-11, 2008.

Vandenbon A and Nakai K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur J and Ng S-K (Imperial College Press, London), vol 21, pp 188-199, 2008.

Wakaguri H, Yamashita R, Suzuki Y, Sugano S, and Nakai K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl Acids Res 36: D97-D101, 2008.

Yamashita R, Suzuki Y, Takeuchi N, Wakaguri H, Ueda T, Sugano S, and Nakai K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) genes and analysis of their characteristics. Nucl Acids Res 36 (11): 3707-3715, 2008.

Kinoshita K, Kono H, and Yura K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki J (Wiley and Sons, USA), in printing, 2009.

Ikura T, Kinoshita K, and Ito N. Structure-function relationship of peptidyl-prolyl isomerases (in Japanese). Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid and Enzyme) 54: 167-172, 2009.

Kinoshita K. Protein function prediction from three-dimensional structures: current status and prospects (in Japanese). Idenshi Igaku MOOK No. 14, in press.

Nakai K and Horton P. Chapter 3, Section 3: Prediction of protein subcellular localization based on amino acid sequences (in Japanese). Jikken Igaku (Experimental Medicine) extra issue, vol 26: 1106-1112, 2008.

Nakai K. Systems biology of proteins (in Japanese). In: Encyclopedia of Proteins (Tanpakushitsu no Jiten), edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura, and Hamakubo (Asakura Shoten), 575-578, 2008.


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study on research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and unethical human egg collection, the bioethics law was revised, and the government is seeking stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing in terms of regulations in East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey on past ethical reviews and a comparative study on research ethics review systems in the US, the UK and South Korea, we identified current problems that tend to stall a smooth research review process, so as to secure quality assurance of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, LL.M.

3 Ethical legal and social support for ldquoBio-Bank Japanrdquo project

For supporting ldquoBioBank Japanrdquo project ledby Professor Yusuke Nakamura of Laboratory ofMolecular Medicine of IMSUT wersquove conductedthree types of surveys and issued newslettersfor participants By the end of 2007 the projecthas obtained 200000 written consent forms byresearch coordinators called Medical Coordina-tors (MC) The project trained nurses or phar-macists as MCs for obtaining free and fully in-formed consent from participants We con-ducted our questionnaire survey to participantsof the BioBank Japan Project Our data showsthat the younger participants thought that theirpersonal analyzed data should be disclosed Theconsent process had been well-worked out inadvance and is fully complied with the govern-ment ethical guidelines for geneticgenomic re-search However recent publications show thatthe long and tedious consent process may notcontribute to participantsrsquo understanding theoverview of the research may be unethicalrather than ethical If we long for ldquopersonalizedmedicinerdquo we should think further about theconstruction of ldquopersonalized consent processrdquoand we have to change the relationship betweenparticipants and researchers from one-time in-formed consent to long lasting public trust

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotype data to each participant, but a certain number of participants responded that they now want to see their own genotype data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget that they gave consent for research, MCs explain, encourage, and thank participants on each occasion, and participants recall their will to contribute.

To show appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current form of the BioBank newsletters is accessible only to sighted people with good eyesight, we are working on personalized ways of providing information that accommodate participants' disabilities.

4. SciArt Café

Under the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim to share public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing the 3rd session for early summer 2009.

Publications

1. Ishiyama I, Nagai A, Muto K, Tamakoshi A, Kokado M, Mimura K, Tanzawa T, Yamagata Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2. 洪賢秀. 韓国社会における子どもの「性保護」と性犯罪防止対策. 比較法研究 70号, 2009 (印刷中).

3. 神里彩子, 成澤光 編著. 生殖補助医療 生命倫理と法―基本資料集3. 信山社, 21-123, 262-308, 2008.

4. 張瓊方. 諸外国における生殖補助医療の規制状況と実施状況(台湾). 生殖補助医療 生命倫理と法―基本資料集3, 神里彩子, 成澤光 編, 信山社, 323-334, 2008.

5. 大上泰弘, 神里彩子, 城山英明. イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆. 社会技術研究論文集 5号, 132-142, 2008.

6. 大上泰弘, 成廣孝, 神里彩子, 城山英明, 打越綾子. 日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析. ヒトと動物の関係学会誌 20号, 66-73, 2008.

7. 大上泰弘, 神里彩子, 城山英明. イギリスにおける動物の実験規制を支えている思考様式. 科学技術社会論研究 5号, 84-92, 2008.

8. 渡部麻衣子, 上田昌文. 人の必要を充足する科学技術福祉工学における開発現場の分析. 科学技術社会研究, 138-151, 2008.

9. 武藤香織. 「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―. 日米の医療―制度と倫理, 杉田米行 編, 大阪大学出版会, 203-224, 2008.

10. 武藤香織. DNA親子鑑定は「ふしだらな」女性にとっての救済策か. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 238-264, 2008.

11. 洪賢秀. 研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 196-214, 2008.

12. 張瓊方. 生殖技術と台湾社会. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 215-222, 2008.

13. 三村恭子, 小門穂, 武藤香織, 張瓊方, 洪賢秀, 柘植あづみ. 女性にやさしい機械のつくられ方―内診台を例にして. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 223-240, 2008.

14. 神里彩子. 生殖補助医療をめぐる議論―その回顧と展望―. 家永登 編『生殖技術と家族』, 早稲田大学出版部, 42-71, 2008.

15. 渡部麻衣子, 上田昌文 編訳. エンハンスメント論争 身体精神の増強と先端科学技術. 社会評論社, 2008.


10. Linear-time protein 3-D structure searching algorithm

Tetsuo Shibuya

Finding similar structures in 3-D structure databases of proteins (or other molecules) is becoming one of the most important issues in post-genomic molecular biology. To compare the 3-D structures of two molecules, biologists mostly use the RMSD (root mean square deviation) as the similarity measure. The RMSD is one of the most fundamental similarity measures for comparing two sets of coordinates and is used in various fields, such as computer vision and robotics. In this research, we propose new theoretically and practically fast algorithms for the basic problem of finding all the substructures of structures in a structure database of chain molecules (such as proteins) whose RMSDs to the query are within a given constant threshold. The best-known worst-case time complexity for the problem is $O(N \log m)$, where $N$ is the database size and $m$ is the query size. The previous best-known expected time complexity for the problem is also $O(N \log m)$. In this research, we propose a new linear-expected-time algorithm. It is not only a theoretically significant improvement over previous algorithms but also faster in practice, according to computational experiments. We also propose a series of preprocessing algorithms that enable faster queries, though no indexing algorithm was previously known whose query time complexity is better than the above $O(N \log m)$ bound. One is an $O(N \log^2 N)$-time and $O(N \log N)$-space preprocessing algorithm with expected query time complexity of $O(m + N/m^{0.5})$. Another is an $O(N \log N)$-time and $O(N)$-space preprocessing algorithm with expected query time complexity of $O(N/m^{0.5} + m \log(N/m))$.

We also extend the above linear-time algorithm into an algorithm with expected query time complexity of $O(m + N/m^{1-\varepsilon})$, where $\varepsilon$ is an arbitrarily small constant such that $0 < \varepsilon < 1$. We furthermore extend the above linear-time algorithm so that it can deal with insertions and deletions.

We checked the performance of our linear-expected-time algorithm through computational experiments over the whole PDB database. The experiments show that our algorithm is much faster than the previous algorithms. For example, our algorithm is 3.6 to 28 times faster than previously known algorithms when searching for similar substructures whose RMSDs are within 1 Å of queries of ordinary lengths. The experiments also show that the theoretical results above are consistent with the experimental results. In other words, the actual computation time of our linear-expected-time algorithm is not influenced by differences in query length, in contrast to previous algorithms.
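For reference, the similarity measure behind this search problem can be written out explicitly. The following is a sketch of the standard RMSD definition under optimal rigid superposition; the symbols $P$, $Q$, $R$, $v$, and the threshold $c$ are notation introduced here for illustration and are not taken from the report.

```latex
% P = (p_1, ..., p_m) and Q = (q_1, ..., q_m) are two coordinate sequences of equal length m.
% R ranges over 3-D rotation matrices and v over translation vectors.
\mathrm{RMSD}(P, Q) \;=\; \min_{R,\, v} \sqrt{\frac{1}{m} \sum_{i=1}^{m} \bigl\lVert p_i - (R\, q_i + v) \bigr\rVert^{2}}
```

With this notation, the search problem asks for every length-$m$ substructure $P$ in a database of total size $N$ such that $\mathrm{RMSD}(P, Q) \le c$ for the query $Q$.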

11. Fast hinge detection algorithm in protein structures

Tetsuo Shibuya

Analysis of conformational changes is one of the keys to understanding protein functions and interactions. For this analysis, we often compare two protein structures, taking flexible regions such as hinge domains into consideration. The RMSD (root mean square deviation) is the most popular measure for comparing two protein structures, but it applies only to rigid structures without hinge domains. In this research, we propose a new measure called RMSDh (root mean square deviation considering hinges) and its variant RMSDh(k) for comparing two flexible proteins with hinge domains. We also propose novel, efficient algorithms for computing them, which detect the hinge positions at the same time. The RMSDh is suitable for cases where there is one small hinge domain in each of the two target structures. The new algorithm for computing the RMSDh runs in linear time, which is the same as the time complexity of computing the RMSD and is faster than any previous algorithm for hinge detection. The RMSDh(k) is designed for comparing structures with more than one hinge domain. The RMSDh(k) measure considers at most k small hinge domains; i.e., the RMSDh(k) value should be small if the two structures are similar except for at most k hinge domains. To compute the value, we propose an $O(kn^2)$-time and $O(n)$-space algorithm based on a new dynamic programming technique. We also tested our measures against both flexible and non-flexible protein structures and showed that the hinge positions can be correctly detected by our algorithms.

12. Fast flexible protein structure alignment

Kohichi Suematsu and Tetsuo Shibuya

The hinge detection algorithm described in section 11 considers only rigid hinge points, but hinges sometimes bend a little themselves, which can lead to inaccurate prediction of hinge positions. Thus we incorporated the notion of a 'bending hinge' to detect such hinge positions. We developed a very efficient heuristic algorithm for finding such bending hinges, as the exact algorithm for this problem requires exponential time. For the algorithm, we developed a detailed score matrix for comparing local structures based on naïve Bayes learning.

13. Protein function prediction based on 3-D structure motifs

Chia-Han Chu, Hiroki Sakai, and Tetsuo Shibuya

Protein functions are said to be determined by 3-D structure, but not all functions are known to be related to particular 3-D structure motifs. The geometric suffix tree, a data structure for indexing 3-D protein structures that we also developed, enables comprehensive enumeration of all the possible structural motifs among a given set of proteins. We are developing a new algorithm based on the support vector machine that predicts a protein's function from its 3-D structure. This algorithm utilizes all the possible 3-D motifs found by using the geometric suffix tree.

14. Suffix array construction with a lazy scheme

Ben Hachimori and Tetsuo Shibuya

The suffix array is one of the most important indexing data structures for strings over an alphabet, including DNA sequences, RNA sequences, protein sequences, web pages, the Medline database, and so on. But even the most sophisticated algorithms for constructing the suffix array require a lot of time. We developed a new, efficient lazy algorithm that computes the suffix array only after a query arrives. By doing so, we have to compute only the necessary part of the suffix array. Our lazy algorithm is based on the Schürmann-Stoye algorithm and is more efficient than both the Boyer-Moore algorithm and other suffix tree-based algorithms when the number of queries is limited.
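The idea of sorting suffixes only when a query needs them can be illustrated with a small sketch. This is not the Schürmann-Stoye-based algorithm from the report; it is a simplified stand-in that splits suffixes into buckets by first character and sorts a bucket only the first time a query touches it. The class name `LazySuffixIndex` and its methods are hypothetical.

```python
from bisect import bisect_left


class LazySuffixIndex:
    """Toy index that sorts suffixes only for buckets a query actually touches."""

    def __init__(self, text):
        self.text = text
        self._buckets = {}  # first character -> sorted list of suffix start positions

    def _bucket(self, ch):
        # Build (and cache) the sorted suffixes starting with `ch` on first use.
        if ch not in self._buckets:
            starts = [i for i, c in enumerate(self.text) if c == ch]
            starts.sort(key=lambda i: self.text[i:])
            self._buckets[ch] = starts
        return self._buckets[ch]

    def find(self, pattern):
        """Return the sorted start positions of all occurrences of `pattern`."""
        if not pattern:
            raise ValueError("pattern must be non-empty")
        bucket = self._bucket(pattern[0])
        # Truncating sorted suffixes to the pattern length keeps them sorted,
        # so a binary search locates the first possible match.
        prefixes = [self.text[i:i + len(pattern)] for i in bucket]
        pos = bisect_left(prefixes, pattern)
        hits = []
        while pos < len(bucket) and prefixes[pos] == pattern:
            hits.append(bucket[pos])
            pos += 1
        return sorted(hits)


# Hypothetical usage: only the buckets for "A" and "C" are ever sorted here.
index = LazySuffixIndex("GATTACAGATTACA")
positions_atta = index.find("ATTA")
positions_cag = index.find("CAG")
```

When only a few queries are issued, most buckets are never sorted, which is the saving a lazy scheme aims for; the toy version rebuilds the prefix list on every query and makes no attempt at the efficiency of the actual algorithm.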

15. Color space-DNA sequence mapping/alignment algorithm

Ben Hachimori and Tetsuo Shibuya

Applied Biosystems' SOLiD system encodes a DNA sequence as a sequence in a data type called color space, in which one of 4 fluorescent colors is assigned to each of the 16 ordered patterns of two adjacent bases. However, no algorithm has been known that aligns/maps a color-space sequence to a DNA sequence while taking into account the difference between experimental errors and actual mutations. We developed an alignment algorithm that distinguishes experimental errors from actual DNA mutations when aligning color-space data against ordinary DNA sequences. Moreover, we computed the optimal score table for the alignment based on actual E. coli data.
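To see why errors and mutations look different in color space, a minimal sketch of two-base encoding may help. It assumes the commonly described mapping in which each base gets a 2-bit code (A=0, C=1, G=2, T=3) and the color of an adjacent pair is the XOR of the two codes; the numeric color labels and function names are illustrative choices, and this is not the alignment algorithm from the report.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_color_space(dna):
    """Return the leading base plus one color (0-3) per adjacent base pair."""
    colors = [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(dna, dna[1:])]
    return dna[0], colors


def decode_color_space(first_base, colors):
    """Rebuild the base sequence by walking the colors from the known first base."""
    bases = [first_base]
    for color in colors:
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ color])
    return "".join(bases)


def color_mismatches(read_colors, ref_colors):
    """Positions where a color read differs from the reference colors."""
    return [i for i, (r, f) in enumerate(zip(read_colors, ref_colors)) if r != f]


reference = "ACGTACGT"
_, ref_colors = encode_color_space(reference)

# A true single-base mutation (T -> G at position 3) changes two adjacent colors.
_, snp_colors = encode_color_space("ACGGACGT")
snp_positions = color_mismatches(snp_colors, ref_colors)

# A single measurement error in one color changes every decoded base after it.
error_colors = list(ref_colors)
error_colors[2] = 0
decoded_with_error = decode_color_space("A", error_colors)
```

In this toy case an isolated color mismatch points to a measurement error, while two adjacent mismatches that decode consistently point to a real substitution; an aligner that scores these two patterns differently can separate them.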

16. Genotype clustering based on hidden Markov models

Ritsuko Onuki, Tetsuo Shibuya, and Minoru Kanehisa

Haplotype clustering is important for gene mapping of human diseases. Despite its importance for the analysis, haplotype data are difficult to obtain with present experimental techniques because of their cost and error rate. Genotypes are much easier to obtain than haplotypes. In this work, we propose a new method for clustering genotypes. In the algorithm, we first infer multiple haplotype candidates from each genotype, and next we calculate the distances between genotypes based on the results of the haplotype inference. Then we perform genotype clustering based on the distances. We evaluated our algorithm by applying it to several actual genotype data sets.
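The first two steps (haplotype candidates, then a genotype distance) can be sketched in a deliberately simplified form. The sketch below replaces the hidden Markov model inference with brute-force enumeration of all haplotype pairs consistent with a genotype, and the distance function is a hypothetical choice made only for illustration. Genotypes are assumed to be coded per site as 0 (homozygous reference), 1 (heterozygous), or 2 (homozygous alternative).

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with a 0/1/2-coded genotype."""
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    base = [1 if g == 2 else 0 for g in genotype]  # alleles fixed at homozygous sites
    pairs = set()
    for phase in product((0, 1), repeat=len(het_sites)):
        h1, h2 = list(base), list(base)
        for site, allele in zip(het_sites, phase):
            h1[site], h2[site] = allele, 1 - allele
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def hamming(a, b):
    """Number of sites at which two haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def genotype_distance(g1, g2):
    """Hypothetical distance: cost of the best haplotype pairing over all candidates."""
    best = None
    for a1, a2 in compatible_haplotype_pairs(g1):
        for b1, b2 in compatible_haplotype_pairs(g2):
            d = min(hamming(a1, b1) + hamming(a2, b2),
                    hamming(a1, b2) + hamming(a2, b1))
            best = d if best is None else min(best, d)
    return best
```

A genotype with k heterozygous sites has 2^(k-1) candidate pairs for k >= 1, so exhaustive enumeration is only workable for short windows; this is one reason a probabilistic model is used to keep a limited number of likely candidates instead.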

Publications

Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, Katayama T, Kawashima S, Okuda S, Tokimatsu T, and Yamanishi Y. KEGG for linking genomes to life and the environment. Nucleic Acids Research 36: D480-D484, 2008.

Kawashima S, Pokarowski P, Pokarowska M, Kolinski A, Katayama T, Kanehisa M. AAindex: amino acid index database, progress report 2008. Nucleic Acids Research 36: D202-D205, 2008.

Okuda S, Yamada T, Hamajima M, Itoh M, Katayama T, Bork P, Goto S, and Kanehisa M. KEGG Atlas mapping for global analysis of metabolic pathways. Nucleic Acids Research 36: W423-426, 2008.

Wakaguri H, Suzuki Y, Katayama T, Kawashima S, Kibukawa E, Hiranuka K, Sasaki M, Sugano S, and Watanabe J. Full-Malaria/Parasites and Full-Arthropods: database of full-length cDNAs of parasites and arthropods, update 2009. Nucleic Acids Research 37: D520-D525, 2008.

Yamanishi Y, Araki M, Gutteridge A, Honda W, and Kanehisa M. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24: i232-i240, 2008.

Takarabe M, Okuda S, Itoh M, Tokimatsu T, Goto S, and Kanehisa M. Network analysis of adverse drug interactions. Genome Informatics 20: 252-259, 2008.

Hashimoto K, Yoshizawa AC, Okuda S, Kuma K, Goto S, and Kanehisa M. The repertoire of desaturases and elongases reveals fatty acid variations in 56 eukaryotic genomes. J. Lipid Res. 49: 183-191, 2008.

Shibuya T. Fast Hinge Detection Algorithms for Flexible Protein Structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics, to appear.

Shibuya T. Searching Protein 3-D Structures in Linear Time. Proc. 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2009), 2009, to appear.

Shibuya T. Linear-Time Algorithm for Searching Protein 3-D Structures. IPSJ SIG Notes, SIGAL 123-4, 2009, to appear.

Suematsu K, Shibuya T. Flexible Protein Alignment of 3D-Structures Allowing Dynamic Transformation. IPSJ SIG Notes, SIGBIO 12-12, 2008, pp. 87-94.

本多渉, 田辺麻央, 矢野亜津子, 金久實. バイオインフォマティクスシステムバイオロジーとKEGG. 生化学 80: 1094-1111, 2008.

Recent advances in biomedical research have been producing large-scale, ultra-high dimensional, ultra-heterogeneous data. In light of this post-genomic research progress, our current mission is to create computational strategies for systems biology and medicine, towards translational bioinformatics. With this mission, we have been developing computational methods for understanding life as a system and applying them to practical issues in medicine and biology.

1. Computational Systems Biology

a. Systematic reconstruction of TRANSPATH data into Cell System Markup Language

Masao Nagasaki, Ayumu Saito, Chen Li, Euna Jeong, Satoru Miyano

Many biological repositories store information based on experimental studies of the biological processes within a cell, such as protein-protein interactions, metabolic pathways, signal transduction pathways, and regulation by transcription factors and miRNA. Unfortunately, it is difficult to use such information directly when generating simulation-based models. Thus, modeling rules for encoding biological knowledge into system-dynamics-oriented standardized formats would be very useful for a full understanding of cellular dynamics at the system level. We selected the TRANSPATH database, a manually curated, high-quality pathway database that provides a rich source of cellular events in humans, mice, and rats, curated from over 31,500 papers. In this work, we defined 16 modeling rules based on the hybrid functional Petri net with extension (HFPNe), which is suitable for graphical representation and simulation of biological processes. In these modeling rules, each Petri net element is associated with the Cell System Ontology (CSO) to enable semantic interoperability of models. As a formal ontology for biological pathway modeling with dynamics, CSO also defines biological terminology and corresponding icons. By combining HFPNe with the CSO features, we devised a method for transforming TRANSPATH data into simulation-based, semantically valid models. The results are encoded in a biological pathway format, Cell System Markup Language (CSML), which eases the exchange and integration of biological data and models. By using the 16 modeling rules, 97% of the reactions in TRANSPATH were converted into simulation-based models represented in CSML. This reconstruction demonstrated that it is possible to use our rules to generate quantitative models from static pathway descriptions.
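As a rough illustration of how a static reaction record becomes something that can be simulated, the sketch below models one reaction as a transition in a plain discrete Petri net. HFPNe also has continuous and generic elements that are not represented here, and the place names, the dictionary layout, and the binding reaction are hypothetical examples rather than TRANSPATH or CSML content.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(p, 0) >= w for p, w in transition["inputs"].items())


def fire(marking, transition):
    """Return the new marking after firing one enabled transition."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, weight in transition["inputs"].items():
        new_marking[place] = new_marking.get(place, 0) - weight
    for place, weight in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + weight
    return new_marking


# Hypothetical binding reaction: ligand + receptor -> complex
binding = {"inputs": {"ligand": 1, "receptor": 1}, "outputs": {"complex": 1}}
marking = {"ligand": 2, "receptor": 1, "complex": 0}
after = fire(marking, binding)
```

A modeling rule in this spirit maps each kind of reaction (binding, modification, translocation, and so on) to a fixed pattern of places, transitions, and arcs, so that a whole database can be converted mechanically.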

Human Genome Center

Laboratory of DNA Information Analysis (DNA情報解析分野)

Professor Satoru Miyano, Ph.D. (宮野悟)
Associate Professor Seiya Imoto, Ph.D. (井元清哉)
Assistant Professor Masao Nagasaki, Ph.D. (長﨑正朗)
Project Lecturer Rui Yamaguchi, Ph.D. (山口類)
Project Assistant Professor Yoshinori Tamada, Ph.D. (玉田嘉紀)

b. Finding optimal Bayesian network given a super-structure

Eric Perrier, Seiya Imoto, Satoru Miyano

Conventional approaches for learning Bayesian network structure from data have disadvantages in terms of complexity and the lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm significantly improves accuracy and speed: it learns a skeleton with an independency test (IT) approach and constrains the directed acyclic graphs considered during the search-and-score phase. Subsequently, we defined the structural constraint by introducing the concept of a super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We developed a super-structure constrained optimal search (COS): its time complexity is upper bounded by $O(\gamma_m^n)$, where $\gamma_m < 2$ depends on the maximal degree $m$ of S. Empirically, complexity depends on the average degree $m'$, and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. In practice, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, making it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches.
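How a super-structure shrinks the search space can be shown with a small sketch. It only illustrates the constraint itself (candidate parents of a node must be its neighbors in S, and every edge of a candidate network must appear in S once direction is dropped); it is not the COS algorithm, and the function names and the example graph are hypothetical.

```python
from itertools import combinations


def neighbors(super_structure, node):
    """Nodes adjacent to `node` in the undirected super-structure S."""
    return sorted({b if a == node else a
                   for a, b in super_structure if node in (a, b)})


def candidate_parent_sets(super_structure, node):
    """All parent sets for `node` that use only its neighbors in S."""
    nbrs = neighbors(super_structure, node)
    return [set(c) for k in range(len(nbrs) + 1) for c in combinations(nbrs, k)]


def respects_super_structure(dag_edges, super_structure):
    """True when every directed edge is an edge of S once direction is dropped."""
    allowed = {frozenset(e) for e in super_structure}
    return all(frozenset(e) in allowed for e in dag_edges)


# Hypothetical chain-shaped super-structure over four variables.
S = [("A", "B"), ("B", "C"), ("C", "D")]
parents_of_b = candidate_parent_sets(S, "B")
```

A node with degree d in S has 2^d candidate parent sets instead of 2^(n-1) in an unconstrained search over n variables, which is why sparse super-structures make larger networks tractable.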

c. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models

Osamu Hirose, Ryo Yoshida¹, Seiya Imoto, Rui Yamaguchi, Tomoyuki Higuchi¹, D. Stephen Charnock-Jones², Cristin Print³, Satoru Miyano: ¹Institute of Statistical Mathematics, ²Cambridge University, ³University of Auckland

We developed a novel method based on the state space model to identify transcriptional modules and module-based gene networks simultaneously. The state space model has the potential to infer large-scale gene networks, e.g., of order $10^3$, from time-course gene expression profiles. In particular, we succeeded in identifying a cell cycle system by using the gene expression profiles of Saccharomyces cerevisiae, in which the length of the time course and the number of genes were 24 and 4382, respectively. However, when analyzing shorter time-course data, e.g., of length 10 or less, parameter estimation for the state space model often fails due to overfitting. To extend the applicability of the state space model, we provided an approach that uses the technical replicates of gene expression profiles, which are often measured in duplicate or triplicate. The use of technical replicates is important for achieving highly efficient inference of gene networks with short time-course data. The potential of the proposed method was demonstrated through the time-course analysis of the gene expression profiles of human umbilical vein endothelial cells undergoing growth factor deprivation-induced apoptosis.

d. Predicting differences in gene regulatory systems by state space models

Rui Yamaguchi, Seiya Imoto, Mai Yamauchi, Masao Nagasaki, Ryo Yoshida¹, Teppei Shimamura, Yosuke Hatanaka, Kazuko Ueno, Tomoyuki Higuchi¹, Noriko Gotoh, Satoru Miyano

We developed a statistical method to predict differentially regulated genes between case and control samples from time-course gene expression data by leveraging the unpredictability of the expression patterns from the underlying regulatory system inferred by a state space model. The proposed method can screen out genes that show different patterns but are generated by the same regulation in both samples, since these patterns can be predicted by the same model. Our strategy consists of three steps. First, a gene regulatory system is inferred from the control data by a state space model. Then, the obtained model for the underlying regulatory system of the control sample is used to predict the case data. Finally, by assessing the significance of the difference between the case and predicted-case time-course data of each gene, we are able to detect the unpredictable genes, which are candidates for the key differences between the regulatory systems of case and control cells. We illustrate the whole process of the strategy with an actual example, in which gene regulatory systems of human small airway epithelial cells were generated from novel time courses of gene expression following treatment with (case) or without (control) the drug gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase. In the gefitinib response data, we succeeded in finding unpredictable genes that are candidates for specific targets of gefitinib. We also discussed differences in the regulatory systems for the unpredictable genes. The proposed method would be a promising tool for identifying biomarkers and drug target genes.

e. Bayesian learning of biological pathways on genomic data assimilation

Ryo Yoshida¹, Masao Nagasaki, Rui Yamaguchi, Seiya Imoto, Satoru Miyano, Tomoyuki Higuchi¹

Mathematical modeling and simulation based on biochemical rate equations provide a rigorous tool for unraveling the complex mechanisms of biological pathways. To proceed to simulation experiments, an essential first step is to find effective values of the model parameters, which are difficult to measure in in vivo and in vitro experiments. Furthermore, once a set of hypothetical models has been created, some statistical criterion is needed to test the ability of the constructed models and to proceed to model revision. We developed a new statistical technology for the data-driven construction of in silico biological pathways. The method starts with knowledge-based modeling with the hybrid functional Petri net. It then proceeds to Bayesian learning of the model parameters for which experimental data are available. This process exploits quantitative measurements of evolving biochemical reactions, e.g., gene expression data. Another important issue that we consider is the statistical evaluation and comparison of the constructed hypothetical pathways. For this purpose, we have developed a new Bayesian information-theoretic measure that assesses the predictability and the biological robustness of in silico pathways.

f. Modeling nonlinear gene regulatory networks from time series gene expression data

André Fujita, João Ricardo Sato⁵, Humberto Miguel Garay-Malpartida⁵, Mari Cleide Sogayar⁵, Carlos Eduardo Ferreira⁵, Satoru Miyano: ⁵University of São Paulo

In cells, molecular networks such as gene regulatory networks are the basis of biological complexity. Therefore, gene regulatory networks have become the core of research in systems biology. Understanding the processes underlying the various extracellular regulators, signal transduction, protein-protein interactions, and differential gene expression requires a detailed molecular description of the protein and gene networks involved. To better understand these complex molecular networks and to infer new regulatory associations, we developed a statistical method based on vector autoregressive models and Granger causality to estimate nonlinear gene regulatory networks from time series microarray data. Most of the models available in the literature assume linearity in the inference of gene connections; moreover, these models do not infer directionality in these connections. Thus, a priori biological knowledge is required. However, in pathological cases, no a priori biological information is available. To overcome these problems, we present the nonlinear vector autoregressive (NVAR) model. We have applied the NVAR model to estimate nonlinear gene regulatory networks based entirely on gene expression profiles obtained from DNA microarray experiments. We showed the results obtained by NVAR through several simulations and by the construction of three actual gene regulatory networks (p53, NF-κB, and c-Myc) for HeLa cells.

g. Fast grid layout algorithm for biological networks with sweep calculation

Kaname Kojima, Masao Nagasaki, Satoru Miyano

Properly drawn biological networks are of great help in comprehending their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties, such as aligning vertices that have the same attributes and complicated positional constraints according to subcellular localization; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for applications to pathway databases, due to their high computational cost. In addition, they do not consider edge directions, and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features of biological networks. We devised a new calculation method, termed sweep calculation, and reduced the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conducted practical experiments using 95 pathway models of various sizes from TRANSPATH and showed that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduced a new component that penalizes undesirable edge directions, to avoid the loss of traceability in pathways caused by differences in direction between the in-edges and out-edges of each vertex.


h. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data

Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

Recently, the nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters due to the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation, and the size of the gene set must be strongly limited. To address this limitation, we employed the L1 regularization technique to estimate NVAR. Under L1 regularization, the direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than with existing methods. Through a simulation study, we verified the effectiveness of the proposed method by comparing its limit on the number of genes with that of the existing NVAR. The proposed method was also applied to time series microarray data of the human HeLa cell cycle.
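Why L1 regularization performs selection can be seen in the simplest possible setting. The sketch below assumes an orthonormal design and the objective (1/2)||y - Xb||^2 + penalty * ||b||_1, for which the L1-regularized estimate is the soft-thresholded least squares coefficient; the NVAR estimation itself is more involved, and the gene names and coefficient values are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink `value` toward zero by `penalty`; values inside the band become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def select_parents(ols_coefficients, penalty):
    """Keep only the candidate regulators whose shrunken coefficient is non-zero."""
    shrunk = {gene: soft_threshold(c, penalty) for gene, c in ols_coefficients.items()}
    return {gene: c for gene, c in shrunk.items() if c != 0.0}


# Hypothetical least squares coefficients of three candidate regulators of one target gene.
coefficients = {"geneA": 1.0, "geneB": 0.125, "geneC": -0.75}
selected = select_parents(coefficients, penalty=0.25)
```

Coefficients smaller in magnitude than the penalty are set exactly to zero, so the surviving genes can be read directly as the selected parents of the target gene.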

i. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates

André Fujita, Luciana Rodrigues Gomes⁵, João Ricardo Sato⁶, Rui Yamaguchi, Carlos Eduardo Thomaz⁷, Mari Cleide Sogayar⁵, Satoru Miyano: ⁶Universidade Federal do ABC, ⁷Centro Universitário da FEI

Principal Component Analysis (PCA) combined with Maximum-entropy Linear Discriminant Analysis (MLDA) was applied in order to identify the genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely (i) differences in gene expression levels between normal and tumoral conditions from a univariate point of view, (ii) a multivariate analysis using MLDA, and (iii) a dependence network approach. Our results show that malignant transformation in prostatic tissue is more closely related to functional connectivity changes in dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1, and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions and were also classified as the seven most informative genes for the prostate cancer genesis process by our discriminant analysis. Moreover, among the identified genes, we found classically known biomarkers and genes that are closely related to tumoral prostate, such as KLK3 and KLK2, as well as several other potential ones. We have demonstrated that changes in functional connectivity may be implicit in the biological process that renders some genes more informative for discriminating between normal and tumoral conditions. Using the proposed method, namely MLDA, to analyze the multivariate characteristics of genes, it was possible to capture the changes in dependence networks that are related to cell transformation.

j. Rule-based reasoning for system dynamics in cell systems

Euna Jeong, Masao Nagasaki, Satoru Miyano

A system-dynamics-centered ontology called the Cell System Ontology (CSO) has been developed for representing diverse biological pathways. Many of the pathway data based on the ontology have been created from databases via data conversion or curated by expert biologists. It is essential to validate these pathway data, which may contain unexpected issues such as semantic inconsistency and incompleteness. This paper discusses three criteria for validating pathway data based on CSO: (1) structurally correct models in terms of Petri nets, (2) biologically correct models that capture biological meaning, and (3) systematically correct models that reflect biological behaviors. At the same time, we have investigated how logic-based rules can be used with the ontology to extend its expressiveness and to complement the ontology by reasoning, which aims at qualifying pathway knowledge. Finally, we show how the proposed approach helps in exploring dynamic modeling and simulation tasks without prior knowledge.

k. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human

Yosuke Hatanaka, Masao Nagasaki, Rui Yamaguchi, Takeshi Obayashi, Kazuyuki Numata, André Fujita, Teppei Shimamura, Yoshinori Tamada, Seiya Imoto, Kengo Kinoshita, Kenta Nakai, Satoru Miyano

We reported various transcription factor binding sites (TFBSs) conserved among co-expressed genes in human promoter regions, using expression and genomic data. Assuming that a similar promoter structure induces similar transcriptional regulation, and hence a similar expression profile, we compared promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start sites (TSSs) given by DBTSS, and promoter similarity searches were conducted among the co-expressed gene data provided by the newly developed COXPRESdb. By combining Position Weight Matrix (PWM) motif prediction with the bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket analysis to look for combinatorial activities of those conserved TFBSs.
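The PWM prediction step can be sketched as a log-odds scan over a promoter sequence. The example below assumes a uniform background, a pseudocount of 1, and a made-up four-column motif; the bootstrap significance test and the real matrices used in the study are not reproduced, and all names are hypothetical.

```python
import math


def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into log2 odds scores against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({b: math.log2((column.get(b, 0) + pseudocount) / total / background)
                       for b in "ACGT"})
    return matrix


def scan(sequence, matrix, threshold):
    """Return (position, score) for every window scoring at least `threshold`."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical motif built from eight aligned sites that all read "TATA".
motif_counts = [{"T": 8}, {"A": 8}, {"T": 8}, {"A": 8}]
pwm = log_odds_matrix(motif_counts)
hits = scan("GGTATAGG", pwm, threshold=4.0)
```

Running the same scan over the promoters of co-expressed genes and comparing hit frequencies with those from resampled gene sets is one way such predictions can be assessed for conservation.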

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou8, Yuto Ikegami8, Hiroshi Matsuno8, Satoru Miyano, Shin-ichi T. Inouye8: 8Yamaguchi University

Circadian rhythms of living organisms are 24-hr oscillations found in behavior, biochemistry, and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observation that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time-zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando9, Sadanori Konishi10, Seiya Imoto: 9Keio University, 10Kyushu University

The problem of constructing nonlinear regression models is investigated to analyze data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlap between basis functions and incorporates information from both the input and response variables. Using these radial basis functions, we constructed nonlinear regression models with the help of regularization. Crucial issues in the model building process are the choices of the hyperparameter, the number of basis functions, and a smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification, both for distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained from the use of the hyperparameter in the basis functions.

b. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome

Jin Hwan Do, Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for characterizing fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, homology-based approaches are intrinsically limited for unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profiles of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile: the DNA regions including secondary metabolic signature genes, such as polyketide synthase, nonribosomal peptide synthase, and/or dimethylallyl tryptophan synthase, consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by a transcription factor, LaeA. Our result could be used for identification of other fungal secondary metabolite gene clusters, especially for secondary metabolite gene clusters that are severely regulated by LaeA or other proteins with similar function to LaeA.
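The kind of profile scan described above can be sketched as follows. This is only an illustration under stated assumptions: the per-position curvature values are taken as given (the curvature model itself is not described here), and the function names, window size, and toy data are hypothetical. The threshold 0.18 is the one reported in the text.

```python
def gc_fraction(sequence):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty string)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return sum(1 for base in sequence if base in "GC") / len(sequence)

def window_average(values, window):
    """Moving average over `window` consecutive values, one output per full window."""
    if window <= 0 or window > len(values):
        return []
    total = sum(values[:window])
    averages = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        averages.append(total / window)
    return averages

def runs_below(values, threshold, min_length):
    """Return (start, end) index pairs, end exclusive, of runs where every value
    is below `threshold` and the run spans at least `min_length` positions."""
    runs = []
    start = None
    for i, value in enumerate(values):
        if value < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i))
            start = None
    if start is not None and len(values) - start >= min_length:
        runs.append((start, len(values)))
    return runs

# Toy, made-up curvature profile; real input would be per-position values
# computed along a chromosome, with min_length corresponding to 20 kb.
toy_profile = [0.30, 0.10, 0.10, 0.10, 0.30, 0.10]
print(runs_below(toy_profile, threshold=0.18, min_length=3))
```

Candidate regions found this way would still need to be checked for the presence of signature genes, as in the analysis described above.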

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida1, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step toward finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than we can handle and analyze on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data, to detect transcripts that have significantly different splicing patterns in two cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis of more than 300,000 transcripts in exon array data, which has the potential to reveal aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require installation of any software other than an internet browser. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full dataset of exon array data within hours using high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.

Publications

1. Ando, T., Konishi, S., Imoto, S. Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference. 138 (11): 3616-3633, 2008.

2. Brazma, A., Miyano, S., Akutsu, T. Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008). Imperial College Press, 2008.

3. Do, J.H., Miyano, S. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome. Applied Microbiology and Biotechnology. 80 (5): 841-847, 2008.

4. Fujita, A., Gomes, L.R., Sato, J.R., Yamaguchi, R., Thomaz, C.E., Sogayar, M.C., Miyano, S. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Systems Biology. 2: 106, 2008.

5. Fujita, A., Sato, J.R., Garay-Malpartida, H.M., Sogayar, M.C., Ferreira, C.E., Miyano, S. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinformatics and Computational Biology. 6 (5): 961-979, 2008.

6. Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Fujita, A., Shimamura, T., Tamada, Y., Imoto, S., Kinoshita, K., Nakai, K., Miyano, S. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human. Genome Informatics. 20: 212-221, 2008.

7. Hirose, O., Yoshida, R., Imoto, S., Yamaguchi, R., Higuchi, T., Charnock-Jones, D.S., Print, C., Miyano, S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics. 24 (7): 932-942, 2008.

8. Hirose, O., Yoshida, R., Yamaguchi, R., Imoto, S., Higuchi, T., Miyano, S. Analyzing time course gene expression data with biological and technical replicates to estimate gene networks by state space models. Proc. 2nd Asia International Conference on Modelling & Simulation. 940-946, 2008. (AMS 2008, Refereed conference)

9. Jeong, E., Nagasaki, M., Miyano, S. Rule-based reasoning for system dynamics in cell systems. Genome Informatics. 20: 25-36, 2008.

10. Kitakaze, H., Kanda, M., Nakatsuka, H., Ikeda, N., Matsuno, H., Miyano, S. Prediction of fragile points for robustness checking of cell systems. IEICE TRANSACTIONS on Information and Systems D. J91-D (9): 2404-2417, 2008.

11. Knapp, E.-W., Benson, G., Holzhutter, H.-G., Kanehisa, M., Miyano, S. (Eds.) Genome Informatics 20. 2008.

12. Kojima, K., Fujita, A., Shimamura, T., Imoto, S., Miyano, S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics. 20: 37-51, 2008.

13. Kojima, K., Nagasaki, M., Miyano, S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics. 24 (12): 1426-1432, 2008.

14. Mito, N., Ikegami, Y., Matsuno, H., Miyano, S., Inouye, S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics. 21: 212-223, 2008.

15. Nagasaki, M., Saito, A., Chen, L., Jeong, E., Miyano, S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology. 2: 53, 2008.

16. Niida, A., Smith, A.D., Imoto, S., Tsutsumi, S., Aburatani, H., Zhang, M.Q., Akiyama, T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics. 9: 404, 2008.

17. Numata, K., Yoshida, R., Nagasaki, M., Saito, A., Imoto, S., Miyano, S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics. 9: 494, 2008.

18. Numata, K., Imoto, S., Miyano, S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering. IEEE Computer Society, 357-360, 2008. (BIBM 2008, Refereed conference)

19. Perrier, E., Imoto, S., Miyano, S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research. 9: 2251-2286, 2008.

20. Yamaguchi, R., Imoto, S., Yamauchi, M., Nagasaki, M., Yoshida, R., Shimamura, T., Hatanaka, Y., Ueno, K., Higuchi, T., Gotoh, N., Miyano, S. Predicting differences in gene regulatory systems by state space models. Genome Informatics. 21: 101-113, 2008.

21. Yoshida, R., Nagasaki, M., Yamaguchi, R., Imoto, S., Miyano, S., Higuchi, T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics. 24 (22): 2592-2601, 2008.


Human Genome Center

Laboratory of Molecular Medicine
Laboratory of Genome Technology

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami, and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P=0.0053) and with poorer prognosis of non-small cell lung cancer patients (P=0.0045). It was also shown to be an independent prognostic factor (P=0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data implied that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) using a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with NSCLC (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The ECT2 cancer-testis antigen is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and in all breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells: nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein became highest at G(1)/S phases, whereas its phosphorylation level was enhanced during mitotic phase. Treatment of the breast cancer cells T47D and HBC4 with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrate the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in growth of RCC, and that inhibition of TMEM22/RAB37 expression or of their interaction should be a novel therapeutic strategy for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)

We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex, as well as activation of the Rac1-JNK cascade, in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused destruction of the actin cytoskeleton structure, probably through downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation causes activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been indicated to play some role in apoptosis. These findings implicate C2orf18, termed ANT2-binding protein (ANT2BP), as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage, some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and in aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it should be a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 were reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression levels significantly attenuated viability of thyroid cancer cells, indicating the significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac (α2,6) Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated that there is a higher frequency of serum apoC-III with loss of α2,6-linkage Neu5Ac residues in lung cancer patients compared to controls. Our results demonstrated that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylations, and implied its possible applicability to studies on other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week, and of those who required 49 mg or more per week, to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week, or 49 mg or more per week, for therapeutic anticoagulation.
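The evaluation metric described above, the share of patients whose predicted dose falls within 20% of the actual stable dose, can be written down in a few lines. The sketch below is illustrative only: the function name and the toy doses are hypothetical, and it does not implement either dosing algorithm.

```python
def fraction_within(predicted, actual, tolerance=0.20):
    """Fraction of patients whose predicted weekly dose lies within
    `tolerance` (as a proportion) of the actual stable therapeutic dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)

# Made-up weekly doses in mg, for illustration only.
predicted_doses = [20.0, 35.0, 50.0, 10.0]
actual_doses = [21.0, 28.0, 49.0, 20.0]
print(fraction_within(predicted_doses, actual_doses))
```

Computing this fraction separately for each algorithm, and within the low-dose and high-dose subgroups, gives the kind of comparison reported above.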

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5, and Hitoshi Zembutsu5: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Department of Surgery, Tokushima Breast Care Clinic; 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgical operation for breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to be dependent on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and Docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4, and Yusuke Nakamura1,3: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who had received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2 and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3 and one SNP in NR1I2 showed a possible association with grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P = 0.00022) and rs11045585 in SLCO1B3 (P = 0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia than those with a score of 0 (P = 0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
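The odds ratio, confidence interval and classification rates quoted above can be checked with a short calculation. The sketch below is only an illustration: the 2×2 counts are not given in the text and are inferred here from the reported percentages and the combined sample sizes (39 cases, 74 controls), and the Woolf (log-normal) interval is an assumed choice of method. The function name `odds_ratio_ci` is hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-normal) confidence interval for a 2x2 table.

    a: cases with score 1 or 2      b: cases with score 0
    c: controls with score 1 or 2   d: controls with score 0
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Counts inferred from the reported percentages (an assumption, not stated
# in the text): 39 cases (28 + 11) and 74 controls (56 + 18).
a, b = 27, 12   # cases: score 1 or 2, score 0
c, d = 18, 56   # controls: score 1 or 2, score 0

or_, lo, hi = odds_ratio_ci(a, b, c, d)
sensitivity = a / (a + b)   # severe cases correctly classified
specificity = d / (c + d)   # non-toxic patients correctly classified

print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
```

With these inferred counts the sketch prints OR = 7.00, 95% CI = 2.95-16.59, sensitivity = 69.2% and specificity = 75.7%, which agrees with the figures reported in the abstract.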

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu¹,², Taisei Mushiroda¹, Surakameth Mahasirimongkol⁵, Sasisopin Kiertiburanakul³, Somnuek Sungkanuparph³, Weerawat Manosuthi⁶, Woraphot Tantisiriwat⁷, Angkana Charoenyingwattana⁴, Thanyachai Sura³, Wasun Chantratita² and Yusuke Nakamura¹: ¹Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine, Departments of ²Pathology, ³Medicine, Faculty of Medicine, ⁴Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand, ⁵Center for International Cooperation, Department of Medical Sciences, ⁶Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health, ⁷Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1 and HLA-DPB1 by a sequence-based HLA typing method in the first set of samples, which consisted of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. An HLA-B*3505 allele revealed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR) = 18.96; 95% confidence interval (CI) = 4.87-73.44; Pc = 4.6×10⁻⁶] and 0.7% of the general Thai population (OR = 29.87; 95% CI = 5.04-175.86; Pc = 2.6×10⁻⁵). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI = 6.45-374.41; P = 0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani¹,², Sukanya Wattanapokayakit³, Hidenori Ochi⁴,⁵, Takahisa Kawaguchi⁴, Atsushi Takahashi⁴, Naoya Hosono⁴, Michiaki Kubo⁴, Tatsuhiko Tsunoda⁴, Naoyuki Kamatani⁴, Hiromitsu Kumada⁶, Aekkachai Puseenam⁷, Thanyachai Sura⁷, Yataro Daigo², Kazuaki Chayama⁴,⁵, Wasun Chantratita⁸, Yusuke Nakamura¹,⁴ and Koichi Matsuda¹: ¹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, ²Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, ³Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand, ⁴Center for Genomic Medicine, RIKEN, ⁵Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University, ⁶Department of Hepatology, Toranomon Hospital, ⁷Department of Medicine, Faculty of Medicine, and ⁸Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese and one Thai cohorts consisting of 1,300 cases and 2,100 controls (combined P = 6.34×10⁻³⁹ and 2.31×10⁻³⁸; OR = 0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR = 1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR = 0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda¹, Sukanya Wattanapokayakit², Atsushi Takahashi³, Toshihiro Nukiwa⁴, Shoji Kudoh⁵, Takashi Ogura⁶, Hiroyuki Taniguchi⁷, Michiaki Kubo⁸, Naoyuki Kamatani³, Yusuke Nakamura¹,⁹ and the Pirfenidone Clinical Study Group⁴: ¹Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN), ²Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN), ³Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN), ⁴Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University, ⁵Fourth Department of Internal Medicine, Nippon Medical School, ⁶Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center, ⁷Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi, ⁸Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN), ⁹Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a P value of 2.9×10⁻⁸ (GWA, 2.8×10⁻⁶; replication, 3.6×10⁻³). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva¹, Taisei Mushiroda², Atsushi Takahashi³, Michiaki Kubo⁴, Sena K. Karachanak⁵, Irina T. Zaharieva⁶, Radoslava V. Vazharova⁵, Ivanka I. Dimova⁵, Vihra K. Milanova⁶, Todor Tolev⁷, George Kirov⁸, Michael J. Owen⁸, Michael C. O'Donovan⁸, Naoyuki Kamatani³, Yusuke Nakamura⁹ and Draga I. Toncheva⁵: ¹Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), ²Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), ³Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), ⁵Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria, ⁶Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria, ⁷Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria, ⁸Department of Psychological Medicine, Cardiff University, School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK, ⁹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. A great number of studies have yielded controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes previously implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P = 0.0010, odds ratio = 1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.


Human Genome Center

Laboratory of Functional Genomics

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We therefore decompose the difficulty into parts, i.e., linkage disequilibrium (LD), population structure, multiple genetic models and study design. We first characterize each problem and propose a solution to it individually, and we also attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data of structured population

Because genetic epidemiology studies on complex genetic traits target relatively weak factors, their sample sizes should be in the thousands or more, which makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false-positive results. One of the methods to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can also correct the statistics of Fisher's exact test.

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
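As a rough illustration of the genomic control correction mentioned under item a, the sketch below shows only the standard chi-square version: the inflation factor λ is estimated as the median of the observed 1-df chi-square statistics divided by the theoretical median, and every statistic is divided by λ. It is not the modified method for Fisher's exact test described above. The input statistics are hypothetical, and `genomic_control` is an illustrative name.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for 1-df chi-square statistics."""
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # by convention, statistics are never inflated
    return lam, [x / lam for x in chi2_stats]

# Hypothetical 1-df chi-square statistics from markers assumed to be null.
stats = [0.10, 0.35, 0.60, 0.91, 1.40, 2.20, 5.10]

lam, corrected = genomic_control(stats)
print(f"lambda = {lam:.3f}")
print([round(x, 3) for x in corrected])
```

In this toy input the median statistic is about twice the expected value, so every statistic is roughly halved before p-values are computed.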

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry. By interpreting tables geometrically, we characterize various tests for 2×3 tables and define tests that fit biological models.
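Items b and c both concern how different tests behave on the same 2×3 genotype table. The following minimal sketch compares the asymptotic Pearson chi-square p-value (2 degrees of freedom) with an exact p-value obtained by enumerating every table that shares the observed margins. The exact procedure used here is the Fisher-Freeman-Halton approach, which is an assumption, since the text does not name a specific exact test. The genotype counts are hypothetical, and the code assumes that every genotype column has at least one observation.

```python
import math

def chi2_pvalue_2x3(table):
    """Asymptotic Pearson chi-square p-value for a 2x3 table (2 df)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    x2 = sum(
        (table[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2)
        for j in range(3)
    )
    return math.exp(-x2 / 2)  # chi-square survival function for 2 df

def exact_pvalue_2x3(table):
    """Exact p-value: total probability of all tables with the observed
    margins that are no more likely than the observed table."""
    r1 = sum(table[0])
    cols = [sum(c) for c in zip(*table)]
    denom = math.comb(sum(cols), r1)

    def prob(a, b, c):
        # Multivariate hypergeometric probability of first row (a, b, c).
        return (math.comb(cols[0], a) * math.comb(cols[1], b)
                * math.comb(cols[2], c) / denom)

    p_obs = prob(*table[0])
    total = 0.0
    for a in range(min(r1, cols[0]) + 1):
        for b in range(min(r1 - a, cols[1]) + 1):
            c = r1 - a - b
            if c <= cols[2]:
                p = prob(a, b, c)
                if p <= p_obs * (1 + 1e-9):  # tolerance for float ties
                    total += p
    return total

# Hypothetical genotype counts (AA, Aa, aa) for cases and controls.
observed = [[2, 7, 6],
            [9, 5, 1]]

print("asymptotic p:", chi2_pvalue_2x3(observed))
print("exact p:     ", exact_pvalue_2x3(observed))
```

With small cell counts such as these, the two p-values can differ noticeably, which is the kind of discrepancy item b sets out to characterize.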

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements of reproduction and by selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a Quantitation of linkage disequilibrium ofmultiple markers

Genetic variations within a population giverise to LD and the use of the genetic history ofthe population and LD mapping is a very prom-ising method for identifying genetic back-grounds of various phenotypes LD is a measureof inter-marker dependence Although the inter-marker dependence exist among any set ofmarkers only the pair-wise inter-marker de-pendence is utilized for quantitation of the ge-netic heterogeneity and for genetic epidemiol-ogy studies usually We develop a new method

to quantify the heterogeneity and complexity ofpopulation of DNA sequence with SNPs so thatvarious researches based on genetic heterogene-ity
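For reference, the conventional pair-wise LD measures mentioned above can be sketched as follows. This is a minimal illustration of the standard D, D' and r² definitions for two diallelic markers, not the new multi-marker method under development; the haplotype frequencies are hypothetical.

```python
def pairwise_ld(f11, f12, f21, f22):
    """Return (D, D', r^2) for two diallelic markers.

    f11..f22 are the frequencies of the four haplotypes, where the first
    index is the allele at marker 1 and the second the allele at marker 2.
    """
    p1 = f11 + f12          # frequency of allele 1 at marker 1
    q1 = f11 + f21          # frequency of allele 1 at marker 2
    d = f11 - p1 * q1       # deviation from independence
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p1 * (1 - p1) * q1 * (1 - q1)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical haplotype frequencies (must sum to 1).
print(pairwise_ld(0.4, 0.1, 0.1, 0.4))
```

Both D' and r² summarize only one pair of markers at a time, which is the limitation the paragraph above points to.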

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We treat haplotype data from the standpoint of combinatorics and are investigating the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMSUT on the interpretation of genetic epidemiology data with conventional statistical methods. These groups include Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp/

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.

Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1273-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly across tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 2.53 in "gonad" to 19.53 in "endostyle", and during development they increased from 1.68 at the "egg" stage to 7.55 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To investigate this phenomenon further, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation (DBTGR) was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008 it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi, obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here we focus not only on the presence of TFBSs but also on their orientation and on their positioning relative to the transcription start site and to each other within pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are developing the model further, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe² and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters: they tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
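The two quantities behind "G+C- and CpG-rich" can be illustrated with a short sketch. It assumes the widely used observed/expected CpG ratio, (number of CpG × length) / (number of C × number of G); the report does not state which criteria were actually applied, and the window sequence below is hypothetical.

```python
def cpg_stats(seq):
    """Return (G+C content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")   # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_content, obs_exp


# Hypothetical window around a TSS.
print(cpg_stats("ACGCGT"))
```

Evaluating these two values in sliding windows at increasing distance from a TSS is one way to measure how wide the CpG-rich range around a promoter is.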

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to depicting the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from giving further information on whether regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data from a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that lays out the body plan but also the peripheral components that actually make up the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as the log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it, according to the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on the observed nucleotide frequencies in a public PWM database; the matrix generated with an added pseudocount was then compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
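The pseudocount scheme described above (one value per position, split across bases according to the background composition) can be sketched as follows. This is a minimal illustration, assuming log2 likelihood ratios and a uniform background; the function name and the example counts are hypothetical.

```python
import math


def pwm_with_pseudocount(counts, background, pseudocount=0.8):
    """Build a log2-odds PWM from per-position base counts.

    counts      -- list of dicts, one per position, mapping base -> count
    background  -- dict mapping base -> background probability (sums to 1)
    pseudocount -- total pseudocount added per position, distributed
                   over the bases in proportion to the background
    """
    pwm = []
    for column in counts:
        n = sum(column.values())
        scores = {}
        for base, bg in background.items():
            freq = (column.get(base, 0) + pseudocount * bg) / (n + pseudocount)
            scores[base] = math.log2(freq / bg)
        pwm.append(scores)
    return pwm


# Hypothetical alignment of 10 binding sites, two positions wide.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
counts = [{"A": 10}, {"A": 3, "C": 2, "G": 4, "T": 1}]
for position in pwm_with_pseudocount(counts, background):
    print(position)
```

Without the pseudocount, the bases never observed at the first position would receive a score of minus infinity; with it, every score stays finite.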

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹ and Kenta Nakai

To support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a large number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, genic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for the 1st and 2nd promoters was 283, whereas we observed a difference of 96 for promoters expressed in the same cell type under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
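The ranking of promoters by tag count can be illustrated with a small sketch. The '1st promoter' and '2nd promoter' definitions follow the text; the discrepancy measure, however, is an assumption made here for illustration (the fraction of shared genes whose 1st/2nd assignment differs between two samples), and all gene names, promoter identifiers and counts are hypothetical.

```python
from collections import defaultdict


def rank_promoters(tag_counts):
    """Map each gene to (1st promoter, 2nd promoter) by descending tag count.

    tag_counts -- iterable of (gene, promoter_id, n_tags) tuples.
    The 2nd promoter is None for genes with a single promoter.
    """
    by_gene = defaultdict(list)
    for gene, promoter, n_tags in tag_counts:
        by_gene[gene].append((n_tags, promoter))
    ranked = {}
    for gene, promoters in by_gene.items():
        # Sort by tag count (largest first); break ties by promoter id.
        promoters.sort(key=lambda item: (-item[0], item[1]))
        first = promoters[0][1]
        second = promoters[1][1] if len(promoters) > 1 else None
        ranked[gene] = (first, second)
    return ranked


def discrepancy(ranked_a, ranked_b):
    """Fraction of shared genes whose (1st, 2nd) assignment differs."""
    shared = [gene for gene in ranked_a if gene in ranked_b]
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if ranked_a[gene] != ranked_b[gene])
    return differing / len(shared)


# Hypothetical tag counts in two cell types.
cell_a = [("G1", "p1", 120), ("G1", "p2", 30),
          ("G2", "p1", 5), ("G2", "p2", 50), ("G3", "p1", 10)]
cell_b = [("G1", "p1", 80), ("G1", "p2", 90),
          ("G2", "p1", 2), ("G2", "p2", 40), ("G3", "p1", 7)]
print(discrepancy(rank_promoters(cell_a), rank_promoters(cell_b)))
```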

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for analyses of transcription and replication. It has previously been reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common among 16 eukaryotes, but two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slots for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
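The kind of Fourier analysis referred to above can be sketched as a periodogram evaluated at a chosen period. This is a minimal illustration, not the pipeline used in the study: the choice of AA/TT/TA as the tracked dinucleotides is an assumption, and the test sequence is synthetic.

```python
import cmath
import math


def dinucleotide_signal(seq, dinucleotides=("AA", "TT", "TA")):
    """Binary signal: 1.0 where one of the dinucleotides starts, else 0.0."""
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] in dinucleotides else 0.0
            for i in range(len(seq) - 1)]


def periodicity_power(signal, period):
    """Periodogram power of the mean-centered signal at the given period (bp)."""
    n = len(signal)
    mean = sum(signal) / n
    total = sum((x - mean) * cmath.exp(-2j * math.pi * k / period)
                for k, x in enumerate(signal))
    return abs(total) ** 2 / n


# Synthetic sequence with an AA dinucleotide every 10 bp.
sequence = ("AA" + "CGCGCGCG") * 20
signal = dinucleotide_signal(sequence)
print(periodicity_power(signal, 10), periodicity_power(signal, 7))
```

Scanning `period` over a range (for example 2 to 200 bp) and repeating the scan after masking repeats is one simple way to check whether peaks such as those at 84 bp and 167 bp depend on Alu elements.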

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has mostly been estimated by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effects of evolution. Thus, we tried to detect the minimal set of cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of conserved pathway maps and organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized so as to sustain life from external nutrients alone, as living cells are. The organism-specific pathway maps are then detected as those that can synthesize, from nutrients, the compounds required by the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific rather than conserved, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us clues for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices elucidated different aspects of protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.
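To make the idea of retrieving coexpressed genes concrete, here is a minimal sketch that ranks partner genes by the Pearson correlation of their expression profiles. Pearson correlation is used only as a familiar stand-in: the text above does not name the new coexpression measure, and the gene names and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def coexpressed_partners(expression, gene, top=3):
    """Return up to `top` genes most strongly correlated with `gene`."""
    scores = [(pearson(expression[gene], profile), other)
              for other, profile in expression.items() if other != gene]
    scores.sort(reverse=True)
    return [other for _, other in scores[:top]]


# Hypothetical expression profiles across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
print(coexpressed_partners(expression, "geneA", top=2))
```

Connecting each gene to its top-ranked partners yields the kind of gene network that the clickable maps are meant to navigate.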

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data now makes it possible to observe many different complexes of the same protein in the PDB. A single protein complex is therefore not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time by mapping the binding sites of all complexes in the PDB that contain the same protein. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and these structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities of hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 914.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect on amino acid composition of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, as these hydrophilic residues decreased in occurrence, they also increased in buried fraction. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately, as protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this method. The method predicts the disorder tendency of each residue using support vector machines, taking as input the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita and Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configuration of the PPIase active site against known protein structures and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, a ribonuclease that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as the targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19 777 and 21 036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When GO networks or tissue networks became too big for a static picture on the web, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity in the membrane, such as lipid rafts and transportsome regions. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba H, Yamashita R, Kinoshita K and Nakai K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl Acids Res 36: D793-D799, 2008.

Hatada I, Morita S, Kimura M, Horii T, Yamashita R and Nakai K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J Human Genet 53 (2): 185-191, 2008.

Hatanaka Y, Nagasaki M, Yamaguchi R, Obayashi T, Numata K, Imoto S, Shimamura T, Kinoshita K, Nakai K and Miyano S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi M, Ishida T and Kinoshita K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res 37: D360-364, 2009.

Ikura T, Kinoshita K and Ito N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng Des Sel 21: 83-89, 2008.

Ishida T and Kinoshita K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda M and Kinoshita K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J Mol Graph Mod 27: 706-711, 2009.

Miura K, Toh H, Hirakawa H, Sugii M, Murata M, Nakai K, Tashiro K, Kuhara S, Azuma Y and Shirai M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res 15 (2): 83-91, 2008.

Murakami K, Imanishi T, Gojobori T and Nakai K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida K, Frith M and Nakai K. Pseudocounts for transcription factor binding sites. Nucl Acids Res 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi T, Hayashi S, Shibaoka M, Saeki M, Ohta H and Kinoshita K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res 36: D77-82, 2008.

Obayashi T, Hayashi S, Saeki M, Ohta H and Kinoshita K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-991, 2009.

Okamura K and Nakai K. Retrotransposition as a source of new promoters. Mol Biol Evol 25 (6): 1231-1238, 2008.

Sierro N, Makita Y, de Hoon M and Nakai K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl Acids Res 36: D93-D96, 2008.

Sierro N, Li S, Suzuki Y, Yamashita R and Nakai K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota M, Ishida T and Kinoshita K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci 17: 1596-1602, 2008.

Tsuchihara K, Suzuki Y, Wakaguri H, Irie T, Tanimoto K, Hashimoto S, Matsushima K, Mizushima-Sugano J, Yamashita R, Nakai K, Bentley D, Esumi H and Sugano S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl Acids Res, in press.

Tsuchiya Y, Nakamura H and Kinoshita K. Discrimination between biological interfaces and crystal-packing contacts. Compt Biol Chem 1: 99-113, 2008.

Vandenbon A, Miyamoto Y, Takimoto N, Kusakabe T and Nakai K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res 15 (1): 3-11, 2008.

Vandenbon A and Nakai K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue specific expression prediction. Genome Informatics, edited by Arthur J and Ng S-K (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri H, Yamashita R, Suzuki Y, Sugano S and Nakai K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl Acids Res 36: D97-D101, 2008.

Yamashita R, Suzuki Y, Takeuchi N, Wakaguri H, Ueda T, Sugano S and Nakai K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl Acids Res 36 (11): 3707-3715, 2008.

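Kinoshita K, Kono H and Yura K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki J (Wiley and Sons, USA), in printing, 2009.

Ikura T, Kinoshita K and Ito N. Structure-function relationship of peptidyl-prolyl isomerases (in Japanese). Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid and Enzyme) 54: 167-172, 2009.

Kinoshita K. Protein function prediction from three-dimensional structures: current status and prospects (in Japanese). Idenshi Igaku MOOK No. 14, in press.

Nakai K and Horton P. Chapter 3, Section 3: Prediction of protein subcellular localization based on amino acid sequences (in Japanese). Jikken Igaku (Experimental Medicine) extra issue, vol. 26: 1106-1112, 2008.

Nakai K. Systems biology of proteins (in Japanese). In: Encyclopedia of Proteins, edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura and Hamakubo (Asakura Shoten), pp. 575-578, 2008.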

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted comparative political studies on stem cell research and on homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We have interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulatory approach between South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and unethical human egg collection, the bioethics law in South Korea has been revised, and the government seeks stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing, in terms of regulations, among East Asian countries.

2. Establishment of the Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we examined our current problems, which tend to stall the smooth flow of the research review process, so as to assure the quality of ethical discussions. Since February 3, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, LL.M.

3. Ethical, legal and social support for "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that younger participants thought that their personal analyzed data should be disclosed. The consent process had been worked out well in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine," we should think further about constructing a "personalized consent process," and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and for preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate the MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly. Furthermore, MCs explain that this project has no plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To acknowledge participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current form of the BioBank newsletter is accessible only to people with good eyesight, we are making efforts to provide information in personalized forms that accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café," twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not regularly engage with science but love art. The 1st session, called "Rhythm generated by network," was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art," was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in the early summer of 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


ponential time. For the algorithm, we developed a detailed score matrix for comparing local structures, based on naïve Bayes learning.

13. Protein function prediction based on 3-D structure motifs

Chia-Han Chu, Hiroki Sakai and Tetsuo Shibuya

Protein functions are said to be determined by their 3-D structures, but not all functions are known to be related to particular 3-D structure motifs. The geometric suffix tree, a data structure for indexing 3-D protein structures that we also developed, enables comprehensive enumeration of all possible structural motifs among a given set of proteins. We are developing a new algorithm based on the support vector machine that predicts a protein's function from its 3-D structure. This algorithm utilizes all the possible 3-D motifs found by using the geometric suffix tree.

14. Suffix array construction with a lazy scheme

Ben Hachimori and Tetsuo Shibuya

The suffix array is one of the most important indexing data structures for strings over an alphabet, including DNA sequences, RNA sequences, protein sequences, web pages, the Medline database, and so on. But even the most sophisticated algorithm for constructing the suffix array requires a lot of time. We developed a new, efficient lazy algorithm that computes the suffix array only after we receive the query. By doing so, we have to compute only the necessary part of the suffix array. Our lazy algorithm is based on the Schürmann-Stoye algorithm, and it is more efficient than both the Boyer-Moore algorithm and other suffix tree-based algorithms when the number of queries is limited.
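
The idea of deferring suffix sorting until a query arrives can be illustrated with a minimal Python sketch. It is not the Schürmann-Stoye-based algorithm described above. As a simplifying assumption, suffixes are grouped by their first character, and a group is sorted only when a query first needs it. The class and method names are hypothetical.

```python
from bisect import bisect_left, bisect_right


class LazySuffixIndex:
    """Toy lazy suffix index: a bucket of suffixes is sorted on first use only."""

    def __init__(self, text):
        self.text = text
        self.buckets = {}            # first character -> suffix start positions
        self.sorted_buckets = set()  # buckets that have already been sorted
        for start, char in enumerate(text):
            self.buckets.setdefault(char, []).append(start)

    def find(self, pattern):
        """Return all start positions of a non-empty pattern, in increasing order."""
        if not pattern:
            raise ValueError("pattern must be non-empty")
        first = pattern[0]
        bucket = self.buckets.get(first, [])
        if first not in self.sorted_buckets:
            # Lazy step: sort only the part of the suffix array this query touches.
            bucket.sort(key=lambda start: self.text[start:])
            self.sorted_buckets.add(first)
        width = len(pattern)
        # Prefixes of lexicographically sorted suffixes are themselves sorted,
        # so a binary search over them locates the matching range.
        prefixes = [self.text[start:start + width] for start in bucket]
        low = bisect_left(prefixes, pattern)
        high = bisect_right(prefixes, pattern)
        return sorted(bucket[low:high])


index = LazySuffixIndex("ACGTACGA")
hits = index.find("ACG")
```

After this single query only the bucket for suffixes starting with "A" has been sorted; the other buckets remain untouched until a query requires them, which is where the saving comes from when queries are few.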

15. Color space-DNA sequence mapping/alignment algorithm

Ben Hachimori and Tetsuo Shibuya

Applied Biosystems' SOLiD system encodes a DNA sequence into a sequence of a data type called the color space, in which one of 4 fluorescent colors is assigned to each of the 16 possible ordered pairs of adjacent bases. However, no algorithm has been known that aligns/maps a color-space sequence to a DNA sequence while taking into account the difference between experimental errors and actual mutations. We developed an alignment algorithm that distinguishes experimental errors from actual DNA mutations in order to align color-space data against ordinary DNA sequences. Moreover, we computed the optimal score table for the alignment based on actual E. coli data.
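
Why errors and mutations look different in color space can be seen in a short Python sketch. It assumes the commonly described two-base encoding in which, with A=0, C=1, G=2, T=3, the color of a pair of adjacent bases is the XOR of the two codes. The function names are hypothetical, and the sketch shows only the encoding, not the alignment algorithm or its score table.

```python
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"


def encode_colors(dna):
    """Translate a DNA string into colors, one per pair of adjacent bases."""
    return [BASE_TO_CODE[a] ^ BASE_TO_CODE[b] for a, b in zip(dna, dna[1:])]


def decode_colors(first_base, colors):
    """Rebuild the DNA string from its first base and its color sequence."""
    bases = [first_base]
    for color in colors:
        bases.append(CODE_TO_BASE[BASE_TO_CODE[bases[-1]] ^ color])
    return "".join(bases)


reference = "ATGGC"
colors = encode_colors(reference)

# A true single-base mutation (G -> C at position 2) changes two adjacent colors.
mutant_colors = encode_colors("ATCGC")
changed = [i for i, (a, b) in enumerate(zip(colors, mutant_colors)) if a != b]

# A single miscalled color instead corrupts every base decoded after it.
error_colors = list(colors)
error_colors[1] ^= 1
misdecoded = decode_colors("A", error_colors)
```

An isolated color mismatch is therefore more plausibly a measurement error, while a pair of adjacent mismatches that is consistent with one base change is more plausibly a mutation.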

16. Genotype clustering based on hidden Markov models

Ritsuko Onuki, Tetsuo Shibuya and Minoru Kanehisa

Haplotype clustering is important for gene mapping of human disease. Despite its importance for the analysis, it is difficult to obtain haplotype data from present experiments because of their cost and error rate. Genotypes, in contrast, are much easier to obtain. In this work, we propose a new method for clustering genotypes. In the algorithm, we first infer multiple haplotype candidates from each genotype, and next we calculate the distances between genotypes based on the results of the haplotype inference. Then we perform genotype clustering based on the distances. We evaluated our algorithm by applying it to several actual genotype data sets.
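
The first two steps (haplotype candidates, then genotype distances) can be sketched in Python under strong simplifying assumptions. Here candidates are enumerated exhaustively instead of being inferred with a hidden Markov model, genotypes are coded as 0/1/2 copies of one allele per site, and the distance is the smallest Hamming distance over candidate pairings. All names and the distance definition are hypothetical illustrations, not the method evaluated above.

```python
from itertools import product


def haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs compatible with a 0/1/2 genotype."""
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    base = [1 if g == 2 else 0 for g in genotype]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        first, second = list(base), list(base)
        for site, bit in zip(het_sites, bits):
            first[site], second[site] = bit, 1 - bit
        pairs.add(tuple(sorted((tuple(first), tuple(second)))))
    return sorted(pairs)


def hamming(a, b):
    """Number of sites at which two haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def genotype_distance(g1, g2):
    """Smallest total haplotype mismatch over all candidate pairings."""
    best = None
    for a1, a2 in haplotype_pairs(g1):
        for b1, b2 in haplotype_pairs(g2):
            d = min(hamming(a1, b1) + hamming(a2, b2),
                    hamming(a1, b2) + hamming(a2, b1))
            best = d if best is None else min(best, d)
    return best


candidates = haplotype_pairs((1, 1))
distance = genotype_distance((1, 1), (0, 2))
```

A genotype with k heterozygous sites has 2^(k-1) candidate pairs for k >= 1, so exhaustive enumeration is only workable for short genotypes; this is one reason a probabilistic model is attractive for the inference step.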

Publications

Kanehisa, M., Araki, M., Goto, S., Hattori, M., Hirakawa, M., Itoh, M., Katayama, T., Kawashima, S., Okuda, S., Tokimatsu, T., and Yamanishi, Y. KEGG for linking genomes to life and the environment. Nucleic Acids Research 36: D480-D484, 2008.

Kawashima, S., Pokarowski, P., Pokarowska, M., Kolinski, A., Katayama, T., Kanehisa, M. AAindex: amino acid index database, progress report 2008. Nucleic Acids Research 36: D202-D205, 2008.

Okuda, S., Yamada, T., Hamajima, M., Itoh, M., Katayama, T., Bork, P., Goto, S., and Kanehisa, M. KEGG Atlas mapping for global analysis of metabolic pathways. Nucleic Acids Research 36: W423-426, 2008.

Wakaguri, H., Suzuki, Y., Katayama, T., Kawashima, S., Kibukawa, E., Hiranuka, K., Sasaki, M., Sugano, S., and Watanabe, J. Full-Malaria/Parasites and Full-Arthropods: database of full-length cDNAs of parasites and arthropods, update 2009. Nucleic Acids Research 37: D520-D525, 2008.

Yamanishi, Y., Araki, M., Gutteridge, A., Honda, W., and Kanehisa, M. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24: i232-i240, 2008.

Takarabe, M., Okuda, S., Itoh, M., Tokimatsu, T., Goto, S., and Kanehisa, M. Network analysis of adverse drug interactions. Genome Informatics 20: 252-259, 2008.

Hashimoto, K., Yoshizawa, A.C., Okuda, S., Kuma, K., Goto, S., and Kanehisa, M. The repertoire of desaturases and elongases reveals fatty acid variations in 56 eukaryotic genomes. J. Lipid Res. 49: 183-191 (2008).

Shibuya, T. Fast Hinge Detection Algorithms for Flexible Protein Structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics, to appear.

Shibuya, T. Searching Protein 3-D Structures in Linear Time. Proc. 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2009), 2009, to appear.

Shibuya, T. Linear-Time Algorithm for Searching Protein 3-D Structures. IPSJ SIG Notes SIGAL 123-4, 2009, to appear.

Suematsu, K., Shibuya, T. Flexible Protein Alignment of 3D-Structures Allowing Dynamic Transformation. IPSJ SIG Notes SIGBIO 12-12, 2008, pp. 87-94.

本多渉, 田辺麻央, 矢野亜津子, 金久實. バイオインフォマティクス システムバイオロジーとKEGG. 生化学 80: 1094-1111, 2008.

Recent advances in biomedical research have been producing large-scale, ultra-high dimensional, ultra-heterogeneous data. Owing to this post-genomic research progress, our current mission is to create computational strategies for systems biology and medicine towards translational bioinformatics. With this mission, we have been developing computational methods for understanding life as a system and applying them to practical issues in medicine and biology.

1. Computational Systems Biology

a. Systematic reconstruction of TRANSPATH data into Cell System Markup Language

Masao Nagasaki, Ayumu Saito, Chen Li, Euna Jeong, Satoru Miyano

Many biological repositories store information based on experimental studies of the biological processes within a cell, such as protein-protein interactions, metabolic pathways, signal transduction pathways, or regulation by transcription factors and miRNA. Unfortunately, it is difficult to use such information directly when generating simulation-based models. Thus, modeling rules for encoding biological knowledge into system-dynamics-oriented standardized formats would be very useful for a full understanding of cellular dynamics at the system level. We selected the TRANSPATH database, a manually curated high-quality pathway database, which provides a rich source of cellular events in humans, mice, and rats, curated from over 31,500 papers. In this work, we defined 16 modeling rules based on the hybrid functional Petri net with extension (HFPNe), which is suitable for graphical representation and simulation of biological processes. In these modeling rules, each Petri net element is associated with the Cell System Ontology (CSO) to enable semantic interoperability of models. As a formal ontology for biological pathway modeling with dynamics, CSO also defines biological terminology and corresponding icons. By combining HFPNe with the CSO features, we devised a method for transforming TRANSPATH data into simulation-based, semantically valid models. The results are encoded in a biological pathway format, Cell System Markup Language (CSML), which eases the exchange and integration of biological data and models. By using the 16 modeling rules, 97% of the reactions in TRANSPATH are converted into simulation-based models represented in CSML. This reconstruction demonstrated that it is possible to use our rules to generate quantitative models from static pathway descriptions.
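
How a static reaction description becomes something that can be simulated is easiest to see with the plain, discrete Petri net firing rule. The Python sketch below is a deliberately reduced illustration: it does not implement HFPNe (which also has continuous and generic elements), the CSO annotations, or CSML, and the place and transition names are hypothetical.

```python
def is_enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= weight
               for place, weight in transition["inputs"].items())


def fire(marking, transition):
    """Return the marking reached by firing an enabled transition once."""
    if not is_enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, weight in transition["inputs"].items():
        new_marking[place] = new_marking.get(place, 0) - weight
    for place, weight in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + weight
    return new_marking


# A binding reaction written as a transition: ligand + receptor -> complex.
binding = {"inputs": {"ligand": 1, "receptor": 1}, "outputs": {"complex": 1}}
state = {"ligand": 2, "receptor": 1, "complex": 0}
after = fire(state, binding)
can_fire_again = is_enabled(after, binding)
```

Once a reaction is expressed as places, a transition, and arc weights, repeated firing gives a trajectory of markings; a modeling rule in this spirit is a recipe for producing such elements from a database entry.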

b. Finding optimal Bayesian network given a super-structure

Human Genome Center

Laboratory of DNA Information Analysis
DNA情報解析分野

Professor Satoru Miyano (宮野悟), Ph.D. (Science)
Associate Professor Seiya Imoto (井元清哉), Ph.D. (Mathematical Sciences)
Assistant Professor Masao Nagasaki (長﨑正朗), Ph.D. (Science)
Project Lecturer Rui Yamaguchi (山口類), Ph.D. (Science)
Project Assistant Professor Yoshinori Tamada (玉田嘉紀), Ph.D. (Informatics)

Eric Perrier, Seiya Imoto, Satoru Miyano

Conventional approaches for learning Bayesian network structure from data have disadvantages in terms of complexity and the lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm improves accuracy and speed considerably: it learns a skeleton with an independency test (IT) approach and constrains the directed acyclic graphs considered during the search-and-score phase. Subsequently, we defined the structural constraint by introducing the concept of a super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We developed a super-structure constrained optimal search (COS): its time complexity is upper bounded by $O(\gamma_m^n)$, where $\gamma_m < 2$ depends on the maximal degree $m$ of S. Empirically, complexity depends on the average degree $m'$, and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. In practice, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, making it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches.
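
The effect of the constraint on the search space can be illustrated by counting candidate parent sets. In the Python sketch below, a node may only take parents among its neighbours in S, so the number of parent sets to score drops from 2^(n-1) to 2^(degree). This shows only the restriction itself, not the COS search or any scoring function; the example graph and function name are hypothetical.

```python
from itertools import combinations


def candidate_parent_sets(node, super_structure):
    """Yield every parent set for `node` allowed by the super-structure."""
    neighbours = sorted(super_structure[node])
    for size in range(len(neighbours) + 1):
        for parents in combinations(neighbours, size):
            yield frozenset(parents)


# Hypothetical sparse super-structure: the undirected chain A - B - C - D - E.
S = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

constrained = {node: len(list(candidate_parent_sets(node, S))) for node in S}
unconstrained = 2 ** (len(S) - 1)  # any subset of the other nodes
```

For node "B" only 4 parent sets remain instead of 16, and the gap widens rapidly as the number of nodes grows while the degree stays small.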

c. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models

Osamu Hirose, Ryo Yoshida¹, Seiya Imoto, Rui Yamaguchi, Tomoyuki Higuchi¹, D. Stephen Charnock-Jones², Cristin Print³, Satoru Miyano: ¹Institute of Statistical Mathematics, ²Cambridge University, ³University of Auckland

We developed a novel method based on the state space model to identify transcriptional modules and module-based gene networks simultaneously. The state space model has the potential to infer large-scale gene networks, e.g., of order 10^3, from time-course gene expression profiles. In particular, we succeeded in identifying a cell cycle system by using the gene expression profiles of Saccharomyces cerevisiae, in which the length of the time course and the number of genes were 24 and 4382, respectively. However, when analyzing shorter time-course data, e.g., of length 10 or less, the parameter estimation of the state space model often fails due to overfitting. To extend the applicability of the state space model, we provided an approach that uses the technical replicates of gene expression profiles, which are often measured in duplicate or triplicate. The use of technical replicates is important for achieving highly efficient inference of gene networks with short time-course data. The potential of the proposed method was demonstrated through the time-course analysis of the gene expression profiles of human umbilical vein endothelial cells undergoing growth factor deprivation-induced apoptosis.
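
For orientation, the generic linear Gaussian state space model has the form sketched below. The symbols are assumptions chosen for illustration and are not taken from the work summarized above: $y_t \in \mathbb{R}^p$ is the observed expression vector of $p$ genes at time $t$, and $x_t \in \mathbb{R}^k$ with $k \ll p$ is a hidden state that can be read as module activity.

```latex
\begin{aligned}
x_t &= F x_{t-1} + v_t, & v_t &\sim N(0, Q), \\
y_t &= H x_t + w_t,     & w_t &\sim N(0, R).
\end{aligned}
```

In this reading, the observation matrix $H$ links genes to modules and the system matrix $F$ describes how module activities drive one another over time. With a short time course, the number of observations is small relative to the entries of $F$, $H$, $Q$, and $R$, which is consistent with the overfitting noted above and with the benefit of adding replicate measurements.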

d. Predicting differences in gene regulatory systems by state space models

Rui Yamaguchi, Seiya Imoto, Mai Yamauchi, Masao Nagasaki, Ryo Yoshida¹, Teppei Shimamura, Yosuke Hatanaka, Kazuko Ueno, Tomoyuki Higuchi¹, Noriko Gotoh, Satoru Miyano

We developed a statistical method to predict differentially regulated genes in case and control samples from time-course gene expression data by exploiting the unpredictability of expression patterns under the regulatory system inferred by a state space model. The proposed method can screen out genes that show different patterns but are generated by the same regulation in both samples, since these patterns can be predicted by the same model. Our strategy consists of three steps. First, a gene regulatory system is inferred from the control data by a state space model. Then, the obtained model for the underlying regulatory system of the control sample is used to predict the case data. Finally, by assessing the significance of the difference between the case and predicted-case time-course data of each gene, we are able to detect the unpredictable genes that are candidates for the key differences between the regulatory systems of case and control cells. We illustrate the whole process of the strategy with an actual example, in which human small airway epithelial cell gene regulatory systems were generated from novel time courses of gene expression following treatment with (case) or without (control) the drug gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase. In the gefitinib response data, we succeeded in finding unpredictable genes that are candidates for specific targets of gefitinib. We also discussed differences in the regulatory systems for the unpredictable genes. The proposed method would be a promising tool for identifying biomarkers and drug target genes.
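
The three steps can be mimicked with a toy Python sketch. As a loud simplification, each gene gets its own first-order autoregressive model x[t] = a * x[t-1] fitted by least squares, in place of a state space model, and the "significance" step is replaced by a plain mean squared prediction error. Gene names, data, and function names are hypothetical.

```python
def fit_ar1(series):
    """Step 1: least-squares coefficient a in x[t] ~ a * x[t-1] (control data)."""
    numerator = sum(prev * nxt for prev, nxt in zip(series, series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator


def prediction_error(coefficient, series):
    """Steps 2-3: mean squared one-step error of the control model on case data."""
    errors = [(nxt - coefficient * prev) ** 2
              for prev, nxt in zip(series, series[1:])]
    return sum(errors) / len(errors)


def rank_unpredictable(control, case):
    """Rank genes by how badly the control model predicts the case series."""
    scores = {gene: prediction_error(fit_ar1(control[gene]), case[gene])
              for gene in control}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


control = {"g1": [1.0, 0.5, 0.25, 0.125], "g2": [1.0, 0.5, 0.25, 0.125]}
case = {"g1": [2.0, 1.0, 0.5, 0.25],      # different values, same dynamics
        "g2": [1.0, 1.0, 1.0, 1.0]}       # dynamics changed
ranking = rank_unpredictable(control, case)
```

Gene g1 has different expression values in the case sample but is perfectly predicted by the control model, so it is screened out; g2 is flagged because its behaviour cannot be explained by the control dynamics.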

e. Bayesian learning of biological pathways on genomic data assimilation

Ryo Yoshida¹, Masao Nagasaki, Rui Yamaguchi, Seiya Imoto, Satoru Miyano, Tomoyuki Higuchi¹

Mathematical modeling and simulation based on biochemical rate equations provide us with a rigorous tool for unraveling the complex mechanisms of biological pathways. To proceed to simulation experiments, an essential first step is to find effective values of the model parameters, which are difficult to measure in in vivo and in vitro experiments. Furthermore, once a set of hypothetical models has been created, a statistical criterion is needed to test the ability of the constructed models and to proceed to model revision. We developed a new statistical technology for the data-driven construction of in silico biological pathways. The method starts with knowledge-based modeling with the hybrid functional Petri net. It then proceeds to the Bayesian learning of the model parameters for which experimental data are available. This process exploits quantitative measurements of evolving biochemical reactions, e.g., gene expression data. Another important issue that we consider is the statistical evaluation and comparison of the constructed hypothetical pathways. For this purpose, we have developed a new Bayesian information-theoretic measure that assesses the predictability and the biological robustness of in silico pathways.

f. Modeling nonlinear gene regulatory networks from time series gene expression data

André Fujita, João Ricardo Sato⁵, Humberto Miguel Garay-Malpartida⁵, Mari Cleide Sogayar⁵, Carlos Eduardo Ferreira⁵, Satoru Miyano: ⁵University of São Paulo

In cells, molecular networks such as gene regulatory networks are the basis of biological complexity. Therefore, gene regulatory networks have become the core of research in systems biology. Understanding the processes underlying the many extracellular regulators, signal transduction, protein-protein interactions, and differential gene expression requires a detailed molecular description of the protein and gene networks involved. To better understand these complex molecular networks and to infer new regulatory associations, we developed a statistical method based on vector autoregressive models and Granger causality to estimate nonlinear gene regulatory networks from time series microarray data. Most of the models available in the literature assume linearity in the inference of gene connections; moreover, these models do not infer directionality in these connections. Thus, a priori biological knowledge is required. However, in pathological cases, no a priori biological information is available. To overcome these problems, we present the nonlinear vector autoregressive (NVAR) model. We have applied the NVAR model to estimate nonlinear gene regulatory networks based entirely on gene expression profiles obtained from DNA microarray experiments. We showed the results obtained by NVAR through several simulations and by the construction of three actual gene regulatory networks (p53, NF-κB, and c-Myc) for HeLa cells.

g. Fast grid layout algorithm for biological networks with sweep calculation

Kaname Kojima, Masao Nagasaki, Satoru Miyano

Properly drawn biological networks are of great help in comprehending their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to draw biological networks manually for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties, such as aligning vertices that have the same attributes and complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for applications to pathway databases, because of their high computational cost. In addition, they do not consider edge directions, and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features in biological networks. We devised a new calculation method, termed sweep calculation, and reduced the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conducted practical experiments using 95 pathway models of various sizes from TRANSPATH and showed that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduced a new component that penalizes undesirable edge directions, to avoid the lack of traceability in pathways caused by differences in direction between the in-edges and out-edges of each vertex.

h. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data

Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

Recently, the nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters due to the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation, and we need to limit the size of the gene set strongly. To address this limitation, we employed the L1 regularization technique to estimate NVAR. Under L1 regularization, the direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than with existing methods. Through a simulation study, we verified the effectiveness of the proposed method by comparing its limit on the number of genes with that of the existing NVAR. The proposed method was also applied to time series microarray data of the human HeLa cell cycle.
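
Why L1 regularization selects parents can be seen on a tiny linear example. The Python sketch below minimizes (1/2)||y - Xb||^2 + lam * ||b||_1 by coordinate descent with soft-thresholding. It is a generic lasso illustration on plain linear regression, assumed here for clarity; it does not include the NVAR basis expansion, and the data and function names are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink a value towards zero by lam and clip it to zero inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=50):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1, one coefficient at a time."""
    n_samples, n_features = len(X), len(X[0])
    beta = [0.0] * n_features
    for _ in range(n_iter):
        for j in range(n_features):
            rho = 0.0   # correlation of column j with the partial residual
            norm = 0.0  # squared norm of column j (assumed non-zero)
            for i in range(n_samples):
                others = sum(X[i][k] * beta[k]
                             for k in range(n_features) if k != j)
                rho += X[i][j] * (y[i] - others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho, lam) / norm
    return beta


# The target depends on the first regulator only (y = 2 * x1).
X = [[1.0, 1.0], [2.0, -1.0], [3.0, 1.0], [4.0, -1.0]]
y = [2.0, 4.0, 6.0, 8.0]
beta = lasso_coordinate_descent(X, y, lam=5.0)
```

The coefficient of the irrelevant second regulator is set exactly to zero, while the relevant one is kept (slightly shrunk from 2). Reading non-zero coefficients as selected parents is what makes estimation feasible when parameters outnumber samples.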

i. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates

André Fujita, Luciana Rodrigues Gomes⁵, João Ricardo Sato⁶, Rui Yamaguchi, Carlos Eduardo Thomaz⁷, Mari Cleide Sogayar⁵, Satoru Miyano: ⁶Universidade Federal do ABC, ⁷Centro Universitário da FEI

Principal Component Analysis (PCA) combined with Maximum-entropy Linear Discriminant Analysis (MLDA) was applied in order to identify the genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely: (i) differences in gene expression levels between normal and tumoral conditions from a univariate point of view; (ii) in a multivariate fashion using MLDA; and (iii) with a dependence network approach. Our results show that malignant transformation in prostatic tissue is more closely related to functional connectivity changes in the dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1, and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions and were also classified as the top seven most informative genes for the prostate cancer genesis process by our discriminant analysis. Moreover, among the identified genes, we found classically known biomarkers and genes closely related to tumoral prostate, such as KLK3 and KLK2, and several other potential ones. We have demonstrated that changes in functional connectivity may be implicit in the biological process that renders some genes more informative for discriminating between normal and tumoral conditions. Using the proposed method, namely MLDA, to analyze the multivariate characteristics of genes, it was possible to capture the changes in dependence networks that are related to cell transformation.

j. Rule-based reasoning for system dynamics in cell systems

Euna Jeong, Masao Nagasaki, Satoru Miyano

A system-dynamics-centered ontology called the Cell System Ontology (CSO) has been developed for the representation of diverse biological pathways. Much of the pathway data based on the ontology has been created from databases via data conversion or curated by expert biologists. It is essential to validate the pathway data, which may otherwise cause unexpected issues such as semantic inconsistency and incompleteness. This paper discusses three criteria for validating pathway data based on CSO: (1) structurally correct models in terms of Petri nets, (2) biologically correct models that capture biological meaning, and (3) systematically correct models that reflect biological behaviors. At the same time, we have investigated how logic-based rules can be used with the ontology to extend its expressiveness and to complement the ontology through reasoning, which aims at qualifying pathway knowledge. Finally, we show how the proposed approach helps in exploring dynamic modeling and simulation tasks without prior knowledge.

k. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human

Yosuke Hatanaka, Masao Nagasaki, Rui Yamaguchi, Takeshi Obayashi, Kazuyuki Numata, André Fujita, Teppei Shimamura, Yoshinori Tamada, Seiya Imoto, Kengo Kinoshita, Kenta Nakai, Satoru Miyano

We reported various transcription factor binding sites (TFBSs) conserved among co-expressed genes in human promoter regions, using expression and genomic data. Assuming that similar promoter structure induces similar transcriptional regulation and hence a similar expression profile, we compared promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start sites (TSSs) given by DBTSS, and promoter similarity searches were conducted among the coexpressing gene data provided by the newly developed COXPRESdb. With a combination of Position Weight Matrix (PWM) motif prediction and the bootstrap method, 7,313 genes were found to have at least one statistically significant conserved TFBS. We also applied basket analysis to seek combinatorial activities of those conserved TFBSs.
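
PWM-based motif prediction, one ingredient of the procedure above, can be sketched in a few lines of Python. The sketch builds a log-odds matrix from position-wise base counts and slides it along a promoter sequence. The pseudocount, the uniform background of 0.25, the toy counts, and all function names are assumptions for illustration; the bootstrap significance test and the databases are not reproduced.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds against the background."""
    matrix = []
    for column in counts:
        total = sum(column.get(base, 0) for base in BASES) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total)
                            / background)
            for base in BASES
        })
    return matrix


def scan(sequence, pwm):
    """Score every window of the sequence; returns (start, score) pairs."""
    width = len(pwm)
    return [(start, sum(pwm[offset][sequence[start + offset]]
                        for offset in range(width)))
            for start in range(len(sequence) - width + 1)]


def best_hit(sequence, pwm):
    """Return the (start, score) pair of the highest-scoring window."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


# Toy motif built from 8 aligned sites that all read "TATA".
counts = [{"T": 8}, {"A": 8}, {"T": 8}, {"A": 8}]
pwm = log_odds_matrix(counts)
start, score = best_hit("GGTATACC", pwm)
```

In practice a score threshold decides which windows count as predicted binding sites, and a resampling step of the kind mentioned above would then assess whether sharing such sites among co-expressed genes is more frequent than expected by chance.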

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou⁸, Yuto Ikegami⁸, Hiroshi Matsuno⁸, Satoru Miyano, Shin-ichi T. Inouye⁸: ⁸Yamaguchi University

Circadian rhythms of living organisms are 24-hr oscillations found in behavior, biochemistry, and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observation that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando⁹, Sadanori Konishi¹⁰, Seiya Imoto: ⁹Keio University, ¹⁰Kyushu University

The problem of constructing nonlinear regression models is investigated in order to analyze data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlap between basis functions and adopts the information of the input and response variables. By using the radial basis functions, we constructed nonlinear regression models with the help of the regularization technique. Crucial issues in the model building process are the choices of the hyperparameter, the number of basis functions, and the smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification, both for distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained by the use of the hyperparameter in the basis functions.

b The GC and window-averaged DNA curva-ture profile of secondary metabolite genecluster in Aspergillus fumigatus genome

Jin Hwan Do Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for the characterization of fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, homology-based approaches are intrinsically limited for unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profiles of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile; that is, the DNA regions including secondary metabolic signature genes such as polyketide synthase, nonribosomal peptide synthase, and/or dimethylallyl tryptophan synthase consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by a transcription factor, LaeA. Our result could be used for identification of other fungal secondary metabolite gene clusters, especially secondary metabolite gene clusters that are severely regulated by LaeA or other proteins with a function similar to LaeA.
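The scan described above (window averaging followed by a search for long stretches below 0.18) can be sketched as follows. This is a minimal illustration under stated assumptions: the profile is assumed to be sampled once per kb, the toy values are invented, and the calculation of DNA curvature itself is not covered because the text does not describe it. All function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a DNA string."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)

def window_average(values, window):
    """Mean of each run of `window` consecutive values, sliding by one position."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def low_regions(profile, threshold, min_length):
    """Half-open index ranges where profile < threshold for at least min_length points."""
    regions, start = [], None
    for i, value in enumerate(profile):
        if value < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the profile.
    if start is not None and len(profile) - start >= min_length:
        regions.append((start, len(profile)))
    return regions

# Hypothetical window-averaged curvature profile, one value per kb (made-up numbers).
profile = [0.30] * 5 + [0.10] * 25 + [0.30] * 5
candidates = low_regions(profile, threshold=0.18, min_length=20)
```

With one value per kb, `min_length=20` corresponds to the 20 kb minimum stated in the text; a different sampling step would require rescaling that argument.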

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida¹, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step toward finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than we can handle and analyze on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data, to detect transcripts that have significantly different splicing patterns in two cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis of more than 300,000 transcripts in exon array data, which has the potential to reveal aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require installation of any software other than an internet browser. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full exon array dataset within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.
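As a rough illustration of step (2), a two-way ANOVA on one transcript can treat exon and condition as the two factors, so that a large exon-by-condition interaction suggests a condition-specific splicing pattern. The sketch below is an assumption about how such a test could look for a balanced design; it is not ExonMiner's implementation, and the function name and toy intensities are hypothetical.

```python
import numpy as np

def interaction_f(data):
    """F statistic for the exon x condition interaction in a balanced two-way ANOVA.

    data has shape (n_exons, n_conditions, n_replicates), e.g. log intensities.
    """
    n_exons, n_conds, n_reps = data.shape
    cell = data.mean(axis=2)                    # mean per exon/condition cell
    exon = cell.mean(axis=1, keepdims=True)     # exon main effect
    cond = cell.mean(axis=0, keepdims=True)     # condition main effect
    grand = cell.mean()
    ss_int = n_reps * np.sum((cell - exon - cond + grand) ** 2)
    ss_err = np.sum((data - cell[:, :, None]) ** 2)
    df_int = (n_exons - 1) * (n_conds - 1)
    df_err = n_exons * n_conds * (n_reps - 1)
    return float((ss_int / df_int) / (ss_err / df_err))

# Toy transcript with 3 exons, 2 conditions, 3 replicates (made-up values).
exon_effect = np.array([5.0, 6.0, 7.0])[:, None, None]
cond_effect = np.array([0.0, 0.5])[None, :, None]
noise = np.array([-0.1, 0.0, 0.1])[None, None, :]
additive = exon_effect + cond_effect + noise    # same splicing pattern in both conditions
spliced = additive.copy()
spliced[2, 1, :] += 2.0                         # exon 3 changes only in condition 2
f_additive = interaction_f(additive)
f_spliced = interaction_f(spliced)
```

The statistic would then be compared with an F distribution on `df_int` and `df_err` degrees of freedom; that step and any multiple-testing correction across transcripts are omitted here.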

Publications

1. Ando, T., Konishi, S., Imoto, S. Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference, 138(11): 3616-3633, 2008.

2. Brazma, A., Miyano, S., Akutsu, T. Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008). Imperial College Press, 2008.

3. Do, J.H., Miyano, S. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome. Applied Microbiology and Biotechnology, 80(5): 841-847, 2008.

4. Fujita, A., Gomes, L.R., Sato, J.R., Yamaguchi, R., Thomaz, C.E., Sogayar, M.C., Miyano, S. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Systems Biology, 2: 106, 2008.

5. Fujita, A., Sato, J.R., Garay-Malpartida, H.M., Sogayar, M.C., Ferreira, C.E., Miyano, S. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinformatics and Computational Biology, 6(5): 961-979, 2008.

6. Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Fujita, A., Shimamura, T., Tamada, Y., Imoto, S., Kinoshita, K., Nakai, K., Miyano, S. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human. Genome Informatics, 20: 212-221, 2008.

7. Hirose, O., Yoshida, R., Imoto, S., Yamaguchi, R., Higuchi, T., Charnock-Jones, D.S., Print, C., Miyano, S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics, 24(7): 932-942, 2008.

8. Hirose, O., Yoshida, R., Yamaguchi, R., Imoto, S., Higuchi, T., Miyano, S. Analyzing time course gene expression data with biological and technical replicates to estimate gene networks by state space models. Proc. 2nd Asia International Conference on Modelling & Simulation, 940-946, 2008. (AMS2008, Refereed conference)

9. Jeong, E., Nagasaki, M., Miyano, S. Rule-based reasoning for system dynamics in cell systems. Genome Informatics, 20: 25-36, 2008.

10. Kitakaze, H., Kanda, M., Nakatsuka, H., Ikeda, N., Matsuno, H., Miyano, S. Prediction of fragile points for robustness checking of cell systems. IEICE TRANSACTIONS on Information and Systems D, J91-D(9): 2404-2417, 2008.

11. Knapp, E.-W., Benson, G., Holzhütter, H.-G., Kanehisa, M., Miyano, S. (Eds.) Genome Informatics 20, 2008.

12. Kojima, K., Fujita, A., Shimamura, T., Imoto, S., Miyano, S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics, 20: 37-51, 2008.

13. Kojima, K., Nagasaki, M., Miyano, S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics, 24(12): 1426-1432, 2008.

14. Mito, N., Ikegami, Y., Matsuno, H., Miyano, S., Inouye, S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics, 21: 212-223, 2008.

15. Nagasaki, M., Saito, A., Chen, L., Jeong, E., Miyano, S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology, 2: 53, 2008.

16. Niida, A., Smith, A.D., Imoto, S., Tsutsumi, S., Aburatani, H., Zhang, M.Q., Akiyama, T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics, 9: 404, 2008.

17. Numata, K., Yoshida, R., Nagasaki, M., Saito, A., Imoto, S., Miyano, S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics, 9: 494, 2008.

18. Numata, K., Imoto, S., Miyano, S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering, IEEE Computer Society, 357-360, 2008. (BIBM2008, Refereed conference)

19. Perrier, E., Imoto, S., Miyano, S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research, 9: 2251-2286, 2008.

20. Yamaguchi, R., Imoto, S., Yamauchi, M., Nagasaki, M., Yoshida, R., Shimamura, T., Hatanaka, Y., Ueno, K., Higuchi, T., Gotoh, N., Miyano, S. Predicting differences in gene regulatory systems by state space models. Genome Informatics, 21: 101-113, 2008.

21. Yoshida, R., Nagasaki, M., Yamaguchi, R., Imoto, S., Miyano, S., Higuchi, T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics, 24(22): 2592-2601, 2008.


Human Genome Center

Laboratory of Molecular Medicine (ゲノムシークエンス解析分野)
Laboratory of Genome Technology (シークエンス技術開発分野)

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P=0.0053) and with poorer prognosis of non-small cell lung cancer (NSCLC) patients (P=0.0045). It was also shown to be an independent prognostic factor (P=0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data imply that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) using a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with NSCLC (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The ECT2 cancer-testis antigen is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report the characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and in all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells: nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein was highest at the G(1)/S phases, whereas its phosphorylation level was enhanced during the mitotic phase. Treatment of the breast cancer cells T47D and HBC4 with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrated the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in the growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for the treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for the development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of the RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused a significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in the growth of RCC and that inhibition of TMEM22/RAB37 expression or of their interaction could be a novel therapeutic approach for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)


We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex, as well as activation of the Rac1-JNK cascade, in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation causes activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining the viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for the development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and its limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining the viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been indicated to play some roles in apoptosis. These findings indicate that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show a very poor response to any anticancer therapy. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and in aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it could be a molecular target for the development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 had been reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were very commonly overexpressed in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression significantly attenuated the viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) had previously been reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac(α2,6)Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated a higher frequency of serum apoC-III with loss of α2,6-linked Neu5Ac residues in lung cancer patients than in controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylations and suggest that it may be applicable to studies of other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight patients with PR (partial response) and twelve patients with SD (stable disease) or PD (progressive disease), we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by the expression patterns in this set of genes.
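The text does not give the formula behind the numerical prediction scoring system. One common way to build such a score from a small gene set is a weighted-vote scheme, sketched below purely as an assumption: each gene votes for the group whose mean expression is closer, with a weight equal to the distance from the midpoint of the two group means. The function name and the expression values are hypothetical.

```python
def prediction_score(sample, mean_responder, mean_nonresponder):
    """Weighted-vote score in [-100, 100]; positive values favour 'responder'.

    All three arguments are equal-length sequences of expression values,
    one entry per predictive gene.
    """
    votes_r = votes_n = 0.0
    for x, mr, mn in zip(sample, mean_responder, mean_nonresponder):
        weight = abs(x - (mr + mn) / 2.0)    # distance from the group midpoint
        if abs(x - mr) < abs(x - mn):
            votes_r += weight                # gene looks like a responder
        else:
            votes_n += weight                # gene looks like a non-responder
    total = votes_r + votes_n
    return 0.0 if total == 0 else 100.0 * (votes_r - votes_n) / total

# Hypothetical group means for three genes (the study used nine).
mean_r = [2.0, 0.0, 1.0]
mean_n = [0.0, 2.0, 3.0]
score_like_responder = prediction_score([2.0, 0.0, 1.0], mean_r, mean_n)
score_like_nonresponder = prediction_score([0.0, 2.0, 3.0], mean_r, mean_n)
```

A sample would be called a responder when its score is positive. In a design like the one described, the group means come from the learning cases only and the reserved test cases are scored afterwards.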

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
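The evaluation metric described above (the share of patients whose predicted dose falls within 20% of the actual stable dose, looked at separately for low, intermediate, and high dose requirements) can be written down directly. The sketch below uses invented doses and hypothetical function names; it does not reproduce the dosing algorithms themselves.

```python
def fraction_within(predicted, actual, tolerance=0.20):
    """Share of patients whose predicted dose is within `tolerance` of the actual dose."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance * a)
    return hits / len(actual)

def dose_group(weekly_mg):
    """Dose categories used in the text: <=21 mg/week, >=49 mg/week, or in between."""
    if weekly_mg <= 21:
        return "low"
    if weekly_mg >= 49:
        return "high"
    return "intermediate"

# Made-up weekly doses in mg for four patients.
predicted = [20, 30, 50, 10]
actual = [21, 40, 49, 10]
overall = fraction_within(predicted, actual)

# Same metric restricted to patients who actually need a low dose.
low = [(p, a) for p, a in zip(predicted, actual) if dose_group(a) == "low"]
low_fraction = fraction_within([p for p, _ in low], [a for _, a in low])
```

Computing the metric per dose group, as in the last two lines, mirrors the comparison reported for the ≤21 mg and ≥49 mg groups.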

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Mitsunori Sasa², Yoshimi Bando³, Ikuko Sumitomo², Naoya Hosono⁴, Michiaki Kubo⁴, Yusuke Nakamura¹,⁵ and Hitoshi Zembutsu⁵: ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), ²Department of Surgery, Tokushima Breast Care Clinic, ³Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School, ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), ⁵Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgery for breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to depend on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared with patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and Docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Michiaki Kubo², Hitoshi Zembutsu³, Yuichi Sugiyama⁴ and Yusuke Nakamura¹,³: ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), ²Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), ³Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, ⁴Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important factor limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs in 29 additional patients, 11 with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia than those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
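The odds ratio and its confidence interval can be checked with a short calculation. The 2x2 counts below are not stated in the text; they are reconstructed here as an assumption from the reported sample sizes (28 + 11 = 39 cases, 56 + 18 = 74 controls) and the reported classification rates (69.2% of 39 is 27 cases with a score of 1 or 2; 75.7% of 74 is 56 controls with a score of 0). The interval uses the standard log-odds (Woolf) approximation, which may or may not be the method the authors used.

```python
import math

def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio with a Woolf (log-scale normal approximation) confidence interval."""
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    se_log = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                       + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Reconstructed counts (assumption, see lead-in): "exposed" means score 1 or 2.
cases_high, cases_zero = 27, 12          # 39 patients with grade 3/4 toxicity
controls_high, controls_zero = 18, 56    # 74 patients with no toxicity

odds_ratio, lower, upper = odds_ratio_ci(cases_high, cases_zero, controls_high, controls_zero)
sensitivity = cases_high / (cases_high + cases_zero)              # cases correctly flagged
specificity = controls_zero / (controls_high + controls_zero)     # controls correctly cleared
```

Under these reconstructed counts the calculation agrees with the reported OR of 7.00, the 95% CI of 2.95-16.59, and the 69.2% and 75.7% classification rates, which suggests the reconstruction is consistent with the published figures.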

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu¹,², Taisei Mushiroda¹, Surakameth Mahasirimongkol⁵, Sasisopin Kiertiburanakul³, Somnuek Sungkanuparph³, Weerawat Manosuthi⁶, Woraphot Tantisiriwat⁷, Angkana Charoenyingwattana⁴, Thanyachai Sura³, Wasun Chantratita², and Yusuke Nakamura¹: ¹Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of ²Pathology, ³Medicine, Faculty of Medicine, ⁴Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; ⁵Center for International Cooperation, Department of Medical Sciences, ⁶Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; ⁷Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified the HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele showed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10⁻⁶] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10⁻⁵). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4, and Koichi Matsuda1: 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; 4Center for Genomic Medicine, RIKEN; 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; 6Department of Hepatology, Toranomon Hospital; 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptibility haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrated that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9, and the Pirfenidone Clinical Study Group4: 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; 5Fourth Department of Internal Medicine, Nippon Medical School; 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify a gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a p value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9, and Draga I. Toncheva5: 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; 8Department of Psychological Medicine, Cardiff University, School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many of these studies have yielded controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein, we report results from our replication association study using 255 Bulgarian patients with schizophrenia or schizoaffective disorder and 556 healthy Bulgarian controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that had previously been implicated in disease susceptibility, and we genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010, odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y, and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu, and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y, and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y, and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y, and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu E T-K, Wang LD, J-LS Law, Tsao W, and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y, and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y, and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y, and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y, and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q, and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y, and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H, and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H, and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D, and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y, and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y, and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α, and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y, and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y, and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility to type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF, and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y, and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T, and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K, and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y, and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y, and Nakagawa H. Identification of C2orf18, termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y, and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K, Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret the heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining for their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are producing many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We therefore decompose the difficulty into parts, i.e., linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and we also attempt to improve the interpretation of GWA study data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies on complex genetic traits target relatively weak factors, so their sample sizes need to exceed thousands, which in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One method to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
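For orientation, the standard genomic control correction for 1-df chi-square statistics can be sketched as follows. This is only the conventional method that the paragraph takes as its starting point, not the laboratory's modification for Fisher's exact test, which the report does not describe; the function name and the example statistics are illustrative assumptions.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Standard genomic control for a list of 1-df chi-square statistics.

    The inflation factor lambda is the observed median divided by the
    theoretical median under the null. Statistics are deflated by lambda,
    which is floored at 1 so that statistics are never inflated.
    """
    inflation = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    inflation = max(inflation, 1.0)
    corrected = [x / inflation for x in chi2_stats]
    return inflation, corrected


# Illustrative statistics only: the median is twice the null median,
# so every statistic is halved.
lam, adjusted = genomic_control([0.2, 0.9098728, 5.0])
print(lam, adjusted)
```

The approach assumes that most markers are null, so that the median reflects population structure rather than true associations.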

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリーマークラスロップ
准教授 医学博士 山田 亮

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
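As a point of reference for the asymptotic side of this comparison, the Pearson chi-square test on a 2×3 case-control genotype table can be sketched as below. It relies on the fact that a chi-square variable with 2 degrees of freedom has survival function exp(-x/2). The function name and the genotype counts are hypothetical; the exact test, which enumerates all tables sharing the observed margins, is not shown.

```python
import math


def chi2_test_2x3(table):
    """Asymptotic Pearson chi-square test for a 2x3 contingency table.

    table: two rows (e.g. cases, controls) of three genotype counts each.
    Returns (statistic, p_value), with the p-value taken from the
    chi-square distribution with 2 degrees of freedom.
    """
    if len(table) != 2 or any(len(row) != 3 for row in table):
        raise ValueError("expected a 2x3 table")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("every row and column total must be positive")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # Survival function of chi-square with 2 df is exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


# Hypothetical genotype counts (AA, Aa, aa) for cases and controls.
stat, p = chi2_test_2x3([[30, 20, 10], [10, 20, 30]])
print(stat, p)
```

The asymptotic p-value is least reliable when expected cell counts are small, for example with rare minor-allele homozygotes, which is one setting where a gap from the exact test would be expected.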

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry. By interpreting the tables geometrically, we characterize various tests for 2×3 tables and define tests that fit biological models.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize its heterogeneity. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which uses the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pair-wise dependence is used to quantify genetic heterogeneity and in genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, so that it can support various studies based on genetic heterogeneity.
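The conventional pair-wise measures mentioned above can be sketched as follows for two diallelic markers. This shows only the standard D, D', and r² statistics that multi-marker methods aim to generalize, not the laboratory's new method; the function name and the frequencies in the example are illustrative assumptions.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Standard pair-wise LD measures for two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first marker
    p_b:  frequency of allele B at the second marker
    Returns (D, D_prime, r_squared).
    """
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator <= 0:
        raise ValueError("both markers must be polymorphic")
    d = p_ab - p_a * p_b
    r_squared = d * d / denominator
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    return d, d_prime, r_squared


# Illustrative frequencies: complete LD, then linkage equilibrium.
print(pairwise_ld(0.5, 0.5, 0.5))
print(pairwise_ld(0.25, 0.5, 0.5))
```

Both measures summarize only two markers at a time, which is the limitation that motivates quantifying dependence across sets of markers.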

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMS-UT on the interpretation of genetic epidemiology data with conventional statistical methods. These groups include Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We have opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp/

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.


Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.

Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 2.53 in "gonad" to 19.53 in "endostyle", and during development they increased from 1.68 at the "egg" stage to 7.55 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation (DBTGR) was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and their positioning relative to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it can distinguish muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe² and Kenta Nakai

While CpG islands are often linked to promoters in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation patterns between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
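
The two sequence properties discussed here, G+C content and CpG enrichment, can be computed for any window around a TSS. The following is only a minimal sketch and not the procedure used in this study: it assumes the widely used observed/expected CpG ratio (number of CpG x window length / (number of C x number of G)), and the function name gc_and_cpg is hypothetical.

```python
def gc_and_cpg(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA string.

    Assumption: obs/exp = CpG_count * N / (C_count * G_count), a common
    convention; the criteria actually used in the study are not stated.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    if c == 0 or g == 0:
        return gc_fraction, 0.0
    return gc_fraction, cpg * n / (c * g)


# Toy windows (made-up sequences, for illustration only).
for name, window in [("CpG-rich", "CGCGCGTACG"), ("AT-rich", "ATATTTAACA")]:
    gc, obs_exp = gc_and_cpg(window)
    print(f"{name}: G+C = {gc:.2f}, CpG obs/exp = {obs_exp:.2f}")
```

Applying such a function to sliding windows around each TSS would show how far the G+C- and CpG-rich region extends, which is the kind of range comparison described above.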

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been a useful model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from telling us whether a given regulation is direct or indirect. In this study, we computationally detect direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining those orthologues. The regulatory network of the C. intestinalis transcription factor ZicL obtained in this way is consistent with the data from a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make the building blocks of the body.
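
The scanning step described in this section can be sketched as follows. This is an illustrative example only: the matrix, the sequence and the function name scan_promoter are hypothetical, and the ORI-based choice of the threshold is not reproduced because its definition is not given here.

```python
def scan_promoter(seq, pwm, threshold):
    """Return (position, score) for every window scoring >= threshold.

    pwm is a list of dicts, one per motif position, mapping a base to its
    score. Windows containing a base that is absent from the matrix
    (for example N) are skipped.
    """
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in column for base, column in zip(window, pwm)):
            continue
        score = sum(column[base] for base, column in zip(window, pwm))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical 3-bp matrix favouring "ACG" (integer scores for readability).
toy_pwm = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]
print(scan_promoter("TTACGTTACCTT", toy_pwm, threshold=4))
print(scan_promoter("TTACGTTACCTT", toy_pwm, threshold=3))
```

Lowering the threshold from 4 to 3 admits an additional, weaker match, which is why the choice of threshold determines which genes are called direct targets.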

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
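
As an illustration of how a pseudocount enters a PWM, the sketch below follows the description above: the pseudocount is split according to the background composition and added to the observed counts before the log ratio is taken. The function name pwm_from_counts and the input layout are hypothetical, and the use of log base 2 is an assumption; only the default of 0.8 comes from the text.

```python
import math


def pwm_from_counts(counts, background, pseudocount=0.8):
    """Build a log2-odds PWM from per-position base counts.

    For a position with N observed sites:
        p(b)     = (n(b) + pseudocount * background[b]) / (N + pseudocount)
        score(b) = log2(p(b) / background[b])
    counts is a list of dicts (one per position); missing bases count as 0.
    """
    pwm = []
    for column in counts:
        total = sum(column.get(base, 0) for base in background)
        scores = {}
        for base, bg in background.items():
            p = (column.get(base, 0) + pseudocount * bg) / (total + pseudocount)
            scores[base] = math.log2(p / bg)
        pwm.append(scores)
    return pwm


# Toy example: four known sites, uniform background (made-up numbers).
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
toy_counts = [{"A": 4}, {"C": 2, "G": 2}]
for position, column in enumerate(pwm_from_counts(toy_counts, uniform)):
    print(position, {base: round(score, 2) for base, score in column.items()})
```

Without the pseudocount, a base never seen at a position would receive a score of minus infinity; with it, every score stays finite.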

7. Definition and analysis of alternative promoters using a huge number of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS; http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we performed an analysis of alternative promoters with these data. From these data, we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for the 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
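
The definition of 1st and 2nd promoters given above amounts to ranking the promoters of each gene by tag count. A minimal sketch, assuming tag counts are already grouped per gene and per promoter; the function name rank_promoters, the identifiers and the counts are hypothetical.

```python
def rank_promoters(tag_counts):
    """Map each gene to (1st promoter, 2nd promoter) by descending tag count.

    tag_counts: {gene: {promoter_id: number_of_TSS_tags}}
    Genes with a single promoter get None as their 2nd promoter.
    Ties are broken by promoter id so that the result is deterministic.
    """
    ranked = {}
    for gene, promoters in tag_counts.items():
        ordered = sorted(promoters, key=lambda p: (-promoters[p], p))
        first = ordered[0] if ordered else None
        second = ordered[1] if len(ordered) > 1 else None
        ranked[gene] = (first, second)
    return ranked


# Made-up tag counts for two genes in one cell type.
example = {
    "geneA": {"p1": 120, "p2": 300, "p3": 15},
    "geneB": {"p1": 42},
}
print(rank_promoters(example))
```

Running the same ranking on each cell type or condition and comparing the resulting pairs is one way to quantify how often the 1st and 2nd promoters differ.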

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common among 16 eukaryotes, but two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp period is close to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
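
The idea of detecting a sequence periodicity by Fourier analysis can be illustrated with a small sketch. It is not the pipeline used in this study: the choice of dinucleotides, the mean-centering, the normalization and the function name periodicity_power are all assumptions made for the example.

```python
import cmath


def periodicity_power(seq, dinucleotides, period):
    """Spectral power of a dinucleotide indicator signal at a given period.

    The signal is 1 where one of the given dinucleotides starts and 0
    elsewhere; it is mean-centred and projected onto a complex sinusoid
    of the requested period (in bp).
    """
    seq = seq.upper()
    signal = [1.0 if seq[i:i + 2] in dinucleotides else 0.0
              for i in range(len(seq) - 1)]
    n = len(signal)
    if n == 0:
        return 0.0
    mean = sum(signal) / n
    total = sum((x - mean) * cmath.exp(-2j * cmath.pi * i / period)
                for i, x in enumerate(signal))
    return abs(total) ** 2 / n


# Synthetic sequence with an "AA" every 10 bp (made up for illustration).
synthetic = ("AA" + "CGCGCGCG") * 20
print(periodicity_power(synthetic, {"AA"}, 10))
print(periodicity_power(synthetic, {"AA"}, 7))
```

On the synthetic sequence the power at a period of 10 bp is far larger than at 7 bp; scanning a range of periods in the same way would reveal peaks such as the 84-bp and 167-bp periodicities mentioned above.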

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell containing only necessary and sufficient components has been estimated mostly by reducing the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effects of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps, and they are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized so as to sustain life from external nutrients alone, as living cells do. The organism-specific pathway maps are then detected as those that can synthesize, from nutrients, the compounds required for the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and crystal structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines applied to the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures and found that alpha amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that catalyzes ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the GO networks or tissue networks became too big for a static picture on the web, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K. and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics. 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R. and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K. and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics. 20: 212-221, 2008.

Higurashi, M., Ishida, T. and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K. and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics. 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y. and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T. and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics. 9 (1): 112, 2008.

Nishida, K., Frith, M. and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H. and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H. and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M. and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R. and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene. 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T. and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H. and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res. in press.

Tsuchiya, Y., Nakamura, H. and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T. and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics. Edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S. and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H. and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

Ikura, T., Kinoshita, K. and Ito, N. Structure-function relationship of peptidyl-prolyl isomerases (in Japanese). Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid and Enzyme). 54: 167-172, 2009.

Kinoshita, K. Protein function prediction from three-dimensional structures: current status and prospects (in Japanese). Idenshi Igaku MOOK, No. 14, in press.

Nakai, K. and Horton, P. Chapter 3-3: Prediction of protein subcellular localization based on amino acid sequences (in Japanese). Jikken Igaku (Experimental Medicine), supplement, vol. 26: 1106-1112, 2008.

Nakai, K. Systems biology of proteins (in Japanese). In: Encyclopedia of Proteins, edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura and Hamakubo (Asakura Shoten), 575-578, 2008.

Human Genome Center

Department of Public Policy

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, LL.M.

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held the SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas around the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory difference between South Korea and Japan. After the fabrication in Hwang Woo-suk's stem cell cloning research and the unethical collection of human eggs, the bioethics law was revised, and the government now seeks stricter regulation of life science and healthcare. We found some correlations between political options on stem cell research and on genetic testing in terms of regulations across East Asia.

2. Establishment of Office of Research Ethics (ORE)

Following the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we identified the current problems that tend to stall a smooth research review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review on research ethics for all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government's ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives to participate and for preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate the MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even though participants may forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To acknowledge the contributions of participants and MCs to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current form of the BioBank newsletter is accessible only to sighted people with good eyesight, we are making efforts toward personalized information security to accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama I, Nagai A, Muto K, Tamakoshi A, Kokado M, Mimura K, Tanzawa T, Yamagata Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2. 洪賢秀. 韓国社会における子どもの「性保護」と性犯罪防止対策. 比較法研究 70号, 2009, 印刷中.

3. 神里彩子, 成澤光 編著. 生殖補助医療 生命倫理と法―基本資料集3. 信山社, 21―123, 262―308, 2008.

4. 張瓊方. 諸外国における生殖補助医療の規制状況と実施状況(台湾). 生殖補助医療 生命倫理と法―基本資料集3, 神里彩子, 成澤光 編, 信山社, 323―334, 2008.

5. 大上泰弘, 神里彩子, 城山英明. イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆. 社会技術研究論文集 5号, 132―142, 2008.

6. 大上泰弘, 成廣孝, 神里彩子, 城山英明, 打越綾子. 日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析. ヒトと動物の関係学会誌 20号, 66―73, 2008.

7. 大上泰弘, 神里彩子, 城山英明. イギリスにおける動物の実験規制を支えている思考様式. 科学技術社会論研究 5号, 84―92, 2008.

8. 渡部麻衣子, 上田昌文. 人の必要を充足する科学技術福祉工学における開発現場の分析. 科学技術社会研究, 138―151, 2008.

9. 武藤香織. 「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―. 日米の医療―制度と倫理, 杉田米行 編, 大阪大学出版会, 203―224, 2008.

10. 武藤香織. DNA親子鑑定は「ふしだらな」女性にとっての救済策か. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 238―264, 2008.

11. 洪賢秀. 研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 196―214, 2008.

12. 張瓊方. 生殖技術と台湾社会. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 215―222, 2008.

13. 三村恭子, 小門穂, 武藤香織, 張瓊方, 洪賢秀, 柘植あづみ. 女性にやさしい機械のつくられ方―内診台を例にして. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 223―240, 2008.

14. 神里彩子. 生殖補助医療をめぐる議論―その回顧と展望―. 家永登 編『生殖技術と家族』, 早稲田大学出版部, 42―71, 2008.

15. 渡部麻衣子, 上田昌文 編訳. エンハンスメント論争身体精神の増強と先端科学技術. 社会評論社, 2008.


37: D520-D525, 2008.

Yamanishi Y, Araki M, Gutteridge A, Honda W, and Kanehisa M. Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 24: i232-i240, 2008.

Takarabe M, Okuda S, Itoh M, Tokimatsu T, Goto S, and Kanehisa M. Network analysis of adverse drug interactions. Genome Informatics 20: 252-259, 2008.

Hashimoto K, Yoshizawa AC, Okuda S, Kuma K, Goto S, and Kanehisa M. The repertoire of desaturases and elongases reveals fatty acid variations in 56 eukaryotic genomes. J. Lipid Res. 49: 183-191, 2008.

Shibuya T. Fast Hinge Detection Algorithms for Flexible Protein Structures. IEEE/ACM Transactions on Computational Biology and Bioinformatics, to appear.

Shibuya T. Searching Protein 3-D Structures in Linear Time. Proc. 13th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2009), 2009, to appear.

Shibuya T. Linear-Time Algorithm for Searching Protein 3-D Structures. IPSJ SIG Notes SIGAL 123-4, 2009, to appear.

Suematsu K, Shibuya T. Flexible Protein Alignment of 3D-Structures Allowing Dynamic Transformation. IPSJ SIG Notes SIGBIO 12-12, 2008, pp. 87-94.

本多渉, 田辺麻央, 矢野亜津子, 金久實. バイオインフォマティクスシステムバイオロジーとKEGG. 生化学 80: 1094―1111, 2008.

Recent advances in biomedical research have been producing large-scale, ultra-high-dimensional, ultra-heterogeneous data. Owing to this post-genomic research progress, our current mission is to create computational strategies for systems biology and medicine towards translational bioinformatics. With this mission, we have been developing computational methods for understanding life as a system and applying them to practical issues in medicine and biology.

1. Computational Systems Biology

a. Systematic reconstruction of TRANSPATH data into Cell System Markup Language

Masao Nagasaki, Ayumu Saito, Chen Li, Euna Jeong, Satoru Miyano

Many biological repositories store information based on experimental studies of biological processes within a cell, such as protein-protein interactions, metabolic pathways, signal transduction pathways, or regulation by transcription factors and miRNA. Unfortunately, it is difficult to use such information directly when generating simulation-based models. Thus, modeling rules for encoding biological knowledge into system-dynamics-oriented standardized formats would be very useful for a full understanding of cellular dynamics at the system level. We selected the TRANSPATH database, a manually curated, high-quality pathway database, which provides a rich source of cellular events in humans, mice, and rats, curated from over 31,500 papers. In this work, we defined 16 modeling rules based on hybrid functional Petri net with extension (HFPNe), which is suitable for graphical representation and simulation of biological processes. In these modeling rules, each Petri net element is incorporated with the Cell System Ontology (CSO) to enable semantic interoperability of models. As a formal ontology for biological pathway modeling with dynamics, CSO also defines biological terminology and corresponding icons. By combining HFPNe with the CSO features, we made a method for transforming TRANSPATH data into simulation-based, semantically valid models. The results are encoded in a biological pathway format, Cell System Markup Language (CSML), which eases the exchange and integration of biological data and models. By using the 16 modeling rules, 97% of the reactions in TRANSPATH are converted into simulation-based models represented in CSML. This reconstruction demonstrated that it is possible to use our rules to generate quantitative models from static pathway descriptions.
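As a rough illustration of what a Petri net representation of a reaction looks like, the following minimal sketch models a single binding reaction as a discrete Petri net transition. It is not HFPNe (which also supports continuous and hybrid elements) and it does not use CSO or CSML; the place names, the transition name, and the helper functions `enabled` and `fire` are hypothetical and chosen only for this example.

```python
from typing import Dict, Tuple

# A transition is (name, inputs, outputs); inputs and outputs map a place
# name to the number of tokens consumed or produced. All names are hypothetical.
Transition = Tuple[str, Dict[str, int], Dict[str, int]]


def enabled(marking: Dict[str, int], transition: Transition) -> bool:
    """A transition is enabled when every input place holds enough tokens."""
    _, inputs, _ = transition
    return all(marking.get(place, 0) >= n for place, n in inputs.items())


def fire(marking: Dict[str, int], transition: Transition) -> Dict[str, int]:
    """Return the marking reached by firing an enabled transition once."""
    if not enabled(marking, transition):
        raise ValueError(f"transition {transition[0]!r} is not enabled")
    _, inputs, outputs = transition
    new_marking = dict(marking)
    for place, n in inputs.items():
        new_marking[place] = new_marking.get(place, 0) - n
    for place, n in outputs.items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking


# Toy reaction: ligand + receptor -> complex
binding: Transition = ("binding", {"ligand": 1, "receptor": 1}, {"complex": 1})
m0 = {"ligand": 2, "receptor": 1, "complex": 0}
m1 = fire(m0, binding)
```

After one firing the receptor place is empty, so the same transition is no longer enabled; a static pathway reaction has become a step that can be simulated.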

Human Genome Center

Laboratory of DNA Information Analysis
DNA情報解析分野

Professor Satoru Miyano, Ph.D.
Associate Professor Seiya Imoto, Ph.D.
Assistant Professor Masao Nagasaki, Ph.D.
Project Lecturer Rui Yamaguchi, Ph.D.
Project Assistant Professor Yoshinori Tamada, Ph.D.

教授 理学博士 宮野 悟
准教授 博士(数理学) 井元 清哉
助教 博士(理学) 長﨑 正朗
特任講師 博士(理学) 山口 類
特任助教 博士(情報学) 玉田 嘉紀

b. Finding optimal Bayesian network given a super-structure

Eric Perrier, Seiya Imoto, Satoru Miyano

Conventional approaches for learning Bayesian network structure from data have disadvantages in terms of complexity and lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm markedly improves accuracy and speed: it learns a skeleton with an independency test (IT) approach and constrains the directed acyclic graphs considered during the search-and-score phase. Subsequently, we formalized the structural constraint by introducing the concept of a super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We developed a super-structure constrained optimal search (COS): its time complexity is upper bounded by O(γ_m^n), where γ_m < 2 depends on the maximal degree m of S. Empirically, complexity depends on the average degree m', and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. Practically, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, making it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches.
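The effect of the super-structure constraint on the search space can be sketched as follows. This is only an illustration of the constraint itself, not of COS: scoring, the acyclicity check, and the optimal search are omitted. The function name `candidate_parent_sets`, the optional `max_parents` argument, and the toy graph `S` are assumptions made for this example.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Optional, Set


def candidate_parent_sets(
    super_structure: Dict[str, Set[str]],
    node: str,
    max_parents: Optional[int] = None,
) -> List[FrozenSet[str]]:
    """Enumerate parent sets of `node` allowed by the super-structure.

    Only neighbours of `node` in the undirected graph may become parents,
    so the skeleton of any resulting network is a subgraph of the
    super-structure.
    """
    neighbours = sorted(super_structure[node])
    limit = len(neighbours) if max_parents is None else min(max_parents, len(neighbours))
    parent_sets: List[FrozenSet[str]] = []
    for size in range(limit + 1):
        parent_sets.extend(frozenset(c) for c in combinations(neighbours, size))
    return parent_sets


# Toy undirected super-structure on four variables (adjacency sets).
S = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
sets_for_b = candidate_parent_sets(S, "B")
```

For node B only four parent sets remain (the empty set, {A}, {C}, and {A, C}) instead of the eight subsets of the other three variables, which shows how a sparse S shrinks the space that a search-and-score procedure has to visit.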

c. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models

Osamu Hirose, Ryo Yoshida1, Seiya Imoto, Rui Yamaguchi, Tomoyuki Higuchi1, D. Stephen Charnock-Jones2, Cristin Print3, Satoru Miyano: 1Institute of Statistical Mathematics, 2Cambridge University, 3University of Auckland

We developed a novel method based on the state space model to identify transcriptional modules and module-based gene networks simultaneously. The state space model has the potential to infer large-scale gene networks, e.g., of order 10^3, from time-course gene expression profiles. In particular, we succeeded in identifying a cell cycle system by using the gene expression profiles of Saccharomyces cerevisiae, in which the length of the time course and the number of genes were 24 and 4382, respectively. However, when analyzing shorter time-course data, e.g., of length 10 or less, the parameter estimation of the state space model often fails due to overfitting. To extend the applicability of the state space model, we provided an approach that uses the technical replicates of gene expression profiles, which are often measured in duplicate or triplicate. The use of technical replicates is important for achieving highly efficient inference of gene networks with short time-course data. The potential of the proposed method was demonstrated through the time-course analysis of the gene expression profiles of human umbilical vein endothelial cells undergoing growth factor deprivation-induced apoptosis.

d. Predicting differences in gene regulatory systems by state space models

Rui Yamaguchi, Seiya Imoto, Mai Yamauchi, Masao Nagasaki, Ryo Yoshida1, Teppei Shimamura, Yosuke Hatanaka, Kazuko Ueno, Tomoyuki Higuchi1, Noriko Gotoh, Satoru Miyano

We developed a statistical method to predict differentially regulated genes between case and control samples from time-course gene expression data by leveraging the unpredictability of the expression patterns from the underlying regulatory system inferred by a state space model. The proposed method can screen out genes that show different patterns but are generated by the same regulation in both samples, since these patterns can be predicted by the same model. Our strategy consists of three steps. First, a gene regulatory system is inferred from the control data by a state space model. Then, the obtained model for the underlying regulatory system of the control sample is used to predict the case data. Finally, by assessing the significance of the difference between the case and predicted-case time-course data of each gene, we are able to detect the unpredictable genes that are candidates for the key differences between the regulatory systems of case and control cells. We illustrate the whole process of the strategy with an actual example, in which human small airway epithelial cell gene regulatory systems were generated from novel time courses of gene expression following treatment with (case) or without (control) the drug gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase. In the gefitinib response data, we succeeded in finding unpredictable genes that are candidates for the specific targets of gefitinib. We also discussed differences in regulatory systems for the unpredictable genes. The proposed method would be a promising tool for identifying biomarkers and drug target genes.
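The three-step strategy can be sketched with a deliberately simplified stand-in. Instead of a state space model over many genes, the example below fits a scalar first-order autoregressive model x[t+1] = a * x[t] to a single control time course, uses it to predict a case time course one step ahead, and reports the mean squared prediction error in place of a significance test. The function names and the toy series are hypothetical.

```python
from typing import List


def fit_ar1(series: List[float]) -> float:
    """Step 1: least-squares estimate of a in x[t+1] = a * x[t] (control data)."""
    num = sum(prev * nxt for prev, nxt in zip(series, series[1:]))
    den = sum(prev ** 2 for prev in series[:-1])
    return num / den


def prediction_error(series: List[float], coef: float) -> float:
    """Steps 2 and 3: one-step-ahead prediction of the case data and its mean squared error."""
    errors = [(nxt - coef * prev) ** 2 for prev, nxt in zip(series, series[1:])]
    return sum(errors) / len(errors)


control = [1.0, 0.5, 0.25, 0.125]    # toy control time course
case_same = [2.0, 1.0, 0.5, 0.25]    # different levels, same dynamics
case_diff = [2.0, 2.0, 2.0, 2.0]     # dynamics not explained by the control model

a = fit_ar1(control)
err_same = prediction_error(case_same, a)
err_diff = prediction_error(case_diff, a)
```

Here `case_same` differs from the control in magnitude but is predicted perfectly by the control model, so it would be screened out, whereas `case_diff` yields a large error and would be flagged as unpredictable.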

e. Bayesian learning of biological pathways on genomic data assimilation

Ryo Yoshida1, Masao Nagasaki, Rui Yamaguchi, Seiya Imoto, Satoru Miyano, Tomoyuki Higuchi1

Mathematical modeling and simulation based on biochemical rate equations provide us with a rigorous tool for unraveling the complex mechanisms of biological pathways. To proceed to simulation experiments, an essential first step is to find effective values of model parameters, which are difficult to measure in in vivo and in vitro experiments. Furthermore, once a set of hypothetical models has been created, a statistical criterion is needed to test the ability of the constructed models and to proceed to model revision. We developed a new statistical technology towards data-driven construction of in silico biological pathways. The method starts with knowledge-based modeling with hybrid functional Petri net. It then proceeds to the Bayesian learning of model parameters for which experimental data are available. This process exploits quantitative measurements of evolving biochemical reactions, e.g., gene expression data. Another important issue that we consider is the statistical evaluation and comparison of the constructed hypothetical pathways. For this purpose, we have developed a new Bayesian information-theoretic measure that assesses the predictability and the biological robustness of in silico pathways.

f. Modeling nonlinear gene regulatory networks from time series gene expression data

André Fujita, João Ricardo Sato5, Humberto Miguel Garay-Malpartida5, Mari Cleide Sogayar5, Carlos Eduardo Ferreira5, Satoru Miyano: 5University of São Paulo

In cells, molecular networks such as gene regulatory networks are the basis of biological complexity. Therefore, gene regulatory networks have become the core of research in systems biology. Understanding the processes underlying the several extracellular regulators, signal transduction, protein-protein interactions, and differential gene expression processes requires detailed molecular description of the protein and gene networks involved. To better understand these complex molecular networks and to infer new regulatory associations, we developed a statistical method based on vector autoregressive models and Granger causality to estimate nonlinear gene regulatory networks from time series microarray data. Most of the models available in the literature assume linearity in the inference of gene connections; moreover, these models do not infer directionality in these connections. Thus, a priori biological knowledge is required. However, in pathological cases, no a priori biological information is available. To overcome these problems, we present the nonlinear vector autoregressive (NVAR) model. We have applied the NVAR model to estimate nonlinear gene regulatory networks based entirely on gene expression profiles obtained from DNA microarray experiments. We showed the results obtained by NVAR through several simulations and by the construction of three actual gene regulatory networks (p53, NF-κB, and c-Myc) for HeLa cells.

g. Fast grid layout algorithm for biological networks with sweep calculation

Kaname Kojima, Masao Nagasaki, Satoru Miyano

Properly drawn biological networks are of great help in comprehending their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties, such as aligning vertices having the same attributes and complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for applications to pathway databases, due to their high computational cost. In addition, they do not consider edge directions, and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features in biological networks. We devised a new calculation method, termed sweep calculation, and reduced the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conducted practical experiments using 95 pathway models of various sizes from TRANSPATH and showed that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduced a new component that penalizes undesirable edge directions to avoid the lack of traceability in pathways due to the differences in direction between in-edges and out-edges of each vertex.

h. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data

Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

Recently, the nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters due to the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation, and we need to limit the size of the gene set strongly. To address this limitation, we employed the L1 regularization technique to estimate NVAR. Under L1 regularization, direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than with existing methods. Through a simulation study, we verified the effectiveness of the proposed method by comparing its limitation in the number of genes to that of the existing NVAR. The proposed method was also applied to time series microarray data of the human HeLa cell cycle.
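How L1 regularization selects parents can be sketched with a generic lasso solver. The text does not state which optimization algorithm was used, so cyclic coordinate descent is an assumption here, and the columns of the toy design matrix `X` merely stand in for basis-expanded lagged expression values of candidate regulators. The sketch minimizes (1/2n)·||y − Xβ||² + λ·||β||₁; coefficients shrunk exactly to zero correspond to candidates that are not selected.

```python
from typing import List


def soft_threshold(z: float, lam: float) -> float:
    """Shrink z towards zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X: List[List[float]], y: List[float], lam: float, n_iter: int = 100) -> List[float]:
    """Cyclic coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (norm / n) if norm > 0 else 0.0
    return beta


# Toy data: the first candidate has a strong effect, the second a negligible one.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_cd(X, y, lam=0.1)
```

With this toy data the weak second coefficient is set exactly to zero while the first is only shrunk, which is the sparsity property that makes parent selection feasible when parameters outnumber samples.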

i. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates

André Fujita, Luciana Rodrigues Gomes5, João Ricardo Sato6, Rui Yamaguchi, Carlos Eduardo Thomaz7, Mari Cleide Sogayar5, Satoru Miyano: 6Universidade Federal do ABC, 7Centro Universitário da FEI

Principal Component Analysis (PCA) combined with Maximum-entropy Linear Discriminant Analysis (MLDA) was applied in order to identify genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely: (i) differences in gene expression levels between normal and tumoral conditions from a univariate point of view; (ii) in a multivariate fashion using MLDA; and (iii) with a dependence network approach. Our results show that malignant transformation in the prostatic tissue is more related to functional connectivity changes in their dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1, and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions and were also classified as the top seven most informative genes for the prostate cancer genesis process by our discriminant analysis. Moreover, among the identified genes, we found classically known biomarkers and genes which are closely related to tumoral prostate, such as KLK3 and KLK2, and several other potential ones. We have demonstrated that changes in functional connectivity may be implicit in the biological process which renders some genes more informative to discriminate between normal and tumoral conditions. Using the proposed method, namely MLDA, to analyze the multivariate characteristic of genes, it was possible to capture the changes in dependence networks which are related to cell transformation.

j. Rule-based reasoning for system dynamics in cell systems

Euna Jeong, Masao Nagasaki, Satoru Miyano

A system-dynamics-centered ontology called the Cell System Ontology (CSO) has been developed for the representation of diverse biological pathways. Much of the pathway data based on the ontology has been created from databases via data conversion or curated by expert biologists. It is essential to validate the pathway data, which may suffer from unexpected issues such as semantic inconsistency and incompleteness. This paper discusses three criteria for validating the pathway data based on CSO, as follows: (1) structurally correct models in terms of Petri nets, (2) biologically correct models to capture biological meaning, and (3) systematically correct models to reflect biological behaviors. Simultaneously, we have investigated how logic-based rules can be used for the ontology to extend its expressiveness and to complement the ontology by reasoning, which aims at qualifying pathway knowledge. Finally, we show how the proposed approach helps in exploring dynamic modeling and simulation tasks without prior knowledge.

k. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human

Yosuke Hatanaka, Masao Nagasaki, Rui Yamaguchi, Takeshi Obayashi, Kazuyuki Numata, André Fujita, Teppei Shimamura, Yoshinori Tamada, Seiya Imoto, Kengo Kinoshita, Kenta Nakai, Satoru Miyano

We reported various transcription factor binding sites (TFBSs) conserved among co-expressed genes in human promoter regions using expression and genomic data. Assuming that similar promoter structure induces similar transcriptional regulation, and hence a similar expression profile, we compared the promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start site (TSS) given by DBTSS, and promoter similarity searches were conducted among the coexpressed gene data provided by the newly developed COXPRESdb. By combining Position Weight Matrix (PWM) motif prediction with a bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket method analysis to seek combinatorial activities of those conserved TFBSs.
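The PWM prediction step can be illustrated with a small sketch that converts per-position base counts into log-odds scores and scans a sequence for the best-scoring window. The count matrix, the pseudocount, the uniform background of 0.25, and the function names are assumptions for this example; the bootstrap assessment of significance and the use of DBTSS or COXPRESdb data are not shown. Sequences are assumed to contain only uppercase A, C, G, and T.

```python
import math
from typing import Dict, List, Optional, Tuple


def log_odds_pwm(
    counts: List[Dict[str, int]],
    pseudocount: float = 1.0,
    background: float = 0.25,
) -> List[Dict[str, float]]:
    """Turn per-position base counts into log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def best_hit(seq: str, pwm: List[Dict[str, float]]) -> Optional[Tuple[int, float]]:
    """Return (start, score) of the highest-scoring window, or None if seq is too short."""
    width = len(pwm)
    best: Optional[Tuple[int, float]] = None
    for start in range(len(seq) - width + 1):
        score = sum(pwm[offset][seq[start + offset]] for offset in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Hypothetical count matrix for a TATA-like motif built from eight aligned sites.
counts = [{"T": 8}, {"A": 8}, {"T": 8}, {"A": 8}]
pwm = log_odds_pwm(counts)
hit = best_hit("GGTATAGG", pwm)
```

In this toy promoter the best window starts at index 2, where the sequence reads TATA; a real analysis would compare such scores against a null distribution before calling a site significant.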

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou8, Yuto Ikegami8, Hiroshi Matsuno8, Satoru Miyano, Shin-ichi T. Inouye8: 8Yamaguchi University

Circadian rhythms of living organisms are 24-hr oscillations found in behavior, biochemistry, and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of a computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observations that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando9, Sadanori Konishi10, Seiya Imoto: 9Keio University, 10Kyushu University

The problem of constructing nonlinear regression models is investigated to analyze data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlap between basis functions and adopts the information of the input and response variables. By using the radial basis functions, we constructed nonlinear regression models with the help of the technique of regularization. Crucial issues in the model building process are the choices of a hyperparameter, the number of basis functions, and a smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification, both for distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained by the use of the hyperparameter in the basis functions.

b. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome

Jin Hwan Do, Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for the characterization of fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most of the prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, the homology-based approach has an intrinsic limitation with respect to unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profiles of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile; that is, the DNA regions including secondary metabolic signature genes such as polyketide synthase, nonribosomal peptide synthase, and/or dimethylallyl tryptophan synthase consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by a transcription factor, LaeA. Our result could be used for the identification of other fungal secondary metabolite gene clusters, especially for secondary metabolite gene clusters that are severely regulated by LaeA or by other proteins with a function similar to that of LaeA.
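A window-averaged profile of the kind described above can be sketched for GC content as follows. The DNA curvature calculation is not reproduced here, and the window length, step size, and function name `gc_profile` are illustrative assumptions rather than the settings used in the study.

```python
from typing import List


def gc_profile(seq: str, window: int, step: int = 1) -> List[float]:
    """Fraction of G or C bases in each sliding window along the sequence."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc_count = sum(1 for base in chunk if base in "GC")
        profile.append(gc_count / window)
    return profile


# Toy sequence: windows start at positions 0, 2, and 4.
profile = gc_profile("ATGCGCAT", window=4, step=2)
```

The same sliding-window loop applies to any per-position quantity, so a curvature value per base could be averaged in the same way and then compared against a threshold over a candidate region.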

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida1, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step towards finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than we can handle and analyze on a personal computer. We have developed ExonMiner, which is the first all-in-one web service for the analysis of exon array data to detect transcripts that have significantly different splicing patterns in two cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis of more than 300,000 transcripts in exon array data, which has the potential to reveal aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for the analysis of exon array data and does not require any installation of software except for internet browsers. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full dataset of exon array data within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.

Publications

1. Ando T, Konishi S, Imoto S. Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference 138 (11): 3616-3633, 2008.

2. Brazma A, Miyano S, Akutsu T. Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008). Imperial College Press, 2008.

3. Do JH, Miyano S. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome. Applied Microbiology and Biotechnology 80 (5): 841-847, 2008.

4. Fujita A, Gomes LR, Sato JR, Yamaguchi R, Thomaz CE, Sogayar MC, Miyano S. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Systems Biology 2: 106, 2008.

5. Fujita A, Sato JR, Garay-Malpartida HM, Sogayar MC, Ferreira CE, Miyano S. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinformatics and Computational Biology 6 (5): 961-979, 2008.

6. Hatanaka Y, Nagasaki M, Yamaguchi R, Obayashi T, Numata K, Fujita A, Shimamura T, Tamada Y, Imoto S, Kinoshita K, Nakai K, Miyano S. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human. Genome Informatics 20: 212-221, 2008.

7. Hirose O, Yoshida R, Imoto S, Yamaguchi R, Higuchi T, Charnock-Jones DS, Print C, Miyano S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics 24 (7): 932-942, 2008.

8. Hirose O, Yoshida R, Yamaguchi R, Imoto S, Higuchi T, Miyano S. Analyzing time course gene expression data with biological and technical replicates to estimate gene networks by state space models. Proc. 2nd Asia International Conference on Modelling & Simulation, 940-946, 2008. (AMS 2008, Refereed conference)

9. Jeong E, Nagasaki M, Miyano S. Rule-based reasoning for system dynamics in cell systems. Genome Informatics 20: 25-36, 2008.

10. Kitakaze H, Kanda M, Nakatsuka H, Ikeda N, Matsuno H, Miyano S. Prediction of fragile points for robustness checking of cell systems. IEICE TRANSACTIONS on Information and Systems D J91-D (9): 2404-2417, 2008.

11. Knapp E-W, Benson G, Holzhutter H-G, Kanehisa M, Miyano S (Eds). Genome Informatics 20, 2008.

12. Kojima K, Fujita A, Shimamura T, Imoto S, Miyano S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics 20: 37-51, 2008.

13. Kojima K, Nagasaki M, Miyano S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics 24 (12): 1426-1432, 2008.

14. Mito N, Ikegami Y, Matsuno H, Miyano S, Inouye S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics 21: 212-223, 2008.

15. Nagasaki M, Saito A, Chen L, Jeong E, Miyano S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology 2: 53, 2008.

16. Niida A, Smith AD, Imoto S, Tsutsumi S, Aburatani H, Zhang MQ, Akiyama T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics 9: 404, 2008.

17. Numata K, Yoshida R, Nagasaki M, Saito A, Imoto S, Miyano S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics 9: 494, 2008.

18. Numata K, Imoto S, Miyano S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering, IEEE Computer Society, 357-360, 2008. (BIBM 2008, Refereed conference)

19. Perrier E, Imoto S, Miyano S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research 9: 2251-2286, 2008.

20. Yamaguchi R, Imoto S, Yamauchi M, Nagasaki M, Yoshida R, Shimamura T, Hatanaka Y, Ueno K, Higuchi T, Gotoh N, Miyano S. Predicting differences in gene regulatory systems by state space models. Genome Informatics 21: 101-113, 2008.

21. Yoshida R, Nagasaki M, Yamaguchi R, Imoto S, Miyano S, Higuchi T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics 24 (22): 2592-2601, 2008.


Human Genome Center

Laboratory of Molecular Medicine
Laboratory of Genome Technology

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P=0.0053) and poorer prognosis of non-small cell lung cancer patients (P=0.0045). It was also shown to be an independent prognostic factor (P=0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data imply that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) by a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with non-small cell lung cancer (NSCLC) (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The ECT2 cancer-testis antigen is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report the characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and in all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells: nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein became highest at G(1)/S phases, whereas its phosphorylation level was enhanced during the mitotic phase. Treatment of the breast cancer cell lines T47D and HBC4 with small interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrated the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in the growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of the RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in the growth of RCC and that inhibition of TMEM22/RAB37 expression or of their interaction could be a novel therapeutic strategy for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)


We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex as well as activation of the Rac1-JNK cascade in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of overexpressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed overexpression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Overexpression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and that it could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been indicated to have some roles in apoptosis. These findings indicate that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage, some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it is a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 were reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small interfering RNA. The effective down-regulation of its expression significantly attenuated the viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac (α2,6) Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated that there is a higher frequency of serum apoC-III with loss of α2,6-linked Neu5Ac residues in lung cancer patients compared to controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylation and suggest that it may be applicable to studies of other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
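The evaluation metric used for the two algorithms, the percentage of patients whose predicted dose falls within 20% of the actual stable dose, can be sketched as follows. This is only an illustration and not the consortium's code: the function names and the example doses are hypothetical, and only the 20% tolerance and the 21 mg and 49 mg per week thresholds are taken from the text.

```python
from typing import Sequence


def fraction_within(predicted: Sequence[float],
                    actual: Sequence[float],
                    tolerance: float = 0.20) -> float:
    """Fraction of patients whose predicted weekly dose lies within
    +/- tolerance (relative) of the actual stable therapeutic dose."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for p, a in zip(predicted, actual)
               if abs(p - a) <= tolerance * a)
    return hits / len(actual)


def dose_group(actual_mg_per_week: float) -> str:
    """Assign a patient to the dose strata named in the text."""
    if actual_mg_per_week <= 21:
        return "low"           # requires 21 mg per week or less
    if actual_mg_per_week >= 49:
        return "high"          # requires 49 mg per week or more
    return "intermediate"


# Hypothetical doses in mg per week (not study data).
actual = [14.0, 21.0, 35.0, 52.5, 70.0]
predicted = [20.0, 22.0, 30.0, 45.0, 50.0]

overall = fraction_within(predicted, actual)
groups = [dose_group(a) for a in actual]
print(f"within 20% of actual dose: {overall:.0%}")
print(groups)
```

Computing the same fraction separately within the "low" and "high" strata corresponds to the subgroup comparison reported above.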

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5 and Hitoshi Zembutsu5. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Department of Surgery, Tokushima Breast Care Clinic, 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School, 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgery for breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to depend on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4 and Yusuke Nakamura1,3. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia as compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
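A two-SNP scoring system of this kind can be sketched as below. The text does not state how the score is assigned, so this sketch assumes one point for each of the two SNPs (rs12762549 in ABCC2 and rs11045585 in SLCO1B3) at which a patient carries the risk genotype; which genotypes count as risk genotypes is not given in the text, so each patient is represented only by hypothetical true/false flags. The toy patient list is invented for illustration and is not study data.

```python
from typing import Iterable, Tuple


def risk_score(carries_abcc2_risk: bool, carries_slco1b3_risk: bool) -> int:
    """Score 0, 1 or 2: one point per SNP with a risk genotype (assumed rule)."""
    return int(carries_abcc2_risk) + int(carries_slco1b3_risk)


def classification_rates(
    patients: Iterable[Tuple[bool, bool, bool]],
) -> Tuple[float, float]:
    """Return (fraction of severe cases with score >= 1,
               fraction of non-toxicity cases with score 0).

    Each patient is (ABCC2 risk flag, SLCO1B3 risk flag, had severe toxicity).
    """
    severe_total = severe_flagged = 0
    tolerant_total = tolerant_cleared = 0
    for abcc2, slco1b3, had_severe_toxicity in patients:
        flagged = risk_score(abcc2, slco1b3) >= 1
        if had_severe_toxicity:
            severe_total += 1
            severe_flagged += int(flagged)
        else:
            tolerant_total += 1
            tolerant_cleared += int(not flagged)
    if severe_total == 0 or tolerant_total == 0:
        raise ValueError("need at least one patient in each outcome group")
    return severe_flagged / severe_total, tolerant_cleared / tolerant_total


# Hypothetical patients (not study data).
toy_patients = [
    (True, False, True),
    (False, False, True),
    (True, True, True),
    (False, False, False),
    (False, True, False),
    (False, False, False),
    (False, False, False),
]

sensitivity, specificity = classification_rates(toy_patients)
print(f"severe cases flagged: {sensitivity:.1%}")
print(f"non-toxicity cases cleared: {specificity:.1%}")
```

The two returned fractions play the same role as the 69.2% and 75.7% figures reported for the actual cohort.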

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2 and Yusuke Nakamura1. 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine, Departments of 2Pathology, 3Medicine, Faculty of Medicine, 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand, 5Center for International Cooperation, Department of Medical Sciences, 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health, 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele showed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44, Pc=4.6×10^-6] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86, Pc=2.6×10^-5). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41, P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
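The relation between the reported carrier frequencies and the quoted specificity can be checked with a short calculation. The sketch below uses only the rounded percentages given in the text (17.5% of rash cases and 1.1% of NVP-tolerant patients carrying HLA-B*3505); the variable and function names are illustrative. Because the inputs are rounded, the odds ratio it prints is only an approximation of the published value of 18.96, which was computed from the underlying counts.

```python
def odds(probability: float) -> float:
    """Convert a probability into odds."""
    return probability / (1.0 - probability)


# Carrier frequencies as reported (rounded) in the text.
carrier_freq_rash = 0.175       # HLA-B*3505 carriers among rash cases
carrier_freq_tolerant = 0.011   # HLA-B*3505 carriers among NVP-tolerant patients

# Treating "carries the allele" as a positive test for rash:
sensitivity = carrier_freq_rash             # cases that test positive
specificity = 1.0 - carrier_freq_tolerant   # tolerant patients that test negative

# Odds ratio approximated from the rounded frequencies.
approx_odds_ratio = odds(carrier_freq_rash) / odds(carrier_freq_tolerant)

print(f"sensitivity: {sensitivity:.1%}")
print(f"specificity: {specificity:.1%}")
print(f"approximate odds ratio: {approx_odds_ratio:.1f}")
```

The high specificity combined with the modest sensitivity is why the text describes the allele as a marker for avoiding only a subset of NVP-induced rash.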

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4 and Koichi Matsuda1. 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand, 4Center for Genomic Medicine, RIKEN, 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University, 6Department of Hepatology, Toranomon Hospital, 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9 and the Pirfenidone Clinical Study Group4. 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN), 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University, 5Fourth Department of Internal Medicine, Nippon Medical School, 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center, 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi, 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN), 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a P value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9 and Draga I. Toncheva5. 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria, 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria, 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria, 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK, 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many of these studies, however, have produced controversial and inconclusive results. It has nevertheless been determined that each of the implicated candidates would independently have only a minor effect on susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that had previously been implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P = 0.0010, odds ratio = 1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y, and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu, and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y, and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y, and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y, and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu E T-K, Wang LD, J-LS Law, Tsao W, and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y, and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y, and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y, and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y, and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q, and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y, and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H, and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H, and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D, and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y, and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y, and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α, and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y, and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y, and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility of type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF, and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y, and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T, and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K, and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y, and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y, and Nakagawa H. Identification of C2orf18, Termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y, and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K, Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret the heterogeneity and to map the disease-associated loci, and we collaborate with research groups on data mining in their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are producing many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We therefore decompose the difficulty into parts, i.e., linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and then attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies on complex genetic traits target relatively weak factors, which means that their sample sizes should be in the thousands or more; this in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false-positive results. One method to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
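The conventional genomic control correction mentioned above can be sketched as follows. This is a minimal illustration of the standard procedure for 1-degree-of-freedom chi-square statistics only; it is not the modification for Fisher's exact test described in this report, and the function name and sample values are hypothetical.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (inflation factor lambda, corrected statistics).

    Assumes every input is a 1-df chi-square association statistic and that
    most tested markers are not truly associated with the trait.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # By convention the statistics are only deflated, never inflated.
    lam = max(lam, 1.0)
    return lam, [x / lam for x in chi2_stats]


# Hypothetical statistics from five markers in a structured population.
stats = [0.1, 0.5, 0.91, 2.0, 12.0]
lam, corrected = genomic_control(stats)
print(lam, corrected)
```

Dividing every statistic by a single factor assumes that population structure inflates all markers equally, which is the simplifying assumption of the conventional method.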

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリーマークラスロップ
准教授 医学博士 山田 亮

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
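To make the two tests concrete, the sketch below computes both p-values for one 2×3 genotype table (rows: cases and controls; columns: the three SNP genotypes). It assumes the Pearson chi-square test with 2 degrees of freedom as the asymptotic test and a Freeman-Halton style exact test that sums the probabilities of all tables with the same margins that are no more probable than the observed one. These choices and the function names are illustrative assumptions, not necessarily the definitions used in the study.

```python
from math import comb, exp


def pearson_chi2_p(table):
    """Asymptotic p-value of a 2x3 table (all margins assumed positive)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    chi2 = 0.0
    for i in range(2):
        for j in range(3):
            expected = rows[i] * cols[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 2 df.
    return exp(-chi2 / 2.0)


def exact_p(table):
    """Exact p-value by enumerating all tables with the observed margins."""
    r1 = sum(table[0])
    c = [sum(col) for col in zip(*table)]
    denom = comb(sum(c), r1)

    def prob(a, b):
        # Multivariate hypergeometric probability of first row (a, b, r1-a-b).
        return comb(c[0], a) * comb(c[1], b) * comb(c[2], r1 - a - b) / denom

    p_obs = prob(table[0][0], table[0][1])
    total = 0.0
    for a in range(min(c[0], r1) + 1):
        for b in range(min(c[1], r1 - a) + 1):
            if r1 - a - b > c[2]:
                continue  # third cell would exceed its column total
            p = prob(a, b)
            if p <= p_obs * (1 + 1e-9):  # tolerance for floating-point ties
                total += p
    return total


small = [[2, 0, 0], [0, 1, 1]]  # hypothetical, deliberately tiny table
print(pearson_chi2_p(small), exact_p(small))
```

For a table this small the two p-values differ noticeably, which is the kind of discrepancy the paragraph above refers to; with large, balanced counts they converge.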

c. Geometric evaluation of SNP contingency table tests

We describe 2×3 SNP contingency table tests in the context of geometry, characterize the various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population and LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pair-wise inter-marker dependence is used for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, to support various kinds of research based on genetic heterogeneity.
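For reference, the conventional pair-wise measures that the paragraph above contrasts with the multi-marker approach can be computed as in the sketch below. It shows only the standard D, D' and r² statistics for two diallelic markers; it is not the new multi-marker method, and the function and argument names are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1 and
          allele B at marker 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # The largest |D| attainable given the allele frequencies depends on
    # the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Complete dependence: allele A always travels with allele B.
print(pairwise_ld(0.5, 0.5, 0.5))
# Independence: the haplotype frequency equals the product of allele frequencies.
print(pairwise_ld(0.25, 0.5, 0.5))
```

Each call summarizes one pair of markers, which is why describing the dependence among three or more markers at once calls for a different kind of measure.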

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and The Centre National de Génotypage, Evry, France, for the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web site URL: http://func-gen.hgc.jp

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.


Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by the networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井 謙太
准教授 理学博士 木下 賢吾

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here we focus not only on the presence of TFBSs but also on their orientation and positioning, both with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable to mammals, too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2 and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiment itself prevent the blueprint from giving further information regarding direct or indirect regulation. In this study, we are computationally detecting direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we utilized position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we are predicting the binding sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan, but also the peripheral components that actually make the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated for each position, and its fraction, according to the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
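The pseudocount scheme described above can be illustrated with a minimal Python sketch. It assumes a uniform background composition and a toy count matrix; the function name `pwm_from_counts` and the example counts are hypothetical and are not taken from the study. The pseudocount is split across the four bases in proportion to the background before the log likelihood ratio is taken.

```python
import math

BASES = "ACGT"

def pwm_from_counts(counts, pseudocount=0.8, background=None):
    """Build a log2 likelihood-ratio PWM from per-position base counts.

    counts: list of dicts mapping base -> observed count at that position.
    The pseudocount is distributed over the bases according to the
    background composition (assumed uniform when not given).
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    pwm = []
    for column in counts:
        n_sites = sum(column.get(b, 0) for b in BASES)
        row = {}
        for b in BASES:
            freq = (column.get(b, 0) + pseudocount * background[b]) / (n_sites + pseudocount)
            row[b] = math.log2(freq / background[b])
        pwm.append(row)
    return pwm

def score(pwm, site):
    """Sum the PWM elements along a candidate site of the same length."""
    return sum(column[base] for column, base in zip(pwm, site))

# Toy matrix (hypothetical): position 1 fully conserved, position 2 uninformative.
toy_counts = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 2, "C": 2, "G": 2, "T": 2},
]
toy_pwm = pwm_from_counts(toy_counts)
```

Without the pseudocount, the zero counts in the first column would give a log ratio of minus infinity; with it, every element stays finite, and the uninformative second column still scores zero for every base.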

7. Definition and analysis of alternative promoters using a huge number of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we performed an analysis of alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, genic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second-largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a 9.6% difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
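As a rough illustration of the 1st/2nd promoter definition, the following Python sketch ranks the promoters of each gene by tag count and compares the ranking between two samples. The input layout, the function names, and in particular the way "discrepancy" is computed here (the percentage of shared genes whose 1st/2nd assignment differs) are assumptions made for the example; the text does not give the exact definition used in the analysis.

```python
from collections import defaultdict

def rank_promoters(tag_counts):
    """Return {gene: (first_promoter, second_promoter_or_None)}.

    tag_counts: iterable of (gene, promoter_id, n_tags) tuples.
    Ties are broken by promoter id so that the result is deterministic.
    """
    by_gene = defaultdict(list)
    for gene, promoter, n_tags in tag_counts:
        by_gene[gene].append((n_tags, promoter))
    ranked = {}
    for gene, promoters in by_gene.items():
        promoters.sort(key=lambda p: (-p[0], p[1]))
        first = promoters[0][1]
        second = promoters[1][1] if len(promoters) > 1 else None
        ranked[gene] = (first, second)
    return ranked

def discrepancy_percent(ranked_a, ranked_b):
    """Percentage of shared genes whose (1st, 2nd) promoters differ (assumed definition)."""
    shared = [g for g in ranked_a if g in ranked_b]
    if not shared:
        return 0.0
    differing = sum(1 for g in shared if ranked_a[g] != ranked_b[g])
    return 100.0 * differing / len(shared)

# Hypothetical tag counts for two cell types.
cell_a = [("G1", "p1", 50), ("G1", "p2", 10), ("G2", "p1", 5)]
cell_b = [("G1", "p1", 3), ("G1", "p2", 40), ("G2", "p1", 9)]
ranked_a = rank_promoters(cell_a)
ranked_b = rank_promoters(cell_b)
```

In this toy input, gene G1 swaps its 1st and 2nd promoters between the two cell types while G2 does not, so half of the shared genes are discrepant.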

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analyses of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of the genome fragments indicates that these are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element, and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
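The idea of detecting a dinucleotide periodicity by Fourier analysis can be sketched in a few lines of Python. This is only an illustration on a synthetic sequence: the choice of AA/TT as the marked dinucleotides, the function names, and the single-period power estimate are assumptions for the example, not the procedure used in the study.

```python
import cmath

def dinucleotide_signal(seq, dinucleotides=("AA", "TT")):
    """Return a 0/1 list marking positions where a chosen dinucleotide starts."""
    return [1.0 if seq[i:i + 2] in dinucleotides else 0.0
            for i in range(len(seq) - 1)]

def power_at_period(signal, period):
    """Power of the mean-centred signal at one period (in bp), from a single Fourier term."""
    n = len(signal)
    mean = sum(signal) / n
    total = sum((x - mean) * cmath.exp(-2j * cmath.pi * k / period)
                for k, x in enumerate(signal))
    return abs(total) ** 2 / n

# Synthetic fragment in which AA recurs exactly every 10 bp.
toy_sequence = "AACGCGCGCG" * 30
signal = dinucleotide_signal(toy_sequence)
p10 = power_at_period(signal, 10)
p7 = power_at_period(signal, 7)
```

On this synthetic input the power at a 10-bp period is far larger than at an unrelated period such as 7 bp; scanning `period` over a range (for example up to 200 bp) gives a simple periodogram in which peaks such as the 84-bp and 167-bp ones mentioned above would show up.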

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by the reduction of the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular function instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps, and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, like living cells. The organism-specific pathway maps are then detected as those that can synthesize compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step to realize complex biological functions; therefore, understanding protein-protein interfaces will give us a clue to predict protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indexes enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indexes elucidated different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, which could contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important to obtain high-resolution structural information and to understand the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach, and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activities, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha amylase and prolyl endopeptidase have a spatial arrangement of atoms similar to that of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activities, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme with ribonuclease activity that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole with an appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or in tissue networks, we used Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared those trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the whole-membrane undulation, in addition to decreasing local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics, 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res., 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet., 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics, 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res., 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel., 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics, 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod., 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res., 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics, 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res., 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res., 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res., 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol., 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res., 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene, 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci., 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem., 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res., 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue specific expression prediction. Genome Informatics, Edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res., 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res., 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉, 木下賢吾, 伊藤暢聡. ペプチジルプロリルイソメラーゼの構造機能相関. 蛋白質核酸酵素, 54: 167―172, 2009.

木下賢吾. 立体構造からのタンパク質機能予測: 現状と展望. 遺伝子医学MOOK, 14号, in press.

中井謙太, ポール・ホートン. 第3章 3. アミノ酸配列に基づくタンパク質の細胞内局在予測. 実験医学増刊, vol. 26: 1106―1112, 2008.

中井謙太. タンパク質のシステム生物学. 猪飼, 伏見, 卜部, 上野川, 中村, 浜窪編, タンパク質の事典, 朝倉書店, 575―578, 2008.


The Department of Public Policy works toward three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study on research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We've interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory difference between South Korea and Japan. After the fabrication of Hwang Woo-suk's stem cell cloning and the unethical human egg collection, the bioethics law has been revised and the government seeks stricter regulation of life science and healthcare. We've found some correlations in political options on stem cell research and genetic testing in terms of regulations within East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, the IMSUT has established the Office of Research Ethics (ORE) for supporting research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey on past ethical reviews and a comparative study on research ethics review systems in the US, the UK and South Korea, we identified the current problems that tend to stall a smooth research review process, so as to secure quality assurance of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

3. Ethical, legal and social support for "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we've conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs for obtaining free and fully informed consent from participants. We conducted our questionnaire survey of participants of the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective in maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We've found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project doesn't have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain to, encourage and thank participants each time, and participants recall their will to contribute.

To appreciate participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to sighted people with good eyesight, we are making efforts toward personalized information security that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim to share public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who don't have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in the IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics, 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9. 武藤香織. 「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―. 日米の医療―制度と倫理, 杉田米行編, 大阪大学出版会, 203―224, 2008.

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


Recent advances in biomedical research have been producing large-scale, ultra-high dimensional, ultra-heterogeneous data. Owing to this post-genomic research progress, our current mission is to create computational strategies for systems biology and medicine towards translational bioinformatics. With this mission, we have been developing computational methods for understanding life as a system and applying them to practical issues in medicine and biology.

1. Computational Systems Biology

a. Systematic reconstruction of TRANSPATH data into Cell System Markup Language

Masao Nagasaki, Ayumu Saito, Chen Li, Euna Jeong, Satoru Miyano

Many biological repositories store information based on experimental studies of the biological processes within a cell, such as protein-protein interactions, metabolic pathways, signal transduction pathways, or regulation by transcription factors and miRNA. Unfortunately, it is difficult to use such information directly when generating simulation-based models. Modeling rules for encoding biological knowledge into system-dynamics-oriented standardized formats would therefore be very useful for a full understanding of cellular dynamics at the system level. We selected the TRANSPATH database, a manually curated, high-quality pathway database that provides a rich source of cellular events in humans, mice, and rats, curated from over 31,500 papers. In this work, we defined 16 modeling rules based on hybrid functional Petri net with extension (HFPNe), which is suitable for graphical representation and simulation of biological processes. In these modeling rules, each Petri net element is incorporated with Cell System Ontology (CSO) to enable semantic interoperability of models. As a formal ontology for biological pathway modeling with dynamics, CSO also defines biological terminology and corresponding icons. By combining HFPNe with the CSO features, we devised a method for transforming TRANSPATH data into simulation-based, semantically valid models. The results are encoded in a biological pathway format, Cell System Markup Language (CSML), which eases the exchange and integration of biological data and models. By using the 16 modeling rules, 97% of the reactions in TRANSPATH are converted into simulation-based models represented in CSML. This reconstruction demonstrated that it is possible to use our rules to generate quantitative models from static pathway descriptions.
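To make the Petri net vocabulary above concrete, the following minimal Python sketch shows how a single reaction can be written as a transition that consumes and produces tokens. It is a plain discrete Petri net, not HFPNe, and it does not use CSO or CSML; the place names, the `binding` transition, and the dictionary layout are illustrative assumptions rather than anything taken from the modeling rules themselves.

```python
# Minimal discrete Petri net: places hold integer token counts, and a
# transition consumes tokens from its input places and adds tokens to its
# output places. This is NOT HFPNe; all names here are hypothetical.

def enabled(marking, transition):
    """True if every input place holds at least the required tokens."""
    return all(marking.get(place, 0) >= weight
               for place, weight in transition["inputs"].items())

def fire(marking, transition):
    """Return the marking obtained by firing an enabled transition once."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)  # leave the caller's marking untouched
    for place, weight in transition["inputs"].items():
        new_marking[place] = new_marking.get(place, 0) - weight
    for place, weight in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + weight
    return new_marking

# Hypothetical binding reaction: ligand + receptor -> complex
binding = {"inputs": {"ligand": 1, "receptor": 1},
           "outputs": {"complex": 1}}
initial = {"ligand": 2, "receptor": 1, "complex": 0}
after = fire(initial, binding)
print(after)
```

After one firing the receptor place is empty, so the same transition is no longer enabled; a simulation engine would repeat this check-and-fire step over all transitions.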

b. Finding optimal Bayesian network given a super-structure

Human Genome Center

Laboratory of DNA Information Analysis
DNA情報解析分野

Professor Satoru Miyano, Ph.D.
Associate Professor Seiya Imoto, Ph.D.
Assistant Professor Masao Nagasaki, Ph.D.
Project Lecturer Rui Yamaguchi, Ph.D.
Project Assistant Professor Yoshinori Tamada, Ph.D.

教授 理学博士 宮野 悟
准教授 博士(数理学) 井元 清哉
助教 博士(理学) 長﨑 正朗
特任講師 博士(理学) 山口 類
特任助教 博士(情報学) 玉田 嘉紀


Eric Perrier, Seiya Imoto, Satoru Miyano

Conventional approaches for learning Bayesian network structure from data have disadvantages in terms of complexity and lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm markedly improves accuracy and speed: it learns a skeleton with an independency test (IT) approach and constrains the directed acyclic graphs considered during the search-and-score phase. Subsequently, we formalized the structural constraint by introducing the concept of a super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We developed a super-structure constrained optimal search (COS): its time complexity is upper bounded by $O(\gamma_m^n)$, where $\gamma_m < 2$ depends on the maximal degree $m$ of S. Empirically, complexity depends on the average degree $m'$, and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. In practice, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, which makes it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches.
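The effect of the super-structure constraint can be illustrated with a small Python sketch. It is not the COS algorithm: it only shows how restricting each node's parents to its neighbours in S shrinks the set of candidate parent sets, and how to check that a directed graph respects S. The graph `S`, the variable names, and both helper functions are hypothetical.

```python
from itertools import combinations

def candidate_parent_sets(node, superstructure):
    """Yield every parent set allowed for `node`: subsets of its neighbours in S."""
    neighbours = sorted(superstructure[node])
    for size in range(len(neighbours) + 1):
        for subset in combinations(neighbours, size):
            yield frozenset(subset)

def skeleton_within(dag_edges, superstructure):
    """True if every directed edge (parent, child) is also an edge of S."""
    return all(child in superstructure[parent] for parent, child in dag_edges)

# Hypothetical super-structure S on four variables, stored as adjacency sets.
S = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}

allowed_for_B = list(candidate_parent_sets("B", S))
print(len(allowed_for_B))
print(skeleton_within([("A", "B"), ("B", "C")], S))
print(skeleton_within([("A", "C")], S))
```

Here node B has two neighbours in S, so only 4 parent sets are scored instead of the 8 subsets of the other three variables; this gap widens quickly as the number of variables grows while the degree in S stays small.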

c. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models

Osamu Hirose, Ryo Yoshida¹, Seiya Imoto, Rui Yamaguchi, Tomoyuki Higuchi¹, D. Stephen Charnock-Jones², Cristin Print³, Satoru Miyano: ¹Institute of Statistical Mathematics, ²Cambridge University, ³University of Auckland

We developed a novel method based on the state space model to identify transcriptional modules and module-based gene networks simultaneously. The state space model has the potential to infer large-scale gene networks, e.g., of order $10^3$, from time-course gene expression profiles. In particular, we succeeded in identifying a cell cycle system by using gene expression profiles of Saccharomyces cerevisiae in which the length of the time course and the number of genes were 24 and 4,382, respectively. However, when analyzing shorter time-course data, e.g., of length 10 or less, the parameter estimation of the state space model often fails due to overfitting. To extend the applicability of the state space model, we provided an approach that uses the technical replicates of gene expression profiles, which are often measured in duplicate or triplicate. The use of technical replicates is important for achieving highly efficient inference of gene networks with short time-course data. The potential of the proposed method was demonstrated through the time-course analysis of the gene expression profiles of human umbilical vein endothelial cells undergoing growth factor deprivation-induced apoptosis.
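For readers unfamiliar with state space models, the sketch below runs a Kalman filter on the simplest possible case: one hidden state and one observed series, with all parameters given. It is an illustration under strong assumptions, not the method described above, which estimates the parameters and works with many genes and replicates. The function name and the parameters `f`, `h`, `q`, `r`, `x0`, and `p0` are chosen here for illustration.

```python
def kalman_filter_1d(observations, f, h, q, r, x0, p0):
    """Filter a scalar linear Gaussian state space model.

    Assumed model (illustrative): x[t] = f * x[t-1] + system noise (variance q)
                                  y[t] = h * x[t]   + observation noise (variance r)
    Returns the filtered state mean after each observation.
    """
    x, p = x0, p0
    filtered = []
    for y in observations:
        # Prediction step: propagate the state mean and variance.
        x_pred = f * x
        p_pred = f * p * f + q
        # Update step: weigh the prediction against the new observation.
        gain = p_pred * h / (h * p_pred * h + r)
        x = x_pred + gain * (y - h * x_pred)
        p = (1.0 - gain * h) * p_pred
        filtered.append(x)
    return filtered

# With no observation noise (r = 0) the filter trusts each observation fully.
noise_free = kalman_filter_1d([1.0, 2.0], f=1.0, h=1.0, q=1.0, r=0.0, x0=0.0, p0=1.0)
print(noise_free)

# With observation noise the estimate lies between prediction and observation.
noisy = kalman_filter_1d([1.0], f=1.0, h=1.0, q=1.0, r=2.0, x0=0.0, p0=1.0)
print(noisy)
```

In this simplified setting, one way to picture the benefit of technical replicates is that averaging them lowers the effective observation variance `r`, so the update step leans more on the data.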

d. Predicting differences in gene regulatory systems by state space models

Rui Yamaguchi, Seiya Imoto, Mai Yamauchi, Masao Nagasaki, Ryo Yoshida¹, Teppei Shimamura, Yosuke Hatanaka, Kazuko Ueno, Tomoyuki Higuchi¹, Noriko Gotoh, Satoru Miyano

We developed a statistical method to predict differentially regulated genes between case and control samples from time-course gene expression data by exploiting the unpredictability of expression patterns under the regulatory system inferred by a state space model. The proposed method can screen out genes that show different patterns but are generated by the same regulation in both samples, since these patterns can be predicted by the same model. Our strategy consists of three steps. First, a gene regulatory system is inferred from the control data by a state space model. Then, the obtained model for the underlying regulatory system of the control sample is used to predict the case data. Finally, by assessing the significance of the difference between the case and predicted-case time-course data of each gene, we are able to detect the unpredictable genes, which are candidates for the key differences between the regulatory systems of case and control cells. We illustrate the whole process of the strategy with an actual example in which human small airway epithelial cell gene regulatory systems were generated from novel time courses of gene expression following treatment with (case) or without (control) the drug gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase. In the gefitinib response data, we succeeded in finding unpredictable genes that are candidates for specific targets of gefitinib. We also discussed differences in regulatory systems for the unpredictable genes. The proposed method would be a promising tool for identifying biomarkers and drug target genes.
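The three steps above can be sketched in a few lines of Python. The sketch replaces the state space model with a per-gene first-order autoregression and replaces the significance test with a raw mean squared error, so it only conveys the logic of "fit on control, predict case, flag what is unpredictable". The function names and the toy series are assumptions made for this illustration.

```python
def fit_ar1(series):
    """Step 1 (simplified): least-squares coefficient of x[t] ~ coef * x[t-1]."""
    numerator = sum(prev * curr for prev, curr in zip(series[:-1], series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator

def prediction_error(series, coef):
    """Steps 2 and 3 (simplified): mean squared one-step prediction error."""
    predictions = [coef * prev for prev in series[:-1]]
    errors = [(curr - pred) ** 2 for curr, pred in zip(series[1:], predictions)]
    return sum(errors) / len(errors)

control = [1.0, 2.0, 4.0, 8.0]           # hypothetical control time course
coef = fit_ar1(control)

same_regulation = [3.0, 6.0, 12.0, 24.0]  # different levels, same dynamics
changed_regulation = [3.0, 3.0, 3.0, 3.0]  # dynamics differ from control

print(prediction_error(same_regulation, coef))
print(prediction_error(changed_regulation, coef))
```

The first case series differs from the control in its values but is predicted perfectly by the control model, so it would be screened out; the second is poorly predicted and would be kept as a candidate.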

e. Bayesian learning of biological pathways on genomic data assimilation


Ryo Yoshida¹, Masao Nagasaki, Rui Yamaguchi, Seiya Imoto, Satoru Miyano, Tomoyuki Higuchi¹

Mathematical modeling and simulation based on biochemical rate equations provide us with a rigorous tool for unraveling complex mechanisms of biological pathways. To proceed to simulation experiments, an essential first step is to find effective values of model parameters, which are difficult to measure from in vivo and in vitro experiments. Furthermore, once a set of hypothetical models has been created, a statistical criterion is needed to test the ability of the constructed models and to proceed to model revision. We developed a new statistical technology towards data-driven construction of in silico biological pathways. The method starts with knowledge-based modeling with hybrid functional Petri net. It then proceeds to the Bayesian learning of model parameters for which experimental data are available. This process exploits quantitative measurements of evolving biochemical reactions, e.g., gene expression data. Another important issue that we consider is the statistical evaluation and comparison of the constructed hypothetical pathways. For this purpose, we have developed a new Bayesian information-theoretic measure that assesses the predictability and the biological robustness of in silico pathways.

f. Modeling nonlinear gene regulatory networks from time series gene expression data

André Fujita, João Ricardo Sato⁵, Humberto Miguel Garay-Malpartida⁵, Mari Cleide Sogayar⁵, Carlos Eduardo Ferreira⁵, Satoru Miyano: ⁵University of São Paulo

In cells, molecular networks such as gene regulatory networks are the basis of biological complexity. Therefore, gene regulatory networks have become the core of research in systems biology. Understanding the processes underlying the many extracellular regulators, signal transduction, protein-protein interactions, and differential gene expression requires a detailed molecular description of the protein and gene networks involved. To better understand these complex molecular networks and to infer new regulatory associations, we developed a statistical method based on vector autoregressive models and Granger causality to estimate nonlinear gene regulatory networks from time series microarray data. Most of the models available in the literature assume linearity in the inference of gene connections; moreover, these models do not infer directionality in these connections. Thus, a priori biological knowledge is required. However, in pathological cases, no a priori biological information is available. To overcome these problems, we present the nonlinear vector autoregressive (NVAR) model. We have applied the NVAR model to estimate nonlinear gene regulatory networks based entirely on gene expression profiles obtained from DNA microarray experiments. We showed the results obtained by NVAR through several simulations and by the construction of three actual gene regulatory networks (p53, NF-κB, and c-Myc) for HeLa cells.
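The idea behind Granger causality can be shown with a small linear example in Python: gene x is said to Granger-cause gene y if adding the past of x to a model of y reduces the prediction error compared with using the past of y alone. This sketch is linear and uses one lag only, so it does not implement NVAR, and it compares residual sums of squares without a formal test. The function names and the toy series are assumptions.

```python
def rss_restricted(y_lag, target):
    """Residual sum of squares of target ~ a * y_lag (own past only)."""
    a = sum(u * t for u, t in zip(y_lag, target)) / sum(u * u for u in y_lag)
    return sum((t - a * u) ** 2 for u, t in zip(y_lag, target))

def rss_full(y_lag, x_lag, target):
    """Residual sum of squares of target ~ a * y_lag + b * x_lag."""
    s11 = sum(u * u for u in y_lag)
    s22 = sum(v * v for v in x_lag)
    s12 = sum(u * v for u, v in zip(y_lag, x_lag))
    t1 = sum(u * t for u, t in zip(y_lag, target))
    t2 = sum(v * t for v, t in zip(x_lag, target))
    det = s11 * s22 - s12 * s12  # assumed non-zero (regressors not collinear)
    a = (t1 * s22 - t2 * s12) / det  # solution of the 2x2 normal equations
    b = (t2 * s11 - t1 * s12) / det
    return sum((t - a * u - b * v) ** 2
               for u, v, t in zip(y_lag, x_lag, target))

# Toy series in which y at time t is exactly x at time t-1.
x = [1.0, 0.0, 2.0, 1.0, 3.0, 0.0]
y = [0.0, 1.0, 0.0, 2.0, 1.0, 3.0]

restricted = rss_restricted(y[:-1], y[1:])
full = rss_full(y[:-1], x[:-1], y[1:])
print(restricted, full)
```

The error of the full model drops to zero while the restricted model leaves a large residual, which is the pattern interpreted as a directed edge from x to y; swapping the roles of x and y tests the opposite direction.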

g. Fast grid layout algorithm for biological networks with sweep calculation

Kaname Kojima, Masao Nagasaki, Satoru Miyano

Properly drawn biological networks are of great help in the comprehension of their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties, such as aligning vertices that have the same attributes and satisfying complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for applications to pathway databases, due to their high computational cost. In addition, they do not consider edge directions, and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features in biological networks. We devised a new calculation method termed sweep calculation and reduced the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conducted practical experiments by using 95 pathway models of various sizes from TRANSPATH and showed that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduced a new component that penalizes undesirable edge directions to avoid the lack of traceability in pathways due to the differences in direction between in-edges and out-edges of each vertex.


h. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data

Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

Recently, the nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters due to the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation, and the size of the gene set must be strongly limited. To address this limitation, we employed an L1 regularization technique to estimate NVAR. Under L1 regularization, direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than with existing methods. Through a simulation study, we verified the effectiveness of the proposed method by comparing its limitation in the number of genes to that of the existing NVAR. The proposed method was also applied to time series microarray data of the human HeLa cell cycle.
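How L1 regularization selects a few parents can be seen in a small coordinate descent sketch for the lasso. It is a generic illustration, assuming the objective $\frac{1}{2}\lVert y - X\beta\rVert^2 + \lambda\lVert\beta\rVert_1$, and is not the estimation procedure used for NVAR; the function names, the toy design matrix, and the value of `lam` are all hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise 0.5 * ||y - X b||^2 + lam * ||b||_1 by cycling over coordinates.

    Assumes every column of X has at least one non-zero entry.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that ignores column j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            )
            norm_sq = sum(X[i][j] ** 2 for i in range(n))
            beta[j] = soft_threshold(rho, lam) / norm_sq
    return beta

# Toy data: the first regressor has a strong effect, the second a negligible one.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta = lasso_coordinate_descent(X, y, lam=1.0)
print(beta)
```

The weak regressor receives a coefficient of exactly zero rather than a small value, which is what allows L1-penalized models to act as a parent selection step when parameters outnumber samples.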

i. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates

André Fujita, Luciana Rodrigues Gomes⁵, João Ricardo Sato⁶, Rui Yamaguchi, Carlos Eduardo Thomaz⁷, Mari Cleide Sogayar⁵, Satoru Miyano: ⁶Universidade Federal do ABC, ⁷Centro Universitário da FEI

Principal Component Analysis (PCA) combined with Maximum-entropy Linear Discriminant Analysis (MLDA) was applied in order to identify genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely (i) differences in gene expression levels between normal and tumoral conditions from a univariate point of view, (ii) in a multivariate fashion using MLDA, and (iii) with a dependence network approach. Our results show that malignant transformation in the prostatic tissue is more related to functional connectivity changes in the dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1, and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions and were also classified as the top seven most informative genes for the prostate cancer genesis process by our discriminant analysis. Moreover, among the identified genes we found classically known biomarkers and genes that are closely related to tumoral prostate, such as KLK3 and KLK2, and several other potential ones. We have demonstrated that changes in functional connectivity may be implicit in the biological process, which renders some genes more informative for discriminating between normal and tumoral conditions. Using the proposed method, namely MLDA, to analyze the multivariate characteristics of genes, it was possible to capture the changes in dependence networks that are related to cell transformation.

j. Rule-based reasoning for system dynamics in cell systems

Euna Jeong, Masao Nagasaki, Satoru Miyano

A system-dynamics-centered ontology called the Cell System Ontology (CSO) has been developed for representation of diverse biological pathways. Many of the pathway data based on the ontology have been created from databases via data conversion or curated by expert biologists. It is essential to validate the pathway data, which may contain unexpected issues such as semantic inconsistency and incompleteness. This paper discusses three criteria for validating the pathway data based on CSO: (1) structurally correct models in terms of Petri nets, (2) biologically correct models to capture biological meaning, and (3) systematically correct models to reflect biological behaviors. Simultaneously, we have investigated how logic-based rules can be used for the ontology to extend its expressiveness and to complement the ontology by reasoning, which aims at qualifying pathway knowledge. Finally, we show how the proposed approach helps in exploring dynamic modeling and simulation tasks without prior knowledge.

k. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human

Yosuke Hatanaka, Masao Nagasaki, Rui Yamaguchi, Takeshi Obayashi, Kazuyuki Numata, André Fujita, Teppei Shimamura, Yoshinori Tamada, Seiya Imoto, Kengo Kinoshita, Kenta Nakai, Satoru Miyano

We reported various transcription factor binding sites (TFBSs) conserved among co-expressed genes in human promoter regions using expression and genomic data. Assuming that similar promoter structure induces similar transcriptional regulation and hence a similar expression profile, we compared the promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start site (TSS) given by DBTSS, and promoter similarity searches were conducted among coexpressing genes using data provided by the newly developed COXPRESdb. By combining Position Weight Matrix (PWM) motif prediction with a bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket method analysis to seek combinatorial activities of those conserved TFBSs.
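As background on the PWM step, the following Python sketch scores every window of a sequence against a log-odds position weight matrix and reports the best hit. It covers only the scanning idea; the bootstrap assessment of significance and the promoter comparison are not shown. The uniform background, the 0.7/0.1 column probabilities, and the toy TATA motif are assumptions made for this example.

```python
import math

BACKGROUND = 0.25  # assumed uniform background frequency for each base

def log_odds_pwm(probabilities):
    """Convert per-position base probabilities into log2-odds scores."""
    return [{base: math.log2(p / BACKGROUND) for base, p in column.items()}
            for column in probabilities]

def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence.

    Raises ValueError if the sequence is shorter than the motif.
    """
    width = len(pwm)
    scored = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        scored.append((score, start))
    return max(scored)

def column(consensus):
    """Hypothetical column: 0.7 for the consensus base, 0.1 for each other base."""
    return {base: (0.7 if base == consensus else 0.1) for base in "ACGT"}

tata = log_odds_pwm([column(base) for base in "TATA"])
score, position = best_hit(tata, "GGTATAGG")
print(score, position)
```

In a real analysis the matrix would come from a curated motif collection and a score threshold or resampling procedure would decide which hits count as predicted sites.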

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou⁸, Yuto Ikegami⁸, Hiroshi Matsuno⁸, Satoru Miyano, Shin-ichi T. Inouye⁸: ⁸Yamaguchi University

Circadian rhythms of living organisms are 24-hr oscillations found in behavior, biochemistry, and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of a computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observation that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando⁹, Sadanori Konishi¹⁰, Seiya Imoto: ⁹Keio University, ¹⁰Kyushu University

The problem of constructing nonlinear regression models is investigated to analyze data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlap between basis functions and adopts the information of the input and response variables. By using the radial basis functions, we constructed nonlinear regression models with the help of the technique of regularization. Crucial issues in the model building process are the choices of the hyperparameter, the number of basis functions, and a smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification, both for distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained from the use of the hyperparameter in the basis functions.

b. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome

Jin Hwan Do, Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for the characterization of fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most of the prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, the homology-based approach has an intrinsic limitation for unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profiles of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile; that is, the DNA regions including secondary metabolic signature genes such as polyketide synthase, nonribosomal peptide synthase, and/or dimethylallyl tryptophan synthase consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by a transcription factor, LaeA. Our result could be used for identification of other fungal secondary metabolite gene clusters, especially for secondary metabolite gene clusters that are severely regulated by LaeA or other proteins with a function similar to that of LaeA.
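A window-averaged profile of the kind described above can be computed with a simple sliding window. The Python sketch below does this for GC content and then lists the windows whose average falls below a threshold; computing DNA curvature itself requires a structural model that is not reproduced here, so per-base curvature values would have to be supplied from elsewhere. The function names, the toy sequence, the window size, and the threshold are illustrative assumptions.

```python
def window_average(values, window):
    """Average of each consecutive run of `window` values (step of one)."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def gc_profile(sequence, window):
    """Window-averaged GC fraction along a DNA sequence."""
    flags = [1 if base in "GC" else 0 for base in sequence.upper()]
    return window_average(flags, window)

def windows_below(profile, threshold):
    """Start positions of windows whose average is below the threshold."""
    return [start for start, value in enumerate(profile) if value < threshold]

profile = gc_profile("GGCCAATT", window=4)
low = windows_below(profile, threshold=0.5)
print(profile)
print(low)
```

The same `window_average` and `windows_below` helpers could be applied to a list of per-base curvature values, after which consecutive low windows would be merged and filtered by length.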

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida¹, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step toward finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than we can handle and analyze on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data to detect transcripts that have significantly different splicing patterns in two cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis for more than 300,000 transcripts in exon array data, which has the potential to reveal aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require installation of any software other than an internet browser. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full dataset of exon array data within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.

Publications

1. Ando, T., Konishi, S., Imoto, S. Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference. 138 (11): 3616-3633, 2008.

2. Brazma, A., Miyano, S., Akutsu, T. Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008). Imperial College Press, 2008.

3. Do, J.H., Miyano, S. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome. Applied Microbiology and Biotechnology. 80 (5): 841-847, 2008.

4. Fujita, A., Gomes, L.R., Sato, J.R., Yamaguchi, R., Thomaz, C.E., Sogayar, M.C., Miyano, S. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Systems Biology. 2: 106, 2008.

5. Fujita, A., Sato, J.R., Garay-Malpartida, H.M., Sogayar, M.C., Ferreira, C.E., Miyano, S. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinformatics and Computational Biology. 6 (5): 961-979, 2008.

6. Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Fujita, A., Shimamura, T., Tamada, Y., Imoto, S., Kinoshita, K., Nakai, K., Miyano, S. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human. Genome Informatics. 20: 212-221, 2008.

7. Hirose, O., Yoshida, R., Imoto, S., Yamaguchi, R., Higuchi, T., Charnock-Jones, D.S., Print, C., Miyano, S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics. 24 (7): 932-942, 2008.

8. Hirose, O., Yoshida, R., Yamaguchi, R., Imoto, S., Higuchi, T., Miyano, S. Analyzing time course gene expression data with biological and technical replicates to estimate gene networks by state space models. Proc. 2nd Asia International Conference on Modelling & Simulation. 940-946, 2008 (AMS2008, Refereed conference).

9. Jeong, E., Nagasaki, M., Miyano, S. Rule-based reasoning for system dynamics in cell systems. Genome Informatics. 20: 25-36, 2008.

10. Kitakaze, H., Kanda, M., Nakatsuka, H., Ikeda, N., Matsuno, H., Miyano, S. Prediction of fragile points for robustness checking of cell systems. IEICE TRANSACTIONS on Information and Systems D. J91-D (9): 2404-2417, 2008.

11. Knapp, E.-W., Benson, G., Holzhutter, H.-G., Kanehisa, M., Miyano, S. (Eds.) Genome Informatics 20. 2008.

12. Kojima, K., Fujita, A., Shimamura, T., Imoto, S., Miyano, S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics. 20: 37-51, 2008.

13. Kojima, K., Nagasaki, M., Miyano, S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics. 24 (12): 1426-1432, 2008.

14. Mito, N., Ikegami, Y., Matsuno, H., Miyano, S., Inouye, S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics. 21: 212-223, 2008.

15. Nagasaki, M., Saito, A., Chen, L., Jeong, E., Miyano, S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology. 2: 53, 2008.

16. Niida, A., Smith, A.D., Imoto, S., Tsutsumi, S., Aburatani, H., Zhang, M.Q., Akiyama, T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics. 9: 404, 2008.

17. Numata, K., Yoshida, R., Nagasaki, M., Saito, A., Imoto, S., Miyano, S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics. 9: 494, 2008.

18. Numata, K., Imoto, S., Miyano, S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering. IEEE Computer Society, 357-360, 2008 (BIBM 2008, Refereed conference).

19. Perrier, E., Imoto, S., Miyano, S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research. 9: 2251-2286, 2008.

20. Yamaguchi, R., Imoto, S., Yamauchi, M., Nagasaki, M., Yoshida, R., Shimamura, T., Hatanaka, Y., Ueno, K., Higuchi, T., Gotoh, N., Miyano, S. Predicting differences in gene regulatory systems by state space models. Genome Informatics. 21: 101-113, 2008.

21. Yoshida, R., Nagasaki, M., Yamaguchi, R., Imoto, S., Miyano, S., Higuchi, T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics. 24 (22): 2592-2601, 2008.


The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification, P = 0.0053) and poorer prognosis of non-small cell lung cancer patients (P = 0.0045). It was also shown to be an independent prognostic factor (P = 0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data implied that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines as well as a prognostic biomarker in the clinic.

Human Genome Center

Laboratory of Molecular Medicine
Laboratory of Genome Technology
ゲノムシークエンス解析分野
シークエンス技術開発分野

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

教授 医学博士 中村 祐輔
准教授 医学博士 片桐 豊雅
准教授 医学博士 醍醐 弥太郎
助教 理学博士 浜本 隆二
助教 医学博士 松田 浩一
助教 医学博士 前佛 均

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) by a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with NSCLC (P = 0.0004) as well as ESCC (P = 0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P = 0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The ECT2 cancer-testis antigen is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and in all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells; nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein became highest at G(1)/S phases, whereas its phosphorylation level was enhanced during mitotic phase. Treatment of the breast cancer cells T47D and HBC4 with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrate the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in the growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in the growth of RCC and that inhibition of TMEM22/RAB37 expression or of their interaction could be a novel therapeutic approach for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)

We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex, as well as activation of the Rac1-JNK cascade, in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused destruction of the actin cytoskeleton structure, probably through downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation activates the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and the development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, whereas no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining the viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for the development of new therapeutic strategies for PDAC.

C2orf18 (ANTBP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining the viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been implicated in apoptosis. These findings suggest that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show a very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate or in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it is a potential molecular target for the development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

To clarify the molecular mechanisms involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 were reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were very commonly overexpressed in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression significantly attenuated the viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in the sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac(α2,6)Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated a higher frequency of serum apoC-III with loss of α2,6-linkage Neu5Ac residues in lung cancer patients than in controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylations and imply that it may be applicable to studies of other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with an SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by the expression patterns of this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week, and of those who required 49 mg or more per week, to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week, or 49 mg or more per week, for therapeutic anticoagulation.
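The evaluation metric described above, the share of patients whose predicted dose falls within 20% of the actual stable dose, can be illustrated with a minimal sketch. The function name and all dose values below are hypothetical and are not taken from the consortium data; the sketch only shows how such a percentage might be computed for two competing algorithms.

```python
def fraction_within_tolerance(predicted, actual, tolerance=0.20):
    """Fraction of patients whose predicted dose is within +/- tolerance
    (relative to the actual stable dose) of the actual dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)


# Hypothetical weekly doses in mg (illustrative values only).
actual_doses = [21.0, 35.0, 49.0, 28.0, 14.0]
clinical_pred = [30.0, 33.0, 38.0, 35.0, 25.0]          # clinical variables only
pharmacogenetic_pred = [24.0, 36.0, 44.0, 30.0, 20.0]   # clinical + genetic variables

clinical_rate = fraction_within_tolerance(clinical_pred, actual_doses)
pharmacogenetic_rate = fraction_within_tolerance(pharmacogenetic_pred, actual_doses)
print(f"clinical: {clinical_rate:.0%}, pharmacogenetic: {pharmacogenetic_rate:.0%}")
```

Restricting the same calculation to the low-dose (≤21 mg per week) or high-dose (≥49 mg per week) patients would give subgroup comparisons of the kind quoted in the text.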

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5, and Hitoshi Zembutsu5: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Department of Surgery, Tokushima Breast Care Clinic; 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for the clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgery for breast cancer. Among the 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to depend on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared with patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4, and Yusuke Nakamura1,3: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients: 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia than those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
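As a worked illustration of how an odds ratio and its 95% confidence interval are obtained from a case-control 2x2 table, the following sketch uses the standard log-odds (Woolf) approximation. The report does not state which interval method the authors used, so the method is an assumption here, and the counts are hypothetical rather than the study's data.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    'Exposed' here would mean, for example, a genotype score of 1 or 2.
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(sum(1.0 / n for n in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 of 30 cases and 15 of 60 controls carry a risk score >= 1.
or_value, ci_low, ci_high = odds_ratio_with_ci(20, 10, 15, 45)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

An interval whose lower bound stays above 1, as in the result reported in the text, indicates that the elevated risk is unlikely to be explained by sampling variation alone.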

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2, and Yusuke Nakamura1: 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of 2Pathology and 3Medicine, Faculty of Medicine, and 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; 5Center for International Cooperation, Department of Medical Sciences, and 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated the possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified the HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele showed a significant association with NVP-induced skin rash in both the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10). The logistic regression analysis also indicated that HLA-B*3505 was significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, HLA-B*3505 could be used as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
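The specificity quoted above follows directly from the carrier frequency among tolerant patients: if 1.1% of NVP-tolerant patients carry the allele, 98.9% do not. The sketch below shows how sensitivity and specificity of a genetic marker might be computed from carrier counts; the function name and the counts are hypothetical and are not the study's raw data.

```python
def marker_performance(case_carriers, n_cases, control_carriers, n_controls):
    """Sensitivity and specificity of a marker for predicting an adverse reaction.

    Sensitivity: fraction of cases (patients with the reaction) who carry the marker.
    Specificity: fraction of controls (tolerant patients) who do not carry it.
    """
    if n_cases <= 0 or n_controls <= 0:
        raise ValueError("group sizes must be positive")
    sensitivity = case_carriers / n_cases
    specificity = (n_controls - control_carriers) / n_controls
    return sensitivity, specificity


# Hypothetical counts: 20 carriers among 100 cases, 2 carriers among 200 controls.
sensitivity, specificity = marker_performance(20, 100, 2, 200)
print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
```

A marker with high specificity but modest sensitivity, as here, is suited to flagging a subset of at-risk patients rather than ruling out risk in non-carriers.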

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4, and Koichi Matsuda1: 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; 4Center for Genomic Medicine, RIKEN; 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; 6Department of Hepatology, Toranomon Hospital; 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9, and the Pirfenidone Clinical Study Group4: 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; 5Fourth Department of Internal Medicine, Nippon Medical School; 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

To identify gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP (rs2736100) in intron 2 of the TERT gene, which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a p value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9, and Draga I. Toncheva5: 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many of these studies have produced controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication association study using 255 Bulgarian patients with schizophrenia or schizoaffective disorder and 556 healthy Bulgarian controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that were previously implicated in disease susceptibility and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010, odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y, and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu, and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y, and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y, and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y, and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu E T-K, Wang LD, J-LS Law, Tsao W, and Lung ML. Frequent decreased expression of candidate tumor suppressor gene DEC1 and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y, and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y, and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23 Study Group of Millennium Genome Projectfor Cancer Sakamoto H Yoshimura KSaeki N Katai H Shimoda T MatsunoY Saito D Sugimura H Tanioka FKato S Matsukura N Matsuda N Naka-mura T Hyodo I Nishina T Yasui WHirose H Hayashi M Toshiro EOhnami S Sekine A Sato Y Totsuka HAndo M Takemura R Takahashi Y Oh-daira M Aoki K Honmyo I Chiku SAoyagi K Sasaki H Ohnami S Yanagi-hara K Yoon KA Kook MC Lee YSPark SR Kim CG Choi IJ Yoshida TNakamura Y and Hirohashi S Geneticvariation in PSCA is associated with suscep-tibility to diffuse-type gastric cancer NatGenet 40 730-740 2008

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26 Unoki H Takahashi A Kawaguchi THara K Horikoshi M Andersen G NgDP Holmkvist J Borch-Johnsen KJorgensen T Sandbaek A Lauritzen THansen T Nurbaya S Tsunoda T KuboM Babazono T Hirose H Hayashi MIwamoto Y Kashiwagi A Kaku KKawamori R Tai ES Pedersen O Ka-matani N Kadowaki T Kikkawa RNakamura Y and Maeda S SNPs inKCNQ1 are associated with susceptibility totype 2 diabetes in East Asian and Europeanpopulations Nat Genet 40 1098-1102 2008

27 Harao M Hirata S Irie A Senju SNakatsura T Komori H Ikuta Y Yok-omine K Imai K Inoue M Harada KMori T Tsunoda T Nakatsuru S DaigoY Nomori H Nakamura Y Baba H andNishimura Y HLA-A2-restricted CTL epi-topes of a novel lung cancer-associated can-

cer testis antigen cell division cycle associ-ated 1 can induce tumor-reactive CTL IntJ Cancer 123 2616-2625 2008

28 Imai K Hirata S Irie A Senju S IkutaY Yokomine K Harao M Inoue MTsunoda T Nakatsuru S Nakagawa HNakamura Y Baba H and Nishimura YIdentification of a novel tumor-associatedantigen cadherin 3P-cadherin as a possibletarget for immunotherapy of pancreatic gas-tric and colorectal cancers Clin Cancer Res14 6487-6495 2008

29 Nikolova DN Zembutsu H Sechanov TVidinov K Kee LS Ivanova R BechevaE Kocova M Toncheva D and Naka-mura Y Identification of molecular targetsfor treatment of thyroid carcinoma OncolRep 20 105-121 2008

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31 Arita K Ariyoshi M Tochio H Naka-mura Y and Shirakawa M Hemi-methylated DNA recognition by the SRAprotein Np95 via a base flipping mecha-nism Nature 455 818-821 2008

32 Inoue H Iga M Nabeta H Yokoo TSuehiro Y Okano S Inoue M Kinoh HKatagiri T Takayama K Yonemitsu YHasegawa M Nakamura Y Nakanishi Yand Tani K Non-transmissible SeV encod-ing GM-CSF is a novel and potent vectorsystem to produce autologous tumor vac-cines Cancer Sci 99 2315-2326 2008

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34 Hotta K Nakata Y Matsuo T KamoharaS Kotani K Komatsu R Itoh N MineoI Wada J Masuzaki H Yoneda MNakajima A Miyazaki S Tokunaga KKawamoto M Funahashi T HamaguchiK Yamada K Hanafusa T Oikawa SYoshimatsu H Nakao K Sakata T Mat-suzawa Y Tanaka K Kamatani N andNakamura Y Variations in the FTO gene areassociated with severe obesity in the Japa-nese J Hum Genet 53 546-553 2008

35 Kato M Nakamura Y and Tsunoda T Analgorithm for inferring complex haplotypesin a region of copy-number variation Am JHum Genet 83 157-169 2008

36 Kato M Nakamura Y and Tsunoda TMOCSphaser a haplotype inference tool


from a mixture of copy number variationand single nucleotide polymorphism dataBioinformatics 24 1645-1646 2008

37 Yasuda K Miyake K Horikawa Y HaraK Osawa H Furuta H Hirota Y MoriH Jonsson A Sato Y Yamagata K Hi-nokio Y Wang HY Tanahashi T Naka-mura N Oka Y Iwasaki N Iwamoto YYamada Y Seino Y Maegawa H Kashi-wagi A Takeda J Maeda E Shin HDCho YM Park KS Lee HK Ng MCMa RC So WY Chan JC Lyssenko VTuomi T Nilsson P Groop L KamataniN Sekine A Nakamura Y Yamamoto KYoshida T Tokunaga K Itakura M Mak-ino H Nanjo K Kadowaki T and KasugaM Variants in KCNQ1 are associated withsusceptibility to type 2 diabetes mellitusNat Genet 40 1092-1097 2008

38 Yamaguchi-Kabata Y Nakazono K Taka-hashi A Saito S Hosono N Kubo MNakamura Y and Kamatani N Japanesepopulation structure based on SNP geno-types from 7003 individuals compared toother ethnic groups Effects on population-based association studies Am J HumGenet 83 445-456 2008

39 Okada Y Mori M Yamada R Suzuki AKobayashi K Kubo M Nakamura Y andYamamoto K SLC22A4 polymorphism andrheumatoid arthritis susceptibility A replica-tion study in a Japanese population and ametaanalysis J Rheumatol 35 1723-17282008

40 Omori S Tanaka Y Takahashi A HiroseH Kashiwagi A Kaku K Kawamori RNakamura Y and Maeda S Association ofCDKAL1 IGF2BP2 CDKN2AB HHEXSLC30A8 and KCNJ11 with susceptibility oftype 2 diabetes in a Japanese populationDiabetes 57 791-795 2008

41 Misawa K Fujii S Yamazaki T Taka-hashi A Takasaki J Yanagisawa M Oh-nishi Y Nakamura Y and Kamatani NNew correction algorithms for multiple com-parisons in case-control multilocus associa-tion studies based on haplotypes and diplo-type configurations J Hum Genet 53 789-801 2008

42 Chantarangsu S Mushiroda T Mahasiri-mongkol S Kiertiburanakul S Sungkanu-parph S Manosuthi W Tantisiriwat WCharoenyingwattana A Sura T Chan-tratita W and Nakamura Y HLA-B 3505allele is a strong predictor for nevirapine-induced skin adverse drug reactions in ThaiHIV-infected patients Pharmacogenet Genomics 19 139-146 2009

43 Suzuki A Yamada R Kochi Y Sawada

T Okada Y Matsuda K Kamatani YMori M Shimane K Hirabayashi YTakahashi A Tsunoda T Miyatake AKubo M Kamatani N Nakamura Y andYamamoto K Functional SNPs in CD244 in-crease the risk of rheumatoid arthritis in aJapanese population Nat Genet 40 1224-1229 2008

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45 Nikolova DN Doganov N Dimitrov RAngelov K Kee LS Dimova I TonchevaD Nakamura Y and Zembutsu HGenome-wide gene expression profiles ofovarian carcinoma identification of molecu-lar targets for treatment of ovarian carci-noma Mol Med Rep in press 2008

46 Hotta K Nakamura M Nakata Y Mat-suo T Kamohara S Kotani K KomatsuR Itoh N Mineo I Wada J MasuzakiH Yoneda M Nakajima A Miyazaki STokunaga K Kawamoto M Funahashi THamaguchi K Yamada K Hanafusa TOikawa S Yoshimatsu H Nakao KSakata T Matsuzawa Y Tanaka K Ka-matani N and Nakamura Y INSIG2 geners7566605 polymorphism is associated withsevere obesity in Japanese J Hum Genet53 857-862 2008

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48 Hirota T Harada M Sakashita M DoiS Miyatake A Fujita K Enomoto TEbisawa M Yoshihara S Noguchi ESaito H Nakamura Y and Tamari M Ge-netic polymorphism regulating ORM1-like 3(Saccharomyces cerevisiae) expression is as-sociated with childhood atopic asthma in aJapanese population J Allergy Clin Immu-nol 121 769-770 2008

49 Harada M Hirota T Jodo AI Doi SKameda M Fujita K Miyatake A Eno-moto T Noguchi E Yoshihara SEbisawa M Saito H Matsumoto KNakamura Y Ziegler SF and Tamari MFunctional analysis of the Thymic StromalLymphopoietin Variants in Human Bron-chial Epithelial Cells Am J Respir Cell


Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51 Hirata D Yamabuki T Miki D Ito TTsuchiya E Fujita M Hosokawa MChayama K Nakamura Y and Daigo YInvolvement of epithelial cell transformingsequence-2 oncoantigen in lung and esopha-geal cancer progression Clin Cancer Res15 256-266 2009

52 Dobashi S Katagiri T Hirota E AshidaS Daigo Y Shuin T Fujioka T Miki Tand Nakamura Y Involvement of TMEM22overexpression in the growth of renal cellcarcinoma cells Oncol Rep 21 305-3122009

53 Zembutsu H Suzuki Y Sasaki ATsunoda T Okazaki M Yoshimoto MHasegawa T Hirata K and Nakamura YPredicting response to Docetaxel neoadju-vant chemotherapy for advanced breast can-cers through genome-wide gene expressionprofiling Int J Oncol 34 361-370 2009

54 Nakamura Y DNA variations in humanand medical genetics 25 years of my experi-ence (review) J Hum Genet 54 1-8 2009

55 Ozaki K Sato H Inoue K Tsunoda TSakata Y Mizuno H Lin T-H Mi-yamoto Y Aoki A Onouchi Y Sheu S-H Ikegawa S Odashiro K NobuyoshiM Juo S-H H Hori M Nakamura Yand Tanaka TA functional variation inBRAP confers risk of myocardial infarctionin Asian populations Nat Genet in press2009

56 Kashiwaya K Hosokawa M Eguchi HOhigashi H Ishikawa O Shinomura YNakamura Y and Nakagawa H Identifica-tion of C2orf18 Termed ANT2BP (ANT2-binding protein) as one of key molecules in-volved in pancreatic carcinogenesis CancerSci 100 457-464 2009

57 Nagayama S Yamada E Kohno YAoyama T Fukukawa C Kubo HWatanabe G Katagiri T Nakamura YSakai Y and Toguchida J Inverse correla-tion of the upregulation of FZD10 expres-sion and the activation of β-catenin in syn-chronous colorectal tumors Cancer Sci inpress 2009

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59 The International Warfarin Pharmacogenet-ics Consortium Improved warfarin dosingwith a global pharmacogenetic algorithm NEngl J Med 360 753-764 2009

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61 Fukukawa C Nagayama S Tsunoda TToguchida J Nakamura Y and Katagiri TActivation of non-canonical Dvl-Rac1-JNKpathway by Frizzled-homologue 10 (FZD10)in human synovial sarcoma Oncogene inpress 2009

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63 Kamatani Y Wattanapokayakit S OchiH Kawaguchi T Takahashi A HosonoN Kubo M Tsunoda T Kamatani NKumada H Puseenam A Sura T DaigoY Chayama K Chantratita W Naka-mura Y and Matsuda K Identification ofassociation of genetic variations in HLA-DPlocus with chronic hepatitis B in Asianpopulation through genome-wide associa-tion study Nat Genet in press 2009

64 Tamura K Furihata M Chung S Ue-mura M Yoshioka H Iiyama T AshidaS Nasu Y Fujioka T Shuin T Naka-mura Y and Nakagawa H Stanniocalcin 2( STC 2 ) over-expression in castration-resistant prostate cancer and aggressiveprostate cancer Cancer Sci in press 2009

65 Tsukada H Ochi H Maekawa T AbeH Fujimoto Y Tsuge M Takahashi HKumada H Kamatani N Nakamura Yand Chayama K Hiroshima Liver StudyGroup Toranomon Hospital A Polymor-phism in MAPKAPK3 affects response to in-


terferon therapy for chronic hepatitis C Gas-troenterology in press 2009

66 Dunleavy EM Roche D Tagami H La-coste N Ray-Gallet D Nakamura YDaigo Y Nakatani Y and Almouzni-

Pettinotti G HJURP a key CENP-A-partnerfor maintenance and deposition of CENP-Aat centromeres at late telophaseG1 Cell inpress 2009


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on developing new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining in their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and we also attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data of structured populations

Because genetic epidemiology studies of complex genetic traits target relatively weak factors, their sample sizes must exceed several thousand, which makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One method to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
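
The standard genomic control correction for chi-square statistics, on which the modification described above builds, can be sketched as follows. This is a minimal illustration of the conventional method only, not of the laboratory's extension to Fisher's exact test. The function name and the example statistics are hypothetical.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for a list of 1-df chi-square statistics."""
    if not chi2_stats:
        raise ValueError("at least one test statistic is required")
    # Inflation factor: observed median relative to the median expected under the null.
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # Common convention: only deflate the statistics, never inflate them.
    lam = max(lam, 1.0)
    return lam, [x / lam for x in chi2_stats]

# Hypothetical statistics from five SNPs, for illustration only.
example_stats = [0.1, 0.5, 0.91, 2.0, 12.0]
lam, corrected = genomic_control(example_stats)
print(f"lambda = {lam:.3f}")
```

In practice the inflation factor is estimated from many thousands of markers, so that a few truly associated loci do not shift the median.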

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリー・マーク・ラスロップ
准教授 医学博士 山田 亮

c. Geometric evaluation of SNP contingency table tests

The 2×3 SNP contingency table tests are described in the context of geometry. We characterize various tests for 2×3 tables and define tests suited to biological models by interpreting the tables geometrically.
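
As a concrete reference point for the 2×3 case-control genotype tables discussed in sections b and c, the sketch below computes the asymptotic Pearson chi-square p-value (2 degrees of freedom) and an exact p-value obtained by enumerating all tables with the same margins. The exact test here is the usual probability-ordered (Freeman-Halton type) test, which is an assumption; the report does not say which exact test is used. Function names and the example table are hypothetical.

```python
from math import comb, exp

def chi2_2x3(table):
    """Pearson chi-square statistic and asymptotic p-value for a 2x3 table.

    Assumes no row or column total is zero.
    """
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_tot[i] * col_tot[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return stat, exp(-stat / 2.0)

def _table_prob(a, b, row1, cols):
    """Multivariate hypergeometric probability of first row (a, b, row1 - a - b)."""
    c = row1 - a - b
    return comb(cols[0], a) * comb(cols[1], b) * comb(cols[2], c) / comb(sum(cols), row1)

def exact_2x3(table):
    """Exact p-value: total probability of tables no more likely than the observed one."""
    row1 = sum(table[0])
    cols = [table[0][j] + table[1][j] for j in range(3)]
    p_obs = _table_prob(table[0][0], table[0][1], row1, cols)
    total = 0.0
    for a in range(min(cols[0], row1) + 1):
        for b in range(min(cols[1], row1 - a) + 1):
            if row1 - a - b > cols[2]:
                continue  # third cell would exceed its column total
            p = _table_prob(a, b, row1, cols)
            if p <= p_obs * (1 + 1e-9):  # small tolerance for floating-point ties
                total += p
    return total

# Hypothetical genotype counts: rows are cases and controls, columns are genotypes.
observed = [[30, 50, 20], [20, 50, 30]]
stat, p_asym = chi2_2x3(observed)
p_exact = exact_2x3(observed)
print(stat, p_asym, p_exact)
```

Comparing `p_asym` and `p_exact` over tables with small cell counts is one simple way to see where the two tests diverge.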

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population and LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We develop a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, as a basis for various studies of genetic heterogeneity.
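
For reference, the conventional pairwise LD measures mentioned above (D, D' and r²) can be computed from one haplotype frequency and two allele frequencies, as in this minimal sketch. It shows only the standard two-marker case, not the multi-marker method under development. The function and argument names are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1 and allele B at marker 2
    p_a, p_b: frequencies of allele A and allele B (both strictly between 0 and 1)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both markers must be polymorphic")
    d = p_ab - p_a * p_b
    # The maximum attainable |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Two perfectly correlated markers, then two independent markers.
print(pairwise_ld(0.5, 0.5, 0.5))
print(pairwise_ld(0.25, 0.5, 0.5))
```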

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web site URL: http://func-gen.hgc.jp/

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.


Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To investigate this phenomenon further, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices as well as regulatory

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井 謙太
准教授 理学博士 木下 賢吾


elements and binding sites reported in the literature are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and their positioning relative to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it can distinguish muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are developing our model further, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2 and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
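
The two quantities usually used to describe CpG richness, G+C content and the observed/expected CpG ratio, can be computed for a window around a TSS as in the sketch below. It uses the conventional formula (CpG count × length) / (C count × G count). The thresholds and window sizes behind the ascidian definition are not given in this report, so none are assumed here. The function name is hypothetical.

```python
def cpg_stats(seq):
    """Return (G+C content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so this counts every CpG
    gc_content = (c + g) / n
    expected = c * g / n   # expected CpG count if C and G occurred independently
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_content, obs_exp

# Toy sequences for illustration only.
print(cpg_stats("CGCGCGCG"))
print(cpg_stats("ATATATAT"))
```

Sliding such a window along the region around each TSS gives a profile from which the width of the CpG-rich range can be read off.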

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3 and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to depicting the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from telling whether a given regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data from a newly performed ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that lays out the body plan but also the peripheral components that actually make the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith4 and Kenta Nakai: 4CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
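
The way a pseudocount enters a PWM, as described above, can be sketched as follows: the pseudocount is split across the four bases according to the background composition, added to the observed counts, and the resulting frequencies are converted to log likelihood ratios. The function names, the uniform default background, and the toy count matrix are assumptions for illustration.

```python
import math

BASES = "ACGT"

def pwm_from_counts(counts, pseudocount=0.8, background=None):
    """Build a log2-odds PWM from per-position base counts.

    counts: list of dicts mapping base -> observed count at that position
    pseudocount: total pseudocount per position, split by background composition
    """
    if background is None:
        background = {b: 0.25 for b in BASES}  # uniform background (assumption)
    pwm = []
    for column in counts:
        n = sum(column.get(b, 0) for b in BASES)
        row = {}
        for b in BASES:
            freq = (column.get(b, 0) + pseudocount * background[b]) / (n + pseudocount)
            row[b] = math.log2(freq / background[b])
        pwm.append(row)
    return pwm

def score(pwm, site):
    """Sum of log-odds scores of a candidate site of the same length as the PWM."""
    return sum(row[base] for row, base in zip(pwm, site))

# Toy two-position matrix built from ten hypothetical binding sites.
toy_counts = [{"A": 10}, {"C": 6, "G": 4}]
toy_pwm = pwm_from_counts(toy_counts)
print(score(toy_pwm, "AC"), score(toy_pwm, "TT"))
```

Without the pseudocount, the unobserved bases would have frequency zero and an undefined log ratio. With it, every element stays finite.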

7. Definition and analysis of alternative promoters using a large amount of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75918 promoters. These promoters could be classified into 36251 in gene regions and 39667 in intergenic regions. The former, intragenic promoters corresponded to 14307 genes, of which 5428 have one promoter and 8879 have more than one. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of the discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed 96 of difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in downstream regions of 1st promoters.
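
The definition of 1st and 2nd promoters by tag count can be sketched as below. The report does not spell out how the discrepancy between two samples is computed, so the `discrepancy` function is one plausible reading (the fraction of shared genes whose 1st/2nd assignment differs), not the authors' measure. All names and tag counts are hypothetical.

```python
from collections import defaultdict

def rank_promoters(tag_counts):
    """Map each gene to its (1st promoter, 2nd promoter) by descending tag count.

    tag_counts: iterable of (gene, promoter_id, number_of_tags)
    The 2nd promoter is None for genes with a single promoter.
    """
    by_gene = defaultdict(list)
    for gene, promoter, n_tags in tag_counts:
        by_gene[gene].append((n_tags, promoter))
    ranked = {}
    for gene, items in by_gene.items():
        items.sort(key=lambda item: -item[0])  # most tags first
        first = items[0][1]
        second = items[1][1] if len(items) > 1 else None
        ranked[gene] = (first, second)
    return ranked

def discrepancy(ranked_a, ranked_b):
    """Fraction of shared genes whose (1st, 2nd) assignment differs between samples."""
    shared = [g for g in ranked_a if g in ranked_b]
    if not shared:
        return 0.0
    return sum(ranked_a[g] != ranked_b[g] for g in shared) / len(shared)

# Hypothetical tag counts in two cell types.
cell_a = [("geneX", "p1", 120), ("geneX", "p2", 30), ("geneY", "p1", 10)]
cell_b = [("geneX", "p1", 20), ("geneX", "p2", 90), ("geneY", "p1", 15)]
ranked_a = rank_promoters(cell_a)
ranked_b = rank_promoters(cell_b)
print(ranked_a, ranked_b, discrepancy(ranked_a, ranked_b))
```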

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for analyses of transcription and replication. It has previously been reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the overall chromatin composition of primate genomes.
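
The kind of Fourier analysis referred to above can be illustrated by measuring the spectral power of a dinucleotide indicator signal at a chosen period. The sketch below evaluates a single Fourier component directly. It is a simplified stand-in, assuming a 0/1 indicator for a chosen dinucleotide set, and does not reproduce the actual pipeline. The function name and the synthetic sequence are hypothetical.

```python
import cmath

def periodicity_power(seq, dinucleotides, period):
    """Power of the Fourier component at `period` (in bp) for a dinucleotide indicator."""
    seq = seq.upper()
    # Indicator signal: 1 where a dinucleotide of interest starts, 0 elsewhere.
    signal = [1.0 if seq[i:i + 2] in dinucleotides else 0.0 for i in range(len(seq) - 1)]
    if not signal:
        return 0.0
    mean = sum(signal) / len(signal)
    # Single Fourier component at the requested period, after removing the mean.
    component = sum(
        (x - mean) * cmath.exp(-2j * cmath.pi * i / period)
        for i, x in enumerate(signal)
    )
    return abs(component) ** 2 / len(signal)

# Synthetic sequence with an "AA" dinucleotide every 10 bp.
synthetic = ("AA" + "C" * 8) * 20
power_10 = periodicity_power(synthetic, {"AA"}, 10)
power_7 = periodicity_power(synthetic, {"AA"}, 7)
print(power_10, power_7)
```

Scanning `period` over a range such as 2 to 200 bp would expose peaks like the 10-bp, 84-bp and 167-bp periodicities discussed in the text, given suitable input sequences.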

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by reducing the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells are. The organism-specific pathway maps are then detected as those that can synthesize, from nutrients, the compounds required by the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, which can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.
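To give a feel for how SVR shrinks as an object grows, the definition above (accessible surface area divided by volume) can be evaluated for an idealized sphere. This is only a geometric sketch: real proteins are not spheres, and the function names and the radii are assumptions chosen for illustration.

```python
import math

def surface_to_volume_ratio(accessible_surface_area, volume):
    """SVR in 1/Å, given a surface area in Å^2 and a volume in Å^3."""
    return accessible_surface_area / volume

def sphere_svr(radius):
    """SVR of an ideal sphere of the given radius in Å (algebraically 3 / radius)."""
    area = 4.0 * math.pi * radius ** 2
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return surface_to_volume_ratio(area, volume)

# Hypothetical radii in Å; larger spheres have a smaller SVR.
svr_by_radius = {radius: sphere_svr(radius) for radius in (5.0, 10.0, 20.0)}
```

For a sphere the ratio is simply 3 divided by the radius, so doubling the radius halves the SVR. This is the geometric reason why larger proteins expose relatively less surface per unit volume.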

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines applied to the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against the known protein structures, and found that alpha amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that has ribonuclease activity but no PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as the targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the GO networks or tissue networks became too big for a static picture on the web, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K. and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R. and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K. and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T. and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K. and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y. and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T. and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M. and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H. and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H. and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M. and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R. and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T. and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H. and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H. and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T. and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics. Edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S. and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita K Kono H and Yura K Predictionof molecular interactions from 3D-structuresfrom small ligands to large protein complexesEdited by Bujnicki J (Wiley and Sons USA)in printing 2009伊倉貞吉木下賢吾伊藤暢聡ペプチジルプロリルイソメラーゼの構造機能相関蛋白質核酸酵素54167―1722009木下賢吾立体構造からのタンパク質機能予測現状と展望遺伝子医学MOOK14号in press中井謙太ポールホートン第3章 3アミノ酸配列に基づくタンパク質の細胞内局在予測実験医学増刊 vol261106―11122008中井謙太タンパク質のシステム生物学猪飼伏見卜部上野川中村浜窪編タンパク質の事典朝倉書店575―5782008


Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D. (武藤 香織)
Project Assistant Professor Hyongoo Hong, Ph.D. (洪 賢秀)
Project Assistant Professor Ayako Kamisato, LL.M. (神里 彩子)

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare and its impact on social security; practical advice and surveys for research projects to build public trust; and “minority-centered” scientific communication. We have conducted comparative political studies on stem cell research and on homecare services for ALS in East Asia. We also supported the “BioBank Japan” project from ethical, legal and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine the broader social and cultural agendas surrounding the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulation between South Korea and Japan. After the fabrication in Hwang Woo-suk's stem cell cloning research and the unethical collection of human eggs, the bioethics law in South Korea was revised, and the government seeks stricter regulation of life science and healthcare. We found some correlations between political options on stem cell research and on genetic testing in terms of regulation within East Asia.

2. Establishment of the Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we identified the current problems that tend to stall a smooth review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for “bench consulting” regarding consent, research protocols, and pre-review of the research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for the “BioBank Japan” project

To support the “BioBank Japan” project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for “personalized medicine”, we should think further about the construction of a “personalized consent process”, and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly value MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not from the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotype data to each participant, but a certain number of participants responded that they now want to see their own genotype data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even though participants may forget the fact that they gave consent for research, MCs explain the project to, encourage, and thank participants each time, and participants recall their will to contribute.

To show our appreciation for participants' and MCs' contributions to the project, we issued “BioBank newsletters” three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to sighted people with good eyesight, we are making efforts towards personalized information security to accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim for the sharing of public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called “SciArt Café”, twice in 2008. Our intent with “SciArt Café” is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called “Rhythm generated by network”, was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called “Doing science, doing art”, was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝

学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


Eric Perrier, Seiya Imoto, Satoru Miyano

Conventional approaches for learning Bayesian network structure from data have disadvantages in terms of complexity and lower accuracy of their results. However, a recent empirical study has shown that a hybrid algorithm noticeably improves accuracy and speed: it learns a skeleton with an independency test (IT) approach and constrains the directed acyclic graphs considered during the search-and-score phase. Subsequently, we defined the structural constraint by introducing the concept of a super-structure S, which is an undirected graph that restricts the search to networks whose skeleton is a subgraph of S. We developed a super-structure constrained optimal search (COS): its time complexity is upper bounded by $O(\gamma_m^n)$, where $\gamma_m < 2$ depends on the maximal degree $m$ of S. Empirically, complexity depends on the average degree $m'$, and sparse structures allow larger graphs to be calculated. Our algorithm is faster than an optimal search by several orders of magnitude and even finds more accurate results when given a sound super-structure. In practice, S can be approximated by IT approaches; the significance level of the tests controls its sparseness, making it possible to control the trade-off between speed and accuracy. For incomplete super-structures, a greedily post-processed version (COS+) still significantly outperforms other heuristic searches.
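The structural constraint itself is easy to state in code. The sketch below is not the COS algorithm; it only illustrates, with hypothetical function names and a toy four-node super-structure, what it means for a network's skeleton to be a subgraph of S and why the constraint shrinks the search: a node with m neighbours in S has only 2^m candidate parent sets.

```python
import itertools

def skeleton(dag_edges):
    """Undirected skeleton of a directed graph given as (parent, child) pairs."""
    return {frozenset(edge) for edge in dag_edges}

def respects_super_structure(dag_edges, super_structure):
    """True if every edge of the skeleton is also an edge of S.

    Acyclicity is not checked here; only the skeleton constraint is.
    """
    allowed = {frozenset(edge) for edge in super_structure}
    return skeleton(dag_edges) <= allowed

def candidate_parent_sets(node, super_structure):
    """All subsets of the neighbours of `node` in S (2**m sets for degree m)."""
    neighbours = sorted({b if a == node else a
                         for a, b in super_structure if node in (a, b)})
    return [subset
            for size in range(len(neighbours) + 1)
            for subset in itertools.combinations(neighbours, size)]

# Toy super-structure (assumption): a chain A - B - C - D.
S = [("A", "B"), ("B", "C"), ("C", "D")]
allowed_network = respects_super_structure([("A", "B"), ("C", "B")], S)
forbidden_network = respects_super_structure([("A", "C")], S)
parents_of_b = candidate_parent_sets("B", S)
```

Here node B has two neighbours in S, so only four parent sets need to be scored for it, whereas an unconstrained search over four nodes would consider all eight subsets of the other three nodes.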

c. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models

Osamu Hirose, Ryo Yoshida¹, Seiya Imoto, Rui Yamaguchi, Tomoyuki Higuchi¹, D. Stephen Charnock-Jones², Cristin Print³, Satoru Miyano: ¹Institute of Statistical Mathematics, ²Cambridge University, ³University of Auckland

We developed a novel method based on the state space model to identify transcriptional modules and module-based gene networks simultaneously. The state space model has the potential to infer large-scale gene networks, e.g., of order $10^3$, from time-course gene expression profiles. In particular, we succeeded in identifying a cell cycle system by using the gene expression profiles of Saccharomyces cerevisiae, in which the length of the time course and the number of genes were 24 and 4382, respectively. However, when analyzing shorter time-course data, e.g., of length 10 or less, the parameter estimation of the state space model often fails due to overfitting. To extend the applicability of the state space model, we provided an approach that uses the technical replicates of gene expression profiles, which are often measured in duplicate or triplicate. The use of technical replicates is important for achieving highly efficient inference of gene networks with short time-course data. The potential of the proposed method was demonstrated through the time-course analysis of the gene expression profiles of human umbilical vein endothelial cells undergoing growth factor deprivation-induced apoptosis.
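To make the data structure concrete, the following sketch simulates a generic linear Gaussian state space model in which a few hidden module activities drive many genes, and each time point is measured in several technical replicates. It is a minimal illustration, assuming the standard form x_t = F x_{t-1} + v_t and y_t = H x_t + w_t; the matrices, noise levels and function names are hypothetical, and no parameter estimation is attempted.

```python
import random

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def simulate_state_space(F, H, x0, steps, replicates, state_sd, obs_sd, seed=0):
    """Simulate hidden module states and replicated gene-level observations."""
    rng = random.Random(seed)
    x = list(x0)
    states, observations = [], []
    for _ in range(steps):
        # Hidden module activities evolve linearly, with system noise.
        x = [value + rng.gauss(0.0, state_sd) for value in mat_vec(F, x)]
        states.append(x)
        expected = mat_vec(H, x)
        # Each technical replicate sees the same state with its own noise.
        observations.append([[e + rng.gauss(0.0, obs_sd) for e in expected]
                             for _ in range(replicates)])
    return states, observations

def replicate_mean(observation):
    """Average the replicates of one time point, gene by gene."""
    n = len(observation)
    return [sum(rep[g] for rep in observation) / n
            for g in range(len(observation[0]))]

# Hypothetical example: 2 modules, 4 genes, 8 time points, triplicates.
F = [[0.9, 0.1], [-0.1, 0.9]]
H = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [-1.0, 0.5]]
states, observations = simulate_state_space(
    F, H, x0=[1.0, 0.0], steps=8, replicates=3, state_sd=0.05, obs_sd=0.2)
means = [replicate_mean(obs) for obs in observations]
```

Because all replicates of a time point share the same hidden state, they carry independent information about the observation noise, which is what makes them useful when the time course itself is short.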

d. Predicting differences in gene regulatory systems by state space models

Rui Yamaguchi, Seiya Imoto, Mai Yamauchi, Masao Nagasaki, Ryo Yoshida¹, Teppei Shimamura, Yosuke Hatanaka, Kazuko Ueno, Tomoyuki Higuchi¹, Noriko Gotoh, Satoru Miyano

We developed a statistical method to predict differentially regulated genes between case and control samples from time-course gene expression data, by exploiting the unpredictability of expression patterns under the regulatory system inferred by a state space model. The proposed method can screen out genes that show different patterns but are generated by the same regulation in both samples, since these patterns can be predicted by the same model. Our strategy consists of three steps. First, a gene regulatory system is inferred from the control data by a state space model. Then, the obtained model of the underlying regulatory system of the control sample is used to predict the case data. Finally, by assessing the significance of the difference between the observed and predicted case time-course data of each gene, we are able to detect the unpredictable genes, which are candidates for the key differences between the regulatory systems of case and control cells. We illustrate the whole process of the strategy with an actual example, in which human small airway epithelial cell gene regulatory systems were generated from novel time courses of gene expression following treatment with (case) or without (control) the drug gefitinib, an inhibitor of the epidermal growth factor receptor tyrosine kinase. In the gefitinib response data, we succeeded in finding unpredictable genes that are candidates for the specific targets of gefitinib. We also discussed differences in regulatory systems for the unpredictable genes. The proposed method would be a promising tool for identifying biomarkers and drug target genes.
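The three-step logic can be sketched with a deliberately simplified stand-in. Instead of a state space model, the example below fits a one-parameter autoregressive model per gene to the control series, uses it to predict the case series, and flags genes whose prediction error exceeds a fixed threshold. The per-gene AR(1) model, the threshold, the gene names and the data are all assumptions made for illustration; the method described above uses a system-level model and a significance assessment instead.

```python
def fit_ar1(series):
    """Least-squares coefficient a for the model x[t] = a * x[t-1]."""
    numerator = sum(prev * cur for prev, cur in zip(series, series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    return numerator / denominator

def mean_squared_prediction_error(series, coefficient):
    """Mean squared one-step-ahead error of the AR(1) model on a series."""
    errors = [(cur - coefficient * prev) ** 2
              for prev, cur in zip(series, series[1:])]
    return sum(errors) / len(errors)

# Hypothetical time courses. geneA differs in level between the two samples
# but follows the same dynamics; geneB follows different dynamics in the case.
control = {"geneA": [1.0, 0.5, 0.25, 0.125], "geneB": [1.0, 0.5, 0.25, 0.125]}
case = {"geneA": [2.0, 1.0, 0.5, 0.25], "geneB": [1.0, 0.9, 0.81, 0.729]}

# Step 1: infer the model from the control data.
model = {gene: fit_ar1(series) for gene, series in control.items()}
# Step 2: use the control model to predict the case data.
errors = {gene: mean_squared_prediction_error(case[gene], model[gene])
          for gene in case}
# Step 3: flag genes that the control model cannot predict.
threshold = 0.01  # arbitrary cut-off for this toy example
unpredictable = sorted(gene for gene, err in errors.items() if err > threshold)
```

geneA is not flagged even though its case values differ from the control values, because the control model still predicts them. Only geneB, whose dynamics changed, is reported, which mirrors the screening property described above.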

e. Bayesian learning of biological pathways on genomic data assimilation

Ryo Yoshida¹, Masao Nagasaki, Rui Yamaguchi, Seiya Imoto, Satoru Miyano, Tomoyuki Higuchi¹

Mathematical modeling and simulation based on biochemical rate equations provide us with a rigorous tool for unraveling the complex mechanisms of biological pathways. To proceed to simulation experiments, an essential first step is to find effective values of the model parameters, which are difficult to measure in in vivo and in vitro experiments. Furthermore, once a set of hypothetical models has been created, a statistical criterion is needed to test the ability of the constructed models and to proceed to model revision. We developed a new statistical technology for the data-driven construction of in silico biological pathways. The method starts with knowledge-based modeling with a hybrid functional Petri net. It then proceeds to the Bayesian learning of the model parameters for which experimental data are available. This process exploits quantitative measurements of evolving biochemical reactions, e.g., gene expression data. Another important issue that we consider is the statistical evaluation and comparison of the constructed hypothetical pathways. For this purpose, we have developed a new Bayesian information-theoretic measure that assesses the predictability and the biological robustness of in silico pathways.

f. Modeling nonlinear gene regulatory networks from time series gene expression data

André Fujita, João Ricardo Sato⁵, Humberto Miguel Garay-Malpartida⁵, Mari Cleide Sogayar⁵, Carlow Eduardo Ferreira⁵, Satoru Miyano: ⁵University of São Paulo

In cells, molecular networks such as gene regulatory networks are the basis of biological complexity. Therefore, gene regulatory networks have become the core of research in systems biology. Understanding the processes underlying the various extracellular regulators, signal transduction, protein-protein interactions and differential gene expression requires a detailed molecular description of the protein and gene networks involved. To better understand these complex molecular networks and to infer new regulatory associations, we developed a statistical method based on vector autoregressive models and Granger causality to estimate nonlinear gene regulatory networks from time series microarray data. Most of the models available in the literature assume linearity in the inference of gene connections; moreover, these models do not infer directionality in these connections. Thus, a priori biological knowledge is required. However, in pathological cases, no a priori biological information is available. To overcome these problems, we present the nonlinear vector autoregressive (NVAR) model. We have applied the NVAR model to estimate nonlinear gene regulatory networks based entirely on gene expression profiles obtained from DNA microarray experiments. We showed the results obtained by NVAR through several simulations and by the construction of three actual gene regulatory networks (p53, NF-κB and c-Myc) for HeLa cells.

g. Fast grid layout algorithm for biological networks with sweep calculation

Kaname Kojima, Masao Nagasaki, Satoru Miyano

Properly drawn biological networks are of great help in the comprehension of their characteristics. The quality of the layouts for retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties, such as aligning vertices having the same attributes and complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for applications to pathway databases, due to their high computational cost. In addition, they do not consider edge directions, and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features in biological networks. We devised a new calculation method, termed sweep calculation, and reduced the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conducted practical experiments by using 95 pathway models of various sizes from TRANSPATH and showed that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduced a new component that penalizes undesirable edge directions to avoid the lack of traceability in pathways due to the differences in direction between in-edges and out-edges of each vertex.


h. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data

Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

Recently, a nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters due to the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation, and we need to strongly limit the size of the gene set. To address this limitation, we employed the L1 regularization technique to estimate NVAR. Under L1 regularization, the direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than those from existing methods. Through a simulation study, we verified the effectiveness of the proposed method by comparing its limitation in the number of genes to that of the existing NVAR. The proposed method was also applied to time series microarray data of the human HeLa cell cycle.
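The claim that L1 regularization can select a few parents when parameters outnumber samples can be illustrated with a minimal sketch. It uses a generic lasso solved by cyclic coordinate descent, which is an assumed stand-in and not necessarily the estimation algorithm used in the work above. The predictors are plain random numbers rather than basis-expanded expression values, and all names and settings are hypothetical.

```python
import random


def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(rows, targets, lam, sweeps=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, k = len(rows), len(rows[0])
    beta = [0.0] * k
    col_sq = [sum(r[j] ** 2 for r in rows) / n for j in range(k)]
    for _ in range(sweeps):
        for j in range(k):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that ignores column j.
            rho = sum(
                r[j] * (t - sum(b * v for b, v in zip(beta, r)) + beta[j] * r[j])
                for r, t in zip(rows, targets)
            ) / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta


# 30 candidate regulators but only 20 samples; only regulators 0 and 3 matter.
random.seed(1)
n, k = 20, 30
rows = [[random.gauss(0, 1) for _ in range(k)] for _ in range(n)]
targets = [2.0 * r[0] - 1.5 * r[3] + random.gauss(0, 0.05) for r in rows]

beta = lasso(rows, targets, lam=0.3)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected regulators:", selected)
```

In a network setting, one such regression would be run per target gene, and the predictors with nonzero coefficients would be read as that gene's candidate parents.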

i. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates

André Fujita, Luciana Rodrigues Gomes⁵, João Ricardo Sato⁶, Rui Yamaguchi, Carlos Eduardo Thomaz⁷, Mari Cleide Sogayar⁵, Satoru Miyano: ⁶Universidade Federal do ABC, ⁷Centro Universitário da FEI

Principal Component Analysis (PCA) combined with the Maximum-entropy Linear Discriminant Analysis (MLDA) was applied in order to identify genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely: (i) differences in gene expression levels between normal and tumoral conditions from a univariate point of view; (ii) in a multivariate fashion using MLDA; and (iii) with a dependence network approach. Our results show that malignant transformation in the prostatic tissue is more related to functional connectivity changes in their dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1, and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions and were also classified as the top seven most informative genes for the prostate cancer genesis process by our discriminant analysis. Moreover, among the identified genes, we found classically known biomarkers and genes that are closely related to tumoral prostate, such as KLK3 and KLK2, and several other potential ones. We have demonstrated that changes in functional connectivity may be implicit in the biological process that renders some genes more informative for discriminating between normal and tumoral conditions. Using the proposed method, namely MLDA, to analyze the multivariate characteristics of genes, it was possible to capture the changes in dependence networks that are related to cell transformation.

j. Rule-based reasoning for system dynamics in cell systems

Euna Jeong, Masao Nagasaki, Satoru Miyano

A system-dynamics-centered ontology, called the Cell System Ontology (CSO), has been developed for the representation of diverse biological pathways. Many of the pathway data based on the ontology have been created from databases via data conversion or curated by expert biologists. It is essential to validate the pathway data, which may contain unexpected issues such as semantic inconsistency and incompleteness. This paper discusses three criteria for validating the pathway data based on CSO, as follows: (1) structurally correct models in terms of Petri nets, (2) biologically correct models to capture biological meaning, and (3) systematically correct models to reflect biological behaviors. Simultaneously, we have investigated how logic-based rules can be used for the ontology to extend its expressiveness and to complement the ontology by reasoning, which aims at qualifying pathway knowledge. Finally, we show how the proposed approach helps in exploring dynamic modeling and simulation tasks without prior knowledge.

k. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human

Yosuke Hatanaka, Masao Nagasaki, Rui Yamaguchi, Takeshi Obayashi, Kazuyuki Numata, André Fujita, Teppei Shimamura, Yoshinori Tamada, Seiya Imoto, Kengo Kinoshita, Kenta Nakai, Satoru Miyano

We reported various transcription factor binding sites (TFBSs) conserved among co-expressed genes in human promoter regions using expression and genomic data. Assuming that a similar promoter structure induces similar transcriptional regulation and hence a similar expression profile, we compared the promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start site (TSS) given by DBTSS, and promoter similarity searches were conducted among the coexpressing gene data provided by the newly developed COXPRESdb. Using a combination of Position Weight Matrix (PWM) motif prediction and the bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket method analysis to seek combinatorial activities of those conserved TFBSs.
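PWM motif prediction of the kind mentioned above can be illustrated with a minimal sketch that scores every window of a promoter sequence against a log-odds matrix. The toy count matrix, the pseudocount, the uniform background, and the function names are assumptions for illustration, and the bootstrap step used to assess significance is not shown.

```python
import math


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def scan(seq, pwm):
    """Score every window of an uppercase A/C/G/T sequence; returns (position, score)."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][seq[i + j]] for j in range(width)))
        for i in range(len(seq) - width + 1)
    ]


# Toy matrix for a TATA-like site, built from ten hypothetical aligned binding sites.
counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
pwm = log_odds_pwm(counts)

hits = scan("GGTATAGG", pwm)
best_position, best_score = max(hits, key=lambda hit: hit[1])
print(best_position, round(best_score, 2))
```

A window would normally be called a predicted site only if its score exceeds a threshold; how that threshold is chosen is outside this sketch.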

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou⁸, Yuto Ikegami⁸, Hiroshi Matsuno⁸, Satoru Miyano, Shin-ichi T. Inouye⁸: ⁸Yamaguchi University

Circadian rhythms of living organisms are 24-hr oscillations found in behavior, biochemistry, and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of a computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observation that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando⁹, Sadanori Konishi¹⁰, Seiya Imoto: ⁹Keio University, ¹⁰Kyushu University

The problem of constructing nonlinear regression models is investigated to analyze data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlap between basis functions and incorporates the information of the input and response variables. By using these radial basis functions, we constructed nonlinear regression models with the help of the regularization technique. Crucial issues in the model building process are the choices of the hyperparameter, the number of basis functions, and a smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification, both for distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained from the use of the hyperparameter in the basis functions.
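Regularized regression on radial basis functions can be sketched as follows. This is an illustration under assumptions, not the authors' estimator: the Gaussian basis, the way the hyperparameter `nu` rescales the width, the evenly spaced centers, and the ridge penalty are all choices made for the example, and the information-theoretic criteria for selecting them are not shown.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rbf_row(x, centers, width, nu):
    """Gaussian basis values at x; nu rescales the width, and so the overlap."""
    return [math.exp(-((x - c) ** 2) / (2.0 * nu * width ** 2)) for c in centers]


def fit_rbf(xs, ys, centers, width, nu, lam):
    """Ridge-penalised least squares: solve (Phi'Phi + n*lam*I) w = Phi'y."""
    phi = [rbf_row(x, centers, width, nu) for x in xs]
    m, n = len(centers), len(xs)
    a = [[sum(r[i] * r[j] for r in phi) + (n * lam if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    b = [sum(r[i] * y for r, y in zip(phi, ys)) for i in range(m)]
    return solve(a, b)


def predict(x, weights, centers, width, nu):
    """Evaluate the fitted model at x."""
    return sum(w * p for w, p in zip(weights, rbf_row(x, centers, width, nu)))


# Fit one period of a sine curve with 12 evenly spaced Gaussian basis functions.
xs = [2 * math.pi * i / 39 for i in range(40)]
ys = [math.sin(x) for x in xs]
width = 2 * math.pi / 7
centers = [width * (j - 2) for j in range(12)]  # extend slightly past the data range
weights = fit_rbf(xs, ys, centers, width, nu=1.0, lam=1e-6)

worst = max(abs(predict(x, weights, centers, width, 1.0) - y) for x, y in zip(xs, ys))
print("largest absolute fitting error:", worst)
```

Raising `nu` widens every basis function and increases their overlap, while `lam` controls how strongly the weights are shrunk; these correspond to the hyperparameter and smoothing parameter whose selection the paragraph above identifies as crucial.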

b. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome

Jin Hwan Do, Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for the characterization of fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most of the prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, the homology-based approach has an intrinsic limitation for unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profiles of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile; that is, the DNA regions including secondary metabolic signature genes such as polyketide synthase, nonribosomal peptide synthase, and/or dimethylallyl tryptophan synthase consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by a transcription factor, LaeA. Our result could be used for the identification of other fungal secondary metabolite gene clusters, especially for secondary metabolite gene clusters that are severely regulated by LaeA or other proteins with a function similar to that of LaeA.
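The profile-and-threshold step described above can be sketched as follows. The code computes a windowed GC profile and finds runs of windows whose value stays below a threshold. The DNA curvature values themselves would have to come from a separate curvature model, which is not shown. The function names, window sizes, and toy inputs are assumptions for illustration.

```python
def gc_profile(seq, window, step):
    """GC fraction of each window of `window` bases, advancing by `step` bases."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append((start, (chunk.count("G") + chunk.count("C")) / window))
    return profile


def runs_below(values, threshold, min_len):
    """Index ranges (start, end_exclusive) where at least `min_len` consecutive
    values in the list stay below `threshold`."""
    runs, start = [], None
    # The appended sentinel is not below the threshold, so it closes a trailing run.
    for i, v in enumerate(values + [threshold]):
        if v < threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


profile = gc_profile("GGCCAATT", window=4, step=2)
print(profile)

# Hypothetical per-window curvature values; 0.18 is the cutoff quoted in the text.
curvature = [0.30, 0.10, 0.10, 0.10, 0.50, 0.10]
print(runs_below(curvature, threshold=0.18, min_len=3))
```

With windows of known size, `min_len` can be set so that a run corresponds to the minimum region length of interest, such as the 20 kb mentioned above.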

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida¹, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step in finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than we can handle and analyze on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data, to detect transcripts that have significantly different splicing patterns in two types of cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis for more than 300,000 transcripts in exon array data, which has the potential to reveal aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require the installation of any software other than an internet browser. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full dataset of exon array data within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.

Publications

1. Ando, T., Konishi, S., Imoto, S. Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference. 138 (11): 3616-3633, 2008.

2. Brazma, A., Miyano, S., Akutsu, T. Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008). Imperial College Press, 2008.

3. Do, J.H., Miyano, S. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome. Applied Microbiology and Biotechnology. 80 (5): 841-847, 2008.

4. Fujita, A., Gomes, L.R., Sato, J.R., Yamaguchi, R., Thomaz, C.E., Sogayar, M.C., Miyano, S. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Systems Biology. 2: 106, 2008.

5. Fujita, A., Sato, J.R., Garay-Malpartida, H.M., Sogayar, M.C., Ferreira, C.E., Miyano, S. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinformatics and Computational Biology. 6 (5): 961-979, 2008.

6. Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Fujita, A., Shimamura, T., Tamada, Y., Imoto, S., Kinoshita, K., Nakai, K., Miyano, S. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human. Genome Informatics. 20: 212-221, 2008.

7. Hirose, O., Yoshida, R., Imoto, S., Yamaguchi, R., Higuchi, T., Charnock-Jones, D.S., Print, C., Miyano, S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics. 24 (7): 932-942, 2008.

8. Hirose, O., Yoshida, R., Yamaguchi, R., Imoto, S., Higuchi, T., Miyano, S. Analyzing time course gene expression data with biological and technical replicates to estimate gene networks by state space models. Proc. 2nd Asia International Conference on Modelling & Simulation. 940-946, 2008. (AMS 2008, Refereed conference)

9. Jeong, E., Nagasaki, M., Miyano, S. Rule-based reasoning for system dynamics in cell systems. Genome Informatics. 20: 25-36, 2008.

10. Kitakaze, H., Kanda, M., Nakatsuka, H., Ikeda, N., Matsuno, H., Miyano, S. Prediction of fragile points for robustness checking of cell systems. IEICE TRANSACTIONS on Information and Systems D. J91-D (9): 2404-2417, 2008.

11. Knapp, E.-W., Benson, G., Holzhutter, H.-G., Kanehisa, M., Miyano, S. (Eds.) Genome Informatics. 20, 2008.

12. Kojima, K., Fujita, A., Shimamura, T., Imoto, S., Miyano, S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics. 20: 37-51, 2008.

13. Kojima, K., Nagasaki, M., Miyano, S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics. 24 (12): 1426-1432, 2008.

14. Mito, N., Ikegami, Y., Matsuno, H., Miyano, S., Inouye, S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics. 21: 212-223, 2008.

15. Nagasaki, M., Saito, A., Chen, L., Jeong, E., Miyano, S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology. 2: 53, 2008.

16. Niida, A., Smith, A.D., Imoto, S., Tsutsumi, S., Aburatani, H., Zhang, M.Q., Akiyama, T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics. 9: 404, 2008.

17. Numata, K., Yoshida, R., Nagasaki, M., Saito, A., Imoto, S., Miyano, S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics. 9: 494, 2008.

18. Numata, K., Imoto, S., Miyano, S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering, IEEE Computer Society. 357-360, 2008. (BIBM 2008, Refereed conference)

19. Perrier, E., Imoto, S., Miyano, S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research. 9: 2251-2286, 2008.

20. Yamaguchi, R., Imoto, S., Yamauchi, M., Nagasaki, M., Yoshida, R., Shimamura, T., Hatanaka, Y., Ueno, K., Higuchi, T., Gotoh, N., Miyano, S. Predicting differences in gene regulatory systems by state space models. Genome Informatics. 21: 101-113, 2008.

21. Yoshida, R., Nagasaki, M., Yamaguchi, R., Imoto, S., Miyano, S., Higuchi, T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics. 24 (22): 2592-2601, 2008.


Human Genome Center

Laboratory of Molecular Medicine (ゲノムシークエンス解析分野)
Laboratory of Genome Technology (シークエンス技術開発分野)

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P=0.0053) and poorer prognosis of non-small cell lung cancer patients (P=0.0045). It was also shown to be an independent prognostic factor (P=0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data imply that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) by cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with NSCLC (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The ECT2 cancer-testis antigen is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report the characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells; nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring in the cytokinesis process. The expression level of DTL/RAMP protein became highest at G(1)/S phases, whereas its phosphorylation level was enhanced during the mitotic phase. Treatment of the breast cancer cells T47D and HBC4 with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrated the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in the growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for the treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for the development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in the cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of the RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused a significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in the growth of RCC and that the TMEM22/RAB37 expression or their interaction should be novel therapeutic targets for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)


We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of the FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex as well as activation of the Rac1-JNK cascade in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of the RhoA activity. Our results have strongly implied that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and the development of novel therapies for PDAC through the identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining the viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for the development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining the viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and that it could interact with adenine nucleotide translocase 2 (ANT2), which is involved in the maintenance of the mitochondrial membrane potential and energy homeostasis and has been suggested to play some role in apoptosis. These findings suggest that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage, some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show a very poor response to any anticancer therapy. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it should be a potential molecular target for the development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts identified, only 71 transcripts were reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression levels in thyroid cancer cells significantly attenuated the viability of thyroid cancer cells, indicating the significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in the peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac (α2,6) Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated that there is a higher frequency of serum apoC-III with loss of α2,6-linkage Neu5Ac residues in lung cancer patients compared to controls. Our results have demonstrated that lectin-coupled ProteinChip technology allows the high-throughput and specific recognition of cancer-associated aberrant glycosylations and implied a possibility of its applicability to studies on other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences, and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to the neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
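The accuracy measure described above, the percentage of patients whose predicted dose lies within 20% of the actual stable dose, can be sketched as follows. This is a minimal illustration and not the consortium's code: the function name `fraction_within_tolerance` and the example doses are hypothetical.

```python
def fraction_within_tolerance(predicted, actual, tolerance=0.20):
    """Return the fraction of patients whose predicted weekly dose lies
    within +/- tolerance (given as a fraction) of the actual stable dose."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one patient is required")
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)


# Hypothetical weekly doses in mg (illustrative values, not study data).
actual_doses = [20.0, 35.0, 50.0, 28.0]
predicted_doses = [23.0, 36.0, 38.0, 35.0]

share = fraction_within_tolerance(predicted_doses, actual_doses)
print(f"{share:.0%} of predictions fall within 20% of the actual dose")
```

Applying the same function separately to the patients requiring ≤21 mg or ≥49 mg per week would give subgroup proportions of the kind reported for the two algorithms.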

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5 and Hitoshi Zembutsu5: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Department of Surgery, Tokushima Breast Care Clinic, 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School, 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgical operation on breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 alleles revealed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 alleles. The elevated risk of recurrence seemed to be dependent on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4 and Yusuke Nakamura1,3: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant for limiting the use of the drug. To identify a genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan, and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by the scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia as compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
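The relation between the classification rates and the odds ratio reported above can be checked with a short calculation. The sketch below is illustrative only. The 2x2 counts are not stated in the text; they are inferred here from the reported percentages and the combined sample sizes (39 cases, 74 controls). The Woolf log-based interval is likewise an assumed choice of method, and the function name `odds_ratio_with_ci` is hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: cases with a risk score of 1 or 2,    b: cases with a score of 0
    c: controls with a risk score of 1 or 2, d: controls with a score of 0
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts inferred from the reported percentages (an assumption):
# 39 cases (28 + 11), of which 69.2% (27) had a score of 1 or 2;
# 74 controls (56 + 18), of which 75.7% (56) had a score of 0.
cases_high, cases_zero = 27, 12
controls_high, controls_zero = 18, 56

or_value, ci_low, ci_high = odds_ratio_with_ci(
    cases_high, cases_zero, controls_high, controls_zero
)
sensitivity = cases_high / (cases_high + cases_zero)
specificity = controls_zero / (controls_high + controls_zero)

print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
print(f"correctly classified: {sensitivity:.1%} of cases, {specificity:.1%} of controls")
```

Under these inferred counts the result is consistent with the reported odds ratio of 7.00, the 95% confidence interval of 2.95-16.59, and the 69.2% and 75.7% classification rates.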

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2 and Yusuke Nakamura1: 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine, Departments of 2Pathology and 3Medicine, Faculty of Medicine, and 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand, 5Center for International Cooperation, Department of Medical Sciences, 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health, 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1 and HLA-DPB1 by a sequence-based HLA typing method in the first set of samples, consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. An HLA-B*3505 allele revealed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash compared with only 1.1% observed in NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44, Pc=4.6×10⁻⁶] and 0.7% in the general Thai population (OR=29.87; 95% CI=5.04-175.86, Pc=2.6×10⁻⁵). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41, P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4 and Koichi Matsuda1: 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand, 4Center for Genomic Medicine, RIKEN, 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University, 6Department of Hepatology, Toranomon Hospital, 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese and one Thai cohorts consisting of 1,300 cases and 2,100 controls (combined P=6.34×10⁻³⁹ and 2.31×10⁻³⁸, OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301, OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401, OR=0.52 and 0.57, respectively). Our findings demonstrated that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection of hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9 and the Pirfenidone Clinical Study Group4: 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN), 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University, 5Fourth Department of Internal Medicine, Nippon Medical School, 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center, 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi, 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN), 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify a gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase, with IPF; a combination of the two data sets revealed a p value of 2.9×10⁻⁸ (GWA, 2.8×10⁻⁶; replication, 3.6×10⁻³). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9 and Draga I. Toncheva5: 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria, 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria, 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria, 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK, 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. A great number of studies have often yielded controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on the susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that were previously implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010, odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono, N., Kubo, M., Tsuchiya, Y., Sato, H., Kitamoto, T., Saito, S., Ohnishi, Y. and Nakamura, Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum. Mutation 29: 182-189, 2008.

2. Onouchi, Y., Gunji, T., Burns, J.C., Shimizu, C., Newburger, J.W., Yashiro, M., Nakamura, Yo., Yanagawa, H., Wakui, K., Fukushima, Y., Kishi, F., Hamamoto, K., Terai, M., Sato, Y., Ouchi, K., Saji, T., Nariai, A., Kaburagi, Y., Yoshikawa, T., Suzuki, K., Tanaka, T., Nagai, T., Cho, H., Fujino, A., Sekine, A., Nakamichi, R., Tsunoda, T., Kawasaki, T., Nakamura, Yu. and Hata, A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat. Genet. 40: 35-42, 2008.

3. Silva, F.P., Hamamoto, R., Kunizaki, M., Tsuge, M., Nakamura, Y. and Furukawa, Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama, K., Satoh, S., Hamamoto, R., Sakai, Y., Nakamura, Y. and Furukawa, Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin. Cancer Res. 14: 1333-1339, 2008.

5. Kato, M., Miya, F., Kanemura, Y., Tanaka, T., Nakamura, Y. and Tsunoda, T. Recombination rates of genes expressed in human tissues. Hum. Mol. Genet. 17: 577-586, 2008.

6. Leung, A.A.C., Wong, V.C.L., Yang, L.C., Chan, P.L., Daigo, Y., Nakamura, Y., Qi, R.Z., Miller, L., Liu, E.T.-K., Wang, L.D., J-LS Law, Tsao, W. and Lung, M.L. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int. J. Cancer 122: 587-594, 2008.

7. Shimo, A., Tanikawa, C., Nishidate, T., Matsuda, K., Lin, M.-L., Park, J.-H., Ohta, T., Hirata, K., Fukuda, M., Nakamura, Y. and Katagiri, T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci. 99: 62-70, 2008.

8. Uemura, M., Tamura, K., Chung, S., Honma, S., Okuyama, A., Nakamura, Y. and Nakagawa, H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci. 99: 81-86, 2008.

9. Kamatani, Y., Matsuda, K., Ohishi, T., Ohtsubo, S., Yamazaki, K., Iida, A., Hosono, N., Kubo, M., Yumura, W., Nitta, K., Katagiri, T., Kawaguchi, Y., Kamatani, N. and Nakamura, Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J. Hum. Genet. 53: 64-73, 2008.

10. Fukukawa, C., Hanaoka, H., Nagayama, S., Tsunoda, T., Toguchida, J., Endo, K., Nakamura, Y. and Katagiri, T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci. 99: 432-440, 2008.

11. Brunet, J., Pfaff, A.W., Abidi, A., Unoki, M., Nakamura, Y., Guinard, M., Klein, J.-P., Candolfi, E. and Mousli, M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell. Microbiol. 10: 908-920, 2008.

12. Kato, N., Miyata, T., Tabara, Y., Katsuya, T., Yanai, K., Hanada, H., Kamide, K., Nakura, J., Kohara, K., Takeuchi, F., Mano, H., Yasunami, M., Kimura, A., Kita, Y., Ueshima, H., Nakayama, T., Soma, M., Hata, A., Fujioka, A., Kawano, Y., Nakao, K., Sekine, A., Yoshida, T., Nakamura, Y., Saruta, T., Ogihara, T., Sugano, S., Miki, T. and Tomoike, H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum. Mol. Genet. 17: 617-627, 2008.

13. Oishi, T., Iida, A., Otsubo, S., Kamatani, Y., Usami, M., Takei, T., Uchida, K., Tsuchiya, K., Saito, S., Ohnishi, Y., Tokunaga, K., Nitta, K., Kawaguchi, Y., Kamatani, N., Kochi, Y., Shimane, K., Yamamoto, K., Nakamura, Y., Yumura, W. and Matsuda, K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J. Hum. Genet. 53: 151-162, 2008.

14. Daigo, Y. and Nakamura, Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani, K., Mushiroda, T., Kubo, M., Zembutsu, H., Sugiyama, Y. and Nakamura, Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci. 99: 967-972, 2008.

16. Kiyotani, K., Mushiroda, T., Sasa, M., Bando, Y., Sumitomo, I., Hosono, N., Kubo, M., Nakamura, Y. and Zembutsu, H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci. 99: 995-999, 2008.

17. Kato, T., Sato, N., Takano, A., Miyamoto, M., Nishimura, H., Tsuchiya, E., Kondo, S., Nakamura, Y. and Daigo, Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin. Cancer Res. 14: 2363-2370, 2008.

18. Tenesa, A., Farrington, S.M., Prendergast, J.G., Porteous, M.E., Walker, M., Haq, N., Barnetson, R.A., Theodoratou, E., Cetnarskyj, R., Cartwright, N., Semple, C., Clark, A.J., Reid, F.J., Smith, L.A., Kavoussanakis, K., Koessler, T., Pharoah, P.D., Buch, S., Schafmayer, C., Tepel, J., Schreiber, S., Völzke, H., Schmidt, C.O., Hampe, J., Chang-Claude, J., Hoffmeister, M., Brenner, H., Wilkening, S., Canzian, F., Capella, G., Moreno, V., Deary, I.J., Starr, J.M., Tomlinson, I.P., Kemp, Z., Howarth, K., Carvajal-Carmona, L., Webb, E., Broderick, P., Vijayakrishnan, J., Houlston, R.S., Rennert, G., Ballinger, D., Rozek, L., Gruber, S.B., Matsuda, K., Kidokoro, T., Nakamura, Y., Zanke, B.W., Greenwood, C.M., Rangrej, J., Kustra, R., Montpetit, A., Hudson, T.J., Gallinger, S., Campbell, H. and Dunlop, M.G. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat. Genet. 40: 631-637, 2008.

19. Mototani, H., Iida, A., Nakajima, M., Furuichi, T., Miyamoto, Y., Tsunoda, T., Sudo, A., Kotani, A., Uchida, K., Ozaki, K., Tanaka, Y., Nakamura, Y., Tanaka, T., Notoya, K. and Ikegawa, S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum. Mol. Genet. 17: 1790-1797, 2008.

20. Mizukami, Y., Kono, K., Daigo, Y., Takano, A., Tsunoda, T., Kawaguchi, Y., Nakamura, Y. and Fujii, H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci. 99: 1448-1454, 2008.

21. Mushiroda, T., Wattanapokayakit, S., Takahashi, A., Nukiwa, T., Kudoh, S., Ogura, T., Taniguchi, H., Pirfenidone Clinical Study Group, Kubo, M., Kamatani, N. and Nakamura, Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J. Med. Genet. 45: 654-656, 2008.

22. Hosokawa, M., Kashiwaya, K., Furihara, M., Eguchi, H., Ohigashi, H., Ishikawa, O., Shinomura, Y., Imai, K., Nakamura, Y. and Nakagawa, H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci. 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto, H., Yoshimura, K., Saeki, N., Katai, H., Shimoda, T., Matsuno, Y., Saito, D., Sugimura, H., Tanioka, F., Kato, S., Matsukura, N., Matsuda, N., Nakamura, T., Hyodo, I., Nishina, T., Yasui, W., Hirose, H., Hayashi, M., Toshiro, E., Ohnami, S., Sekine, A., Sato, Y., Totsuka, H., Ando, M., Takemura, R., Takahashi, Y., Ohdaira, M., Aoki, K., Honmyo, I., Chiku, S., Aoyagi, K., Sasaki, H., Ohnami, S., Yanagihara, K., Yoon, K.A., Kook, M.C., Lee, Y.S., Park, S.R., Kim, C.G., Choi, I.J., Yoshida, T., Nakamura, Y. and Hirohashi, S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat. Genet. 40: 730-740, 2008.

24. Ueki, T., Nishidate, T., Park, J.H., Lin, M.L., Shimo, A., Hirata, K., Nakamura, Y. and Katagiri, T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto, Y., Shi, D., Nakajima, M., Ozaki, K., Sudo, A., Kotani, A., Uchida, A., Tanaka, T., Fukui, N., Tsunoda, T., Takahashi, A., Nakamura, Y., Jiang, Q. and Ikegawa, S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat. Genet. 40: 994-998, 2008.

26. Unoki, H., Takahashi, A., Kawaguchi, T., Hara, K., Horikoshi, M., Andersen, G., Ng, D.P., Holmkvist, J., Borch-Johnsen, K., Jorgensen, T., Sandbaek, A., Lauritzen, T., Hansen, T., Nurbaya, S., Tsunoda, T., Kubo, M., Babazono, T., Hirose, H., Hayashi, M., Iwamoto, Y., Kashiwagi, A., Kaku, K., Kawamori, R., Tai, E.S., Pedersen, O., Kamatani, N., Kadowaki, T., Kikkawa, R., Nakamura, Y. and Maeda, S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat. Genet. 40: 1098-1102, 2008.

27. Harao, M., Hirata, S., Irie, A., Senju, S., Nakatsura, T., Komori, H., Ikuta, Y., Yokomine, K., Imai, K., Inoue, M., Harada, K., Mori, T., Tsunoda, T., Nakatsuru, S., Daigo, Y., Nomori, H., Nakamura, Y., Baba, H. and Nishimura, Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int. J. Cancer 123: 2616-2625, 2008.

28. Imai, K., Hirata, S., Irie, A., Senju, S., Ikuta, Y., Yokomine, K., Harao, M., Inoue, M., Tsunoda, T., Nakatsuru, S., Nakagawa, H., Nakamura, Y., Baba, H. and Nishimura, Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin. Cancer Res. 14: 6487-6495, 2008.

29. Nikolova, D.N., Zembutsu, H., Sechanov, T., Vidinov, K., Kee, L.S., Ivanova, R., Becheva, E., Kocova, M., Toncheva, D. and Nakamura, Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol. Rep. 20: 105-121, 2008.

30. Nakamura, Y. Pharmacogenomics and drug toxicity (Editorial). New Eng. J. Med. 359: 856-858, 2008.

31. Arita, K., Ariyoshi, M., Tochio, H., Nakamura, Y. and Shirakawa, M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue, H., Iga, M., Nabeta, H., Yokoo, T., Suehiro, Y., Okano, S., Inoue, M., Kinoh, H., Katagiri, T., Takayama, K., Yonemitsu, Y., Hasegawa, M., Nakamura, Y., Nakanishi, Y. and Tani, K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci. 99: 2315-2326, 2008.

33. Konda, R., Sugimura, J., Sohma, F., Katagiri, T., Nakamura, Y. and Fujioka, T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J. Urol. 180: 481-485, 2008.

34. Hotta, K., Nakata, Y., Matsuo, T., Kamohara, S., Kotani, K., Komatsu, R., Itoh, N., Mineo, I., Wada, J., Masuzaki, H., Yoneda, M., Nakajima, A., Miyazaki, S., Tokunaga, K., Kawamoto, M., Funahashi, T., Hamaguchi, K., Yamada, K., Hanafusa, T., Oikawa, S., Yoshimatsu, H., Nakao, K., Sakata, T., Matsuzawa, Y., Tanaka, K., Kamatani, N. and Nakamura, Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J. Hum. Genet. 53: 546-553, 2008.

35. Kato, M., Nakamura, Y. and Tsunoda, T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am. J. Hum. Genet. 83: 157-169, 2008.

36. Kato, M., Nakamura, Y. and Tsunoda, T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda, K., Miyake, K., Horikawa, Y., Hara, K., Osawa, H., Furuta, H., Hirota, Y., Mori, H., Jonsson, A., Sato, Y., Yamagata, K., Hinokio, Y., Wang, H.Y., Tanahashi, T., Nakamura, N., Oka, Y., Iwasaki, N., Iwamoto, Y., Yamada, Y., Seino, Y., Maegawa, H., Kashiwagi, A., Takeda, J., Maeda, E., Shin, H.D., Cho, Y.M., Park, K.S., Lee, H.K., Ng, M.C., Ma, R.C., So, W.Y., Chan, J.C., Lyssenko, V., Tuomi, T., Nilsson, P., Groop, L., Kamatani, N., Sekine, A., Nakamura, Y., Yamamoto, K., Yoshida, T., Tokunaga, K., Itakura, M., Makino, H., Nanjo, K., Kadowaki, T. and Kasuga, M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat. Genet. 40: 1092-1097, 2008.

38. Yamaguchi-Kabata, Y., Nakazono, K., Takahashi, A., Saito, S., Hosono, N., Kubo, M., Nakamura, Y. and Kamatani, N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am. J. Hum. Genet. 83: 445-456, 2008.

39. Okada, Y., Mori, M., Yamada, R., Suzuki, A., Kobayashi, K., Kubo, M., Nakamura, Y. and Yamamoto, K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J. Rheumatol. 35: 1723-1728, 2008.

40. Omori, S., Tanaka, Y., Takahashi, A., Hirose, H., Kashiwagi, A., Kaku, K., Kawamori, R., Nakamura, Y. and Maeda, S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility to type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa, K., Fujii, S., Yamazaki, T., Takahashi, A., Takasaki, J., Yanagisawa, M., Ohnishi, Y., Nakamura, Y. and Kamatani, N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J. Hum. Genet. 53: 789-801, 2008.

42. Chantarangsu, S., Mushiroda, T., Mahasirimongkol, S., Kiertiburanakul, S., Sungkanuparph, S., Manosuthi, W., Tantisiriwat, W., Charoenyingwattana, A., Sura, T., Chantratita, W. and Nakamura, Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet. Genomics 19: 139-146, 2009.

43. Suzuki, A., Yamada, R., Kochi, Y., Sawada, T., Okada, Y., Matsuda, K., Kamatani, Y., Mori, M., Shimane, K., Hirabayashi, Y., Takahashi, A., Tsunoda, T., Miyatake, A., Kubo, M., Kamatani, N., Nakamura, Y. and Yamamoto, K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat. Genet. 40: 1224-1229, 2008.

44. Yamazaki, K., Takahashi, A., Takazoe, M., Kubo, M., Onouchi, Y., Fujino, A., Kamatani, N., Nakamura, Y. and Hata, A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova, D.N., Doganov, N., Dimitrov, R., Angelov, K., Kee, L.S., Dimova, I., Toncheva, D., Nakamura, Y. and Zembutsu, H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol. Med. Rep. in press, 2008.

46. Hotta, K., Nakamura, M., Nakata, Y., Matsuo, T., Kamohara, S., Kotani, K., Komatsu, R., Itoh, N., Mineo, I., Wada, J., Masuzaki, H., Yoneda, M., Nakajima, A., Miyazaki, S., Tokunaga, K., Kawamoto, M., Funahashi, T., Hamaguchi, K., Yamada, K., Hanafusa, T., Oikawa, S., Yoshimatsu, H., Nakao, K., Sakata, T., Matsuzawa, Y., Tanaka, K., Kamatani, N. and Nakamura, Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J. Hum. Genet. 53: 857-862, 2008.

47. Iwahori, K., Osaki, T., Serada, S., Fujimoto, M., Suzuki, H., Kishi, Y., Yokoyama, A., Hamada, H., Fujii, Y., Yamaguchi, K., Hirashima, T., Matsui, K., Tachibana, I., Nakamura, Y., Kawase, I. and Naka, T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota, T., Harada, M., Sakashita, M., Doi, S., Miyatake, A., Fujita, K., Enomoto, T., Ebisawa, M., Yoshihara, S., Noguchi, E., Saito, H., Nakamura, Y. and Tamari, M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J. Allergy Clin. Immunol. 121: 769-770, 2008.

49. Harada, M., Hirota, T., Jodo, A.I., Doi, S., Kameda, M., Fujita, K., Miyatake, A., Enomoto, T., Noguchi, E., Yoshihara, S., Ebisawa, M., Saito, H., Matsumoto, K., Nakamura, Y., Ziegler, S.F. and Tamari, M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am. J. Respir. Cell Mol. Biol. 40: 368-374, 2009.

50. Sakashita, M., Yoshimoto, T., Hirota, T., Harada, M., Okubo, K., Osawa, Y., Fujieda, S., Nakamura, Y., Yasuda, K., Nakanishi, K. and Tamari, M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin. Exp. Allergy 38: 1875-1881, 2008.

51. Hirata, D., Yamabuki, T., Miki, D., Ito, T., Tsuchiya, E., Fujita, M., Hosokawa, M., Chayama, K., Nakamura, Y. and Daigo, Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin. Cancer Res. 15: 256-266, 2009.

52. Dobashi, S., Katagiri, T., Hirota, E., Ashida, S., Daigo, Y., Shuin, T., Fujioka, T., Miki, T. and Nakamura, Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol. Rep. 21: 305-312, 2009.

53. Zembutsu, H., Suzuki, Y., Sasaki, A., Tsunoda, T., Okazaki, M., Yoshimoto, M., Hasegawa, T., Hirata, K. and Nakamura, Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int. J. Oncol. 34: 361-370, 2009.

54. Nakamura, Y. DNA variations in human and medical genetics: 25 years of my experience (review). J. Hum. Genet. 54: 1-8, 2009.

55. Ozaki, K., Sato, H., Inoue, K., Tsunoda, T., Sakata, Y., Mizuno, H., Lin, T.-H., Miyamoto, Y., Aoki, A., Onouchi, Y., Sheu, S.-H., Ikegawa, S., Odashiro, K., Nobuyoshi, M., Juo, S.-H.H., Hori, M., Nakamura, Y. and Tanaka, T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat. Genet. in press, 2009.

56. Kashiwaya, K., Hosokawa, M., Eguchi, H., Ohigashi, H., Ishikawa, O., Shinomura, Y., Nakamura, Y. and Nakagawa, H. Identification of C2orf18, Termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci. 100: 457-464, 2009.

57. Nagayama, S., Yamada, E., Kohno, Y., Aoyama, T., Fukukawa, C., Kubo, H., Watanabe, G., Katagiri, T., Nakamura, Y., Sakai, Y. and Toguchida, J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci. in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K, Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.


Human Genome Center
Laboratory of Functional Genomics

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret the heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We therefore decompose the difficulty into parts, i.e., linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution for it individually, and we also attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, so their sample sizes must be in the thousands or more, which in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies of a heterogeneous, in other words structured, population tend to give false positive results. One method to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
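To make the asymptotic-versus-exact comparison concrete, the following is a minimal sketch, not the laboratory's own procedure. It assumes the asymptotic test is the Pearson chi-square test with 2 degrees of freedom and the exact test is the probability-ordered exact test with fixed margins; the function names and the example genotype counts are hypothetical, and all column totals are assumed to be non-zero.

```python
from math import comb, exp


def chi2_pvalue_2x3(table):
    """Asymptotic Pearson chi-square p-value for a 2x3 table (2 degrees of freedom)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = rows[i] * cols[j] / n  # assumes no empty row or column
            stat += (table[i][j] - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return exp(-stat / 2.0)


def exact_pvalue_2x3(table):
    """Exact p-value: total probability of all tables with the observed margins
    that are no more probable than the observed table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)

    def prob(a, b):
        # Multivariate hypergeometric probability of first row (a, b, c).
        c = rows[0] - a - b
        return comb(cols[0], a) * comb(cols[1], b) * comb(cols[2], c) / comb(n, rows[0])

    observed = prob(table[0][0], table[0][1])
    total = 0.0
    for a in range(min(cols[0], rows[0]) + 1):
        for b in range(min(cols[1], rows[0] - a) + 1):
            if rows[0] - a - b > cols[2]:
                continue  # third cell would exceed its column total
            p = prob(a, b)
            if p <= observed * (1 + 1e-9):  # small tolerance for float ties
                total += p
    return total


if __name__ == "__main__":
    # Hypothetical counts: rows are cases/controls, columns are the three genotypes.
    snp_table = [[5, 3, 2], [1, 3, 6]]
    print("asymptotic:", chi2_pvalue_2x3(snp_table))
    print("exact:     ", exact_pvalue_2x3(snp_table))
```

For small or unbalanced tables the two p-values can differ noticeably, which is the kind of discrepancy the paragraph above refers to.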

c. Geometric evaluation of SNP contingency table tests

We describe 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which uses the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is used to quantify genetic heterogeneity and in genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in order to support various studies based on genetic heterogeneity.
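For context, the conventional pairwise measures mentioned above can be sketched as follows. This illustrates only the standard two-marker statistics D, D' and r², computed from haplotype counts; it is not the multi-marker method under development, which the report does not specify. The function name and argument names are hypothetical.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two diallelic markers from the four haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    f_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at marker 1
    p_B = (n_AB + n_aB) / total  # frequency of allele B at marker 2

    # D: deviation of the observed AB frequency from independence.
    d = f_AB - p_A * p_B

    # D': D scaled by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the two markers.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical haplotype counts for two SNPs.
    print(pairwise_ld(40, 10, 10, 40))
```

Because these statistics summarize one pair of markers at a time, they cannot capture dependence among three or more markers, which is the limitation the paragraph above points out.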

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.


Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1273-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


Human Genome Center
Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our work includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares with lower eukaryotes a unique mechanism called trans-splicing. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 2.53 in "gonad" to 19.53 in "endostyle", and during development they increased from 1.68 at the "egg" stage to 7.55 at the "juvenile" stage. We hypothesize that this enrichment of trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To investigate this phenomenon further, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation (DBTGR) was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi, obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and on their positioning, both relative to the transcription start site and relative to each other in pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues on the basis of the structure of their regulatory regions. We are developing the model further, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe², and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³, and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to depicting the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from distinguishing direct from indirect regulation. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but that have orthologues with known binding sites, we predict the sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make up the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴, and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it, according to the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
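The way a pseudocount enters a PWM, as described above, can be sketched as follows. This is a minimal illustration under stated assumptions: the pseudocount is split among the four bases in proportion to the background composition, the log is taken to base 2, and the background is uniform in the example. The function name and the toy counts are hypothetical; only the default value of 0.8 comes from the text.

```python
from math import log2


def pwm_from_counts(counts, background, pseudocount=0.8):
    """Build a log-likelihood-ratio PWM from per-position base counts.

    counts:      list of dicts, one per position, mapping base -> observed count
    background:  dict mapping base -> background frequency (should sum to 1)
    pseudocount: total pseudocount added to each position
    """
    pwm = []
    for column in counts:
        n_sites = sum(column.values())
        scores = {}
        for base, bg in background.items():
            # Each base receives its background share of the pseudocount.
            freq = (column.get(base, 0) + pseudocount * bg) / (n_sites + pseudocount)
            scores[base] = log2(freq / bg)
        pwm.append(scores)
    return pwm


if __name__ == "__main__":
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    # Hypothetical counts from four aligned binding sites, two positions wide.
    site_counts = [{"A": 4}, {"C": 2, "G": 2}]
    for position, scores in enumerate(pwm_from_counts(site_counts, uniform)):
        print(position, {b: round(s, 3) for b, s in scores.items()})
```

Without the pseudocount, a base never observed at a position would get a score of minus infinity; with it, the penalty is finite and shrinks as the pseudocount grows, which is why the choice of value matters most for small samples.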

7. Definition and analysis of alternative promoters using a large amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, and Kenta Nakai

To support transcriptional studies, we have constructed the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a large number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) obtained with a SOLEXA sequencer. Here, we analyzed alternative promoters with these data. From these data, we obtained 75,918 promoters, which could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell type under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur downstream of 1st promoters.
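The definition of 1st and 2nd promoters by tag count can be sketched as below. This is only an illustration: the input format, the function names, and in particular the definition of "discrepancy" (the fraction of shared genes whose 1st/2nd promoter assignment differs between two samples) are assumptions, since the report does not give the exact formula.

```python
from collections import defaultdict


def rank_promoters(tag_counts):
    """Map each gene to (1st promoter, 2nd promoter) by descending tag count.

    tag_counts: iterable of (gene, promoter_id, n_tags) tuples.
    Genes with a single promoter get None as their 2nd promoter.
    """
    by_gene = defaultdict(list)
    for gene, promoter, n_tags in tag_counts:
        by_gene[gene].append((n_tags, promoter))
    ranked = {}
    for gene, items in by_gene.items():
        # Stable sort: promoters with equal tag counts keep their input order.
        ordered = sorted(items, key=lambda item: item[0], reverse=True)
        first = ordered[0][1]
        second = ordered[1][1] if len(ordered) > 1 else None
        ranked[gene] = (first, second)
    return ranked


def discrepancy(ranked_a, ranked_b):
    """Fraction of genes present in both samples whose (1st, 2nd) pair differs."""
    shared = [gene for gene in ranked_a if gene in ranked_b]
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if ranked_a[gene] != ranked_b[gene])
    return differing / len(shared)


if __name__ == "__main__":
    # Hypothetical tag counts for two cell types.
    cell_a = [("g1", "p1", 50), ("g1", "p2", 10), ("g2", "p3", 5), ("g2", "p4", 20)]
    cell_b = [("g1", "p1", 30), ("g1", "p2", 40), ("g2", "p3", 1), ("g2", "p4", 9)]
    print(discrepancy(rank_promoters(cell_a), rank_promoters(cell_b)))
```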

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita, and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for analyses of transcription and replication. It has been reported previously that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common among 16 eukaryotes, but two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After Alu elements were masked, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent, recently published, large-scale sets of nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
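The idea of detecting a dinucleotide periodicity by Fourier analysis can be illustrated with the short sketch below. It is a simplified stand-in, not the analysis used in the study: it evaluates the power of a single Fourier component of the occurrence positions of one dinucleotide, and the function name, the normalization by sequence length, and the synthetic test sequence are all assumptions made for illustration.

```python
import cmath


def periodicity_power(sequence, dinucleotide, period):
    """Power of the Fourier component with the given period (in bp) for the
    occurrence positions of a dinucleotide, normalized by sequence length."""
    positions = [
        i for i in range(len(sequence) - 1) if sequence[i:i + 2] == dinucleotide
    ]
    # Each occurrence contributes a unit vector whose angle is its phase
    # within the period; in-phase occurrences add up, others cancel out.
    component = sum(cmath.exp(-2j * cmath.pi * i / period) for i in positions)
    return abs(component) ** 2 / len(sequence)


if __name__ == "__main__":
    # Synthetic sequence with an "AA" exactly every 10 bp.
    synthetic = "AACGCGCGCG" * 20
    for period in (7, 10, 13):
        print(period, round(periodicity_power(synthetic, "AA", period), 4))
```

In this toy sequence the 10-bp component dominates the others by a wide margin; on real genome fragments one would scan a range of periods and look for peaks such as the 84-bp and 167-bp ones reported above.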

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has mostly been estimated by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate owing to the effects of evolution. We therefore tried to detect the minimal set of cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of conserved pathway maps and organism-specific pathway maps. The conserved pathway maps are those that contain more orthologous genes among all pathway maps, as estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from external nutrients alone, as living cells do. The organism-specific pathway maps are then detected as those that can synthesize, from nutrients, the compounds required by the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us clues for predicting the structures of protein complexes. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, and K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida, and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes in the PDB for the same protein. A single protein complex is therefore not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, taking multiple complexes into consideration at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works in typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷, and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions owing to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida, and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect on amino acid composition of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR fell below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structure without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Identifying disordered regions is therefore important both for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for it. The method predicts the disorder tendency of each residue using support vector machines applied to the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, and Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configuration of the PPIase active site against known protein structures and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that catalyzes ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: a co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, and K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as the targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression in higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database; http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the GO networks or tissue networks became too big for a static picture on the web, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing human and mouse coexpression patterns to estimate the conservation between the two species.
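As a rough illustration of how a list of "highly coexpressed genes for every gene" can be derived from expression profiles, the sketch below ranks genes by the Pearson correlation of their profiles with a query gene. The choice of Pearson correlation is an assumption made for this example, since the text above does not name the coexpression measure; the function names, gene names, and expression values are likewise hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no coexpression information
    return cov / sqrt(var_x * var_y)


def top_coexpressed(expression, query, k=2):
    """Return the k genes most strongly correlated with the query gene,
    as (gene, correlation) pairs in descending order of correlation."""
    scores = [
        (pearson(expression[query], profile), gene)
        for gene, profile in expression.items()
        if gene != query
    ]
    scores.sort(reverse=True)
    return [(gene, r) for r, gene in scores[:k]]


if __name__ == "__main__":
    # Hypothetical expression values across four samples.
    expression = {
        "geneA": [1, 2, 3, 4],
        "geneB": [2, 4, 6, 8],
        "geneC": [4, 3, 2, 1],
        "geneD": [1, 3, 2, 4],
    }
    print(top_coexpressed(expression, "geneA"))
```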

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida, and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also under way.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉,木下賢吾,伊藤暢聡.ペプチジルプロリルイソメラーゼの構造機能相関.蛋白質核酸酵素 54:167―172,2009.

木下賢吾.立体構造からのタンパク質機能予測:現状と展望.遺伝子医学MOOK 14号,in press.

中井謙太,ポール・ホートン.第3章 3.アミノ酸配列に基づくタンパク質の細胞内局在予測.実験医学増刊 vol. 26:1106―1112,2008.

中井謙太.タンパク質のシステム生物学.猪飼,伏見,卜部,上野川,中村,浜窪編.タンパク質の事典.朝倉書店,575―578,2008.


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and “minority-centered” scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the “BioBank Japan” project from ethical, legal, and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study on research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulatory approach between South Korea and Japan. After the fabrication in Hwang Woo-suk's stem cell cloning research and the unethical collection of human eggs, the bioethics law was revised, and the government is seeking stricter regulation of life science and healthcare. We found some correlations between the political options on stem cell research and on genetic testing in terms of regulations in East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK, and South Korea, we identified the current problems that tend to stall the research review process, so as to assure the quality of ethical discussions. Since February 3, 2009, Ayako Kamisato has assumed the main responsibility for “bench consulting” regarding consent, research protocols, and pre-review on research ethics for all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy 公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

3. Ethical, legal and social support for “BioBank Japan” project

To support the “BioBank Japan” project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine, IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government's ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for “personalized medicine,” we should think further about constructing a “personalized consent process,” and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate the MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, although MCs explain that this project has no plans to disclose personal genotyped data to each participant, a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To show our appreciation for the contributions of participants and MCs to the project, we issued “BioBank newsletters” three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current BioBank newsletters are accessible only to sighted people with good eyesight, we are making efforts toward personalized information security that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim at sharing public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called “SciArt Café,” twice in 2008. The intent of “SciArt Café” is to promote communication between scientists and those who do not regularly engage with science but love art. The 1st session, called “Rhythm generated by network,” was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called “Doing science, doing art,” was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2. 洪賢秀.韓国社会における子どもの「性保護」と性犯罪防止対策.比較法研究70号,2009,印刷中.

3. 神里彩子,成澤光編著.生殖補助医療 生命倫理と法―基本資料集3.信山社,21―123,262―308,2008.

4. 張瓊方.諸外国における生殖補助医療の規制状況と実施状況(台湾).生殖補助医療 生命倫理と法―基本資料集3.神里彩子,成澤光編.信山社,323―334,2008.

5. 大上泰弘,神里彩子,城山英明.イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆.社会技術研究論文集5号,132―142,2008.

6. 大上泰弘,成廣孝,神里彩子,城山英明,打越綾子.日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析.ヒトと動物の関係学会誌20号,66―73,2008.

7. 大上泰弘,神里彩子,城山英明.イギリスにおける動物の実験規制を支えている思考様式.科学技術社会論研究5号,84―92,2008.

8. 渡部麻衣子,上田昌文.人の必要を充足する科学技術福祉工学における開発現場の分析.科学技術社会研究,138―151,2008.

9. 武藤香織.「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―.日米の医療―制度と倫理.杉田米行編.大阪大学出版会,203―224,2008.

10. 武藤香織.DNA親子鑑定は「ふしだらな」女性にとっての救済策か.ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま.舘かおる編.作品社,238―264,2008.

11. 洪賢秀.研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―.ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま.舘かおる編.作品社,196―214,2008.

12. 張瓊方.生殖技術と台湾社会.ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま.舘かおる編.作品社,215―222,2008.

13. 三村恭子,小門穂,武藤香織,張瓊方,洪賢秀,柘植あづみ.女性にやさしい機械のつくられ方―内診台を例にして.ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま.舘かおる編.作品社,223―240,2008.

14. 神里彩子.生殖補助医療をめぐる議論―その回顧と展望―.家永登編『生殖技術と家族』早稲田大学出版部,42―71,2008.

15. 渡部麻衣子,上田昌文編訳.エンハンスメント論争身体精神の増強と先端科学技術.社会評論社,2008.


Ryo Yoshida1, Masao Nagasaki, Rui Yamaguchi, Seiya Imoto, Satoru Miyano, Tomoyuki Higuchi1

Mathematical modeling and simulation based on biochemical rate equations provide a rigorous tool for unraveling the complex mechanisms of biological pathways. To proceed to simulation experiments, an essential first step is to find effective values of model parameters, which are difficult to measure in in vivo and in vitro experiments. Furthermore, once a set of hypothetical models has been created, a statistical criterion is needed to test the ability of the constructed models and to proceed to model revision. We developed a new statistical technology for data-driven construction of in silico biological pathways. The method starts with knowledge-based modeling with a hybrid functional Petri net. It then proceeds to Bayesian learning of the model parameters for which experimental data are available. This process exploits quantitative measurements of evolving biochemical reactions, e.g., gene expression data. Another important issue that we consider is the statistical evaluation and comparison of the constructed hypothetical pathways. For this purpose, we have developed a new Bayesian information-theoretic measure that assesses the predictability and the biological robustness of in silico pathways.

f. Modeling nonlinear gene regulatory networks from time series gene expression data

André Fujita, João Ricardo Sato5, Humberto Miguel Garay-Malpartida5, Mari Cleide Sogayar5, Carlos Eduardo Ferreira5, Satoru Miyano: 5University of São Paulo

In cells, molecular networks such as gene regulatory networks are the basis of biological complexity. Therefore, gene regulatory networks have become the core of research in systems biology. Understanding the processes underlying the several extracellular regulators, signal transduction, protein-protein interactions, and differential gene expression processes requires a detailed molecular description of the protein and gene networks involved. To better understand these complex molecular networks and to infer new regulatory associations, we developed a statistical method based on vector autoregressive models and Granger causality to estimate nonlinear gene regulatory networks from time series microarray data. Most of the models available in the literature assume linearity in the inference of gene connections; moreover, these models do not infer directionality in these connections. Thus, a priori biological knowledge is required. However, in pathological cases, no a priori biological information is available. To overcome these problems, we present the nonlinear vector autoregressive (NVAR) model. We have applied the NVAR model to estimate nonlinear gene regulatory networks based entirely on gene expression profiles obtained from DNA microarray experiments. We show the results obtained by NVAR through several simulations and by the construction of three actual gene regulatory networks (p53, NF-κB, and c-Myc) for HeLa cells.
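The idea of testing a directed, nonlinear influence with Granger causality can be sketched as follows. This is a minimal illustration and not the NVAR implementation itself: the cubic polynomial basis, the single lag, the function names, and the simulated series are all assumptions. A regulator x is said to Granger-cause a target y if adding the basis-expanded past of x to a model of y reduces the residual error significantly, which an F statistic summarises.

```python
import numpy as np

def basis(v, degree=3):
    """Polynomial basis expansion of a lagged series (assumed basis)."""
    return np.column_stack([v ** d for d in range(1, degree + 1)])

def rss(X, y):
    """Residual sum of squares and parameter count of a least-squares fit."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    r = y - X1 @ beta
    return float(r @ r), X1.shape[1]

def granger_f(x, y, degree=3):
    """F statistic for 'x Granger-causes y' using one lag."""
    y_t, y_lag, x_lag = y[1:], y[:-1], x[:-1]
    rss_r, p_r = rss(basis(y_lag, degree), y_t)                 # past of y only
    full = np.column_stack([basis(y_lag, degree), basis(x_lag, degree)])
    rss_f, p_f = rss(full, y_t)                                 # past of y and x
    n = len(y_t)
    return ((rss_r - rss_f) / (p_f - p_r)) / (rss_f / (n - p_f))

# Simulated toy series: y depends nonlinearly on the past of x, not vice versa.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.3 * y[t - 1] + x[t - 1] ** 2 + 0.1 * rng.normal()

f_xy = granger_f(x, y)   # large: x helps predict y
f_yx = granger_f(y, x)   # small: y does not help predict x
```

The asymmetry between `f_xy` and `f_yx` is what gives the inferred edge its direction. A purely linear model would miss this particular dependence, because x squared is uncorrelated with x.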

g. Fast grid layout algorithm for biological networks with sweep calculation

Kaname Kojima, Masao Nagasaki, Satoru Miyano

Properly drawn biological networks are of great help in comprehending their characteristics. The quality of the layouts of retrieved biological networks is critical for pathway databases. However, since it is unrealistic to manually draw biological networks for every retrieval, automatic drawing algorithms are essential. Grid layout algorithms handle various biological properties, such as aligning vertices that have the same attributes and complicated positional constraints according to their subcellular localizations; thus, they succeed in providing biologically comprehensible layouts. However, existing grid layout algorithms are not suitable for real-time drawing, which is one of the requisites for application to pathway databases, because of their high computational cost. In addition, they do not consider edge directions, and their resulting layouts lack traceability for biochemical reactions and gene regulations, which are the most important features of biological networks. We devised a new calculation method, termed sweep calculation, and reduced the time complexity of the current grid layout algorithms through its encoding and decoding processes. We conducted practical experiments using 95 pathway models of various sizes from TRANSPATH and showed that our new grid layout algorithm is much faster than existing grid layout algorithms. For the cost function, we introduced a new component that penalizes undesirable edge directions, to avoid the lack of traceability in pathways caused by differences in direction between the in-edges and out-edges of each vertex.
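A cost component that penalizes direction changes between the in-edges and out-edges of a vertex might look like the sketch below. The exact form of the penalty is not given in the text, so the "1 minus cosine" term, the Manhattan edge-length term, the `weight` parameter, and all names are hypothetical. The sweep calculation itself is not shown.

```python
import math

def direction_penalty(pos, edges):
    """Sum of (1 - cos angle) over every in-edge/out-edge pair of each vertex.

    pos   : dict vertex -> (column, row) grid position, all distinct
    edges : list of directed edges (source, target)
    """
    total = 0.0
    for v in pos:
        ins = [(pos[v][0] - pos[u][0], pos[v][1] - pos[u][1])
               for u, w in edges if w == v]
        outs = [(pos[w][0] - pos[v][0], pos[w][1] - pos[v][1])
                for u, w in edges if u == v]
        for a in ins:
            for b in outs:
                cos = (a[0] * b[0] + a[1] * b[1]) / (math.hypot(*a) * math.hypot(*b))
                total += 1.0 - cos      # 0 when the path continues straight
    return total

def layout_cost(pos, edges, weight=1.0):
    """Manhattan edge length plus the weighted direction penalty."""
    length = sum(abs(pos[u][0] - pos[w][0]) + abs(pos[u][1] - pos[w][1])
                 for u, w in edges)
    return length + weight * direction_penalty(pos, edges)

edges = [("A", "B"), ("B", "C")]
straight = {"A": (0, 0), "B": (0, 1), "C": (0, 2)}   # A -> B -> C in a line
bent = {"A": (0, 0), "B": (0, 1), "C": (1, 1)}       # same length, 90 degree turn

cost_straight = layout_cost(straight, edges)
cost_bent = layout_cost(bent, edges)
```

Both layouts have the same total edge length, so only the direction term separates them, and the straight layout, which is easier to trace, receives the lower cost.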


h. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data

Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

Recently, a nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters because of the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation, and the size of the gene set must be strongly limited. To address this limitation, we employed the L1 regularization technique to estimate NVAR. Under L1 regularization, the direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than with existing methods. Through a simulation study, we verified the effectiveness of the proposed method by comparing its limitation in the number of genes with that of the existing NVAR. The proposed method was also applied to time series microarray data of the human HeLa cell cycle.
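How L1 regularization can pick out a few parents when there are more parameters than samples can be illustrated with a small lasso solver. This is a sketch under stated assumptions: the proximal gradient (ISTA) solver, the value of `lam`, and the random toy design matrix standing in for lagged, basis-expanded expression values are all chosen for illustration and are not taken from the method described above.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=2000):
    """Minimise ||y - Xb||^2 / (2n) + lam * ||b||_1 by proximal gradient."""
    n, p = X.shape
    step = n / (np.linalg.norm(X, 2) ** 2)     # 1 / Lipschitz constant
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n           # gradient of the squared loss
        z = b - step * grad
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return b

# Toy problem: 30 candidate regulators but only 20 time points (p > n).
rng = np.random.default_rng(1)
n, p = 20, 30
X = rng.normal(size=(n, p))                    # stand-in for lagged expression
y = 2.0 * X[:, 2] - 1.5 * X[:, 7] + 0.01 * rng.normal(size=n)

b = lasso_ista(X, y, lam=0.3)
# The two largest coefficients in magnitude are taken as the selected parents.
parents = sorted(int(j) for j in np.argsort(-np.abs(b))[:2])
```

Ordinary least squares has no unique solution here because p exceeds n. The L1 penalty resolves this by driving most coefficients to exactly zero, and in practice `lam` would be chosen by cross-validation or an information criterion.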

i. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates

André Fujita, Luciana Rodrigues Gomes5, João Ricardo Sato6, Rui Yamaguchi, Carlos Eduardo Thomaz7, Mari Cleide Sogayar5, Satoru Miyano: 6Universidade Federal do ABC, 7Centro Universitário da FEI

Principal Component Analysis (PCA) combined with Maximum-entropy Linear Discriminant Analysis (MLDA) was applied to identify the genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely (i) differences in gene expression levels between normal and tumoral conditions from a univariate point of view, (ii) in a multivariate fashion using MLDA, and (iii) with a dependence network approach. Our results show that malignant transformation in the prostatic tissue is more related to functional connectivity changes in the dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1, and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions and were also classified as the top seven most informative genes for the prostate cancer genesis process by our discriminant analysis. Moreover, among the identified genes we found classically known biomarkers and genes closely related to tumoral prostate, such as KLK3 and KLK2, and several other potential ones. We have demonstrated that changes in functional connectivity may be implicit in the biological process that renders some genes more informative for discriminating between normal and tumoral conditions. Using the proposed method, namely MLDA, to analyze the multivariate characteristics of genes, it was possible to capture the changes in dependence networks that are related to cell transformation.

j. Rule-based reasoning for system dynamics in cell systems

Euna Jeong, Masao Nagasaki, Satoru Miyano

A system-dynamics-centered ontology, called the Cell System Ontology (CSO), has been developed for the representation of diverse biological pathways. Many of the pathway data based on the ontology have been created from databases via data conversion or curated by expert biologists. It is essential to validate the pathway data, which may contain unexpected issues such as semantic inconsistency and incompleteness. This paper discusses three criteria for validating pathway data based on CSO, as follows: (1) structurally correct models in terms of Petri nets, (2) biologically correct models that capture biological meaning, and (3) systematically correct models that reflect biological behaviors. Simultaneously, we have investigated how logic-based rules can be used to extend the expressiveness of the ontology and to complement it by reasoning, which aims at qualifying pathway knowledge. Finally, we show how the proposed approach helps in exploring dynamic modeling and simulation tasks without prior knowledge.

k. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human

Yosuke Hatanaka, Masao Nagasaki, Rui Yamaguchi, Takeshi Obayashi, Kazuyuki Numata, André Fujita, Teppei Shimamura, Yoshinori Tamada, Seiya Imoto, Kengo Kinoshita, Kenta Nakai, Satoru Miyano

We reported various transcription factor binding sites (TFBSs) conserved among co-expressed genes in human promoter regions, using expression and genomic data. Assuming that similar promoter structures induce similar transcriptional regulation, and hence similar expression profiles, we compared the promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start sites (TSSs) given by DBTSS, and promoter similarity searches were conducted among the coexpressed gene data provided by the newly developed COXPRESdb. By combining Position Weight Matrix (PWM) motif prediction with the bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket method analysis to seek combinatorial activities of those conserved TFBSs.
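PWM motif prediction, the first ingredient above, can be sketched as a log-odds scan along a promoter sequence. The count matrix, pseudocount, uniform background, and function names below are illustrative assumptions, and the bootstrap significance step is not shown.

```python
import math

BASES = "ACGT"

def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores."""
    pwm = []
    for col in counts:
        total = sum(col.get(b, 0) for b in BASES) + pseudocount * len(BASES)
        pwm.append({
            b: math.log2(((col.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def best_hit(seq, pwm):
    """Return (position, score) of the highest-scoring window in seq."""
    width = len(pwm)
    scores = [
        (sum(pwm[k][seq[i + k]] for k in range(width)), i)
        for i in range(len(seq) - width + 1)
    ]
    score, pos = max(scores)
    return pos, score

# Hypothetical count matrix for a TATA-like motif built from 8 aligned sites.
counts = [{"T": 8}, {"A": 8}, {"T": 8}, {"A": 8}]
pwm = pwm_from_counts(counts)
pos, score = best_hit("GGCTATAAGC", pwm)
```

In a genome-wide setting, each window score would be compared with scores from resampled or shuffled sequences to decide which hits are statistically significant.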

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou8, Yuto Ikegami8, Hiroshi Matsuno8, Satoru Miyano, Shin-ichi T. Inouye8: 8Yamaguchi University

Circadian rhythms of living organisms are 24-hr oscillations found in behavior, biochemistry, and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observation that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time-zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando9, Sadanori Konishi10, Seiya Imoto: 9Keio University, 10Kyushu University

The problem of constructing nonlinear regression models is investigated to analyze data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlap between basis functions and adopts the information of the input and response variables. By using these radial basis functions, we constructed nonlinear regression models with the help of the regularization technique. Crucial issues in the model building process are the choices of the hyperparameter, the number of basis functions, and a smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification, both for distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained by the use of the hyperparameter in the basis functions.
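A regularized radial basis function regression of this general kind can be sketched in a few lines. The Gaussian form of the basis, the way the hyperparameter `nu` scales the width to control overlap, the ridge penalty, and the fixed, evenly spaced centers are assumptions for illustration. The information-theoretic criteria for choosing `nu`, the number of basis functions, and the smoothing parameter `lam` are not shown.

```python
import numpy as np

def rbf_design(x, centers, width, nu=1.0):
    """Gaussian basis matrix; larger nu widens the bumps and their overlap."""
    d2 = (x[:, None] - centers[None, :]) ** 2
    return np.exp(-d2 / (2.0 * nu * width ** 2))

def fit_rbf(x, y, centers, width, nu, lam):
    """Ridge-regularized least squares for the basis weights."""
    Phi = rbf_design(x, centers, width, nu)
    A = Phi.T @ Phi + lam * np.eye(len(centers))
    return np.linalg.solve(A, Phi.T @ y)

# Toy data: a noisy sine curve.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
truth = np.sin(2.0 * np.pi * x)
y = truth + 0.1 * rng.normal(size=x.size)

centers = np.linspace(0.0, 1.0, 8)       # number of basis functions: 8
width = centers[1] - centers[0]
w = fit_rbf(x, y, centers, width, nu=1.0, lam=1e-3)
pred = rbf_design(x, centers, width, nu=1.0) @ w
rmse = float(np.sqrt(np.mean((pred - truth) ** 2)))
```

Refitting with several values of `nu` and `lam` and comparing a model selection criterion across the fits is the step that the criteria described above are designed to support.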

b. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome

Jin Hwan Do, Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for the characterization of fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most of the prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, the homology-based approach has an intrinsic limitation for unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profiles of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile: the DNA regions including secondary metabolic signature genes, such as polyketide synthase, nonribosomal peptide synthase, and/or dimethylallyl tryptophan synthase, consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb long. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by a transcription factor, LaeA. Our result could be used for the identification of other fungal secondary metabolite gene clusters, especially those that are severely regulated by LaeA or other proteins with a function similar to that of LaeA.
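The window-based profiling can be illustrated for the GC component, which needs only base counts. The window and step sizes and the toy sequence below are assumptions, and the DNA curvature calculation, which requires a dinucleotide geometry model, is not reproduced here.

```python
def gc_profile(seq, window, step):
    """Return (start, GC fraction) for each sliding window along seq."""
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        gc = (w.count("G") + w.count("C")) / window
        profile.append((start, gc))
    return profile

# Toy sequence: an AT-rich stretch followed by a GC-rich stretch.
profile = gc_profile("ATATATATGCGCGCGC", window=8, step=4)
```

A window-averaged curvature profile would follow the same sliding-window pattern, replacing the GC fraction with the mean of per-position curvature values, after which regions below a threshold such as 0.18 could be flagged.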

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida1, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step in finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than can be handled and analyzed on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data, to detect transcripts that have significantly different splicing patterns in two cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis of more than 300,000 transcripts in exon array data, which has the potential to reveal aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require installation of any software other than an internet browser. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full dataset of exon array data within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.

Publications

1. Ando, T., Konishi, S., Imoto, S. Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference. 138 (11): 3616-3633, 2008.

2. Brazma, A., Miyano, S., Akutsu, T. Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008). Imperial College Press, 2008.

3. Do, J.H., Miyano, S. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome. Applied Microbiology and Biotechnology. 80 (5): 841-847, 2008.

4. Fujita, A., Gomes, L.R., Sato, J.R., Yamaguchi, R., Thomaz, C.E., Sogayar, M.C., Miyano, S. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Systems Biology. 2: 106, 2008.

5. Fujita, A., Sato, J.R., Garay-Malpartida, H.M., Sogayar, M.C., Ferreira, C.E., Miyano, S. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinformatics and Computational Biology. 6 (5): 961-979, 2008.

6. Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Fujita, A., Shimamura, T., Tamada, Y., Imoto, S., Kinoshita, K., Nakai, K., Miyano, S. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human. Genome Informatics. 20: 212-221, 2008.

7. Hirose, O., Yoshida, R., Imoto, S., Yamaguchi, R., Higuchi, T., Charnock-Jones, D.S., Print, C., Miyano, S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics. 24 (7): 932-942, 2008.

8. Hirose, O., Yoshida, R., Yamaguchi, R., Imoto, S., Higuchi, T., Miyano, S. Analyzing time course gene expression data with biological and technical replicates to estimate gene networks by state space models. Proc. 2nd Asia International Conference on Modelling & Simulation. 940-946, 2008. (AMS 2008; Refereed conference)

9. Jeong, E., Nagasaki, M., Miyano, S. Rule-based reasoning for system dynamics in cell systems. Genome Informatics. 20: 25-36, 2008.

10. Kitakaze, H., Kanda, M., Nakatsuka, H., Ikeda, N., Matsuno, H., Miyano, S. Prediction of fragile points for robustness checking of cell systems. IEICE TRANSACTIONS on Information and Systems D. J91-D (9): 2404-2417, 2008.

11. Knapp, E.-W., Benson, G., Holzhutter, H.-G., Kanehisa, M., Miyano, S. (Eds.) Genome Informatics 20, 2008.

12. Kojima, K., Fujita, A., Shimamura, T., Imoto, S., Miyano, S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics. 20: 37-51, 2008.

13. Kojima, K., Nagasaki, M., Miyano, S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics. 24 (12): 1426-1432, 2008.

14. Mito, N., Ikegami, Y., Matsuno, H., Miyano, S., Inouye, S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics. 21: 212-223, 2008.

15. Nagasaki, M., Saito, A., Chen, L., Jeong, E., Miyano, S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology. 2: 53, 2008.

16. Niida, A., Smith, A.D., Imoto, S., Tsutsumi, S., Aburatani, H., Zhang, M.Q., Akiyama, T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics. 9: 404, 2008.

17. Numata, K., Yoshida, R., Nagasaki, M., Saito, S., Imoto, S., Miyano, S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics. 9: 494, 2008.

18. Numata, K., Imoto, S., Miyano, S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering. IEEE Computer Society, 357-360, 2008. (BIBM 2008; Refereed conference)

19. Perrier, E., Imoto, S., Miyano, S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research. 9: 2251-2286, 2008.

20. Yamaguchi, R., Imoto, S., Yamauchi, M., Nagasaki, M., Yoshida, R., Shimamura, T., Hatanaka, Y., Ueno, K., Higuchi, T., Gotoh, N., Miyano, S. Predicting differences in gene regulatory systems by state space models. Genome Informatics. 21: 101-113, 2008.

21. Yoshida, R., Nagasaki, M., Yamaguchi, R., Imoto, S., Miyano, S., Higuchi, T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics. 24 (22): 2592-2601, 2008.

Human Genome Center

Laboratory of Molecular Medicine (ゲノムシークエンス解析分野)
Laboratory of Genome Technology (シークエンス技術開発分野)

Professor Yusuke Nakamura (中村祐輔), M.D., Ph.D.
Associate Professor Toyomasa Katagiri (片桐豊雅), Ph.D.
Associate Professor Yataro Daigo (醍醐弥太郎), M.D., Ph.D.
Assistant Professor Ryuji Hamamoto (浜本隆二), Ph.D.
Assistant Professor Koichi Matsuda (松田浩一), M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu (前佛均), M.D., Ph.D.

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P=0.0053) and poorer prognosis of non-small cell lung cancer patients (P=0.0045). It was also shown to be an independent prognostic factor (P=0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data imply that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in the tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) by cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with non-small cell lung cancer (NSCLC) (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. ECT2, a cancer-testis antigen, is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells; nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring in the cytokinesis process. The expression level of DTL/RAMP protein became highest at G(1)/S phases, whereas its phosphorylation level was enhanced during mitotic phase. Treatment of breast cancer cells, T47D and HBC4, with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrate the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase, Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in growth of RCC and that inhibition of TMEM22/RAB37 expression or their interaction could be a novel therapeutic approach for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)

We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of the FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex as well as activation of the Rac1-JNK cascade in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of the RhoA activity. Our results strongly imply that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 in culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of CST6 proteinase inhibitor and can intracellularly function as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and that it could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been implicated in apoptosis. These findings suggest that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage some prostate cancers eventually acquire a castration-resistant phenotype, where they progress aggressively and show very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it could be a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts identified, only 71 transcripts were reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma, though were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression levels in thyroid cancer cells significantly attenuated viability of thyroid cancer cells, indicating the significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue, but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and apply such information for biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in the peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac (α2,6) Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated that there is a higher frequency of serum apoC-III with loss of α2,6-linkage Neu5Ac residues in lung cancer patients compared to controls. Our results demonstrate that lectin-coupled ProteinChip technology allows the high-throughput and specific recognition of cancer-associated aberrant glycosylations, and imply its possible applicability to studies on other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week, and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
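The accuracy measure used in the validation cohort, the percentage of patients whose predicted dose falls within 20% of the actual stable dose, can be sketched as follows. The function name and the dose values (mg per week) are hypothetical and serve only to show the calculation; the dosing algorithms themselves are not reproduced here.

```python
def fraction_within(predicted, actual, tolerance=0.20):
    """Share of patients whose predicted dose lies within +/- tolerance
    (as a fraction of the actual stable dose) of that actual dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)


# Hypothetical weekly doses (mg) for four patients.
actual_dose = [21.0, 35.0, 49.0, 28.0]
clinical_prediction = [30.0, 34.0, 38.0, 29.0]          # clinical variables only
pharmacogenetic_prediction = [24.0, 36.0, 45.0, 27.0]   # clinical + genotype

print(fraction_within(clinical_prediction, actual_dose))
print(fraction_within(pharmacogenetic_prediction, actual_dose))
```

In this toy data the clinical-only predictions miss the patients at the low and high ends of the dose range, which is the pattern the paragraph reports for the real cohort.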

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Mitsunori Sasa², Yoshimi Bando³, Ikuko Sumitomo², Naoya Hosono⁴, Michiaki Kubo⁴, Yusuke Nakamura¹,⁵ and Hitoshi Zembutsu⁵. ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Department of Surgery, Tokushima Breast Care Clinic; ³Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgical operation on breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele revealed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to be dependent on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and Docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Michiaki Kubo², Hitoshi Zembutsu³, Yuichi Sugiyama⁴ and Yusuke Nakamura¹,³. ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ⁴Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant for limiting the use of the drug. To identify a genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P<0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by the scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia as compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
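The classification rates and odds ratio reported for the score can be checked with a short calculation. The 2×2 counts used below are not stated in the text: they are back-calculated on the assumption that the two percentages refer to the combined 39 patients with severe toxicity (28 + 11) and 74 patients without toxicity (56 + 18). The Woolf (log-odds) confidence interval and the function name are likewise assumptions made for illustration.

```python
import math


def two_by_two_summary(case_pos, case_neg, ctrl_pos, ctrl_neg, z=1.96):
    """Sensitivity, specificity, odds ratio and Woolf 95% interval.

    case_pos / case_neg: cases with score >= 1 / score 0
    ctrl_pos / ctrl_neg: controls with score >= 1 / score 0
    """
    sensitivity = case_pos / (case_pos + case_neg)
    specificity = ctrl_neg / (ctrl_pos + ctrl_neg)
    odds_ratio = (case_pos * ctrl_neg) / (case_neg * ctrl_pos)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / case_pos + 1 / case_neg + 1 / ctrl_pos + 1 / ctrl_neg)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return sensitivity, specificity, odds_ratio, lower, upper


# Assumed counts: 27 of 39 cases and 18 of 74 controls have a score of 1 or 2.
sens, spec, odds, low, high = two_by_two_summary(27, 12, 18, 56)
print(f"correctly classified cases:    {sens:.1%}")
print(f"correctly classified controls: {spec:.1%}")
print(f"odds ratio {odds:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Under these assumptions the sketch gives an odds ratio of 7.00 with an interval of roughly 2.95 to 16.59, consistent with the figures quoted in the paragraph; the published P value would come from a separate test that is not reproduced here.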

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu¹,², Taisei Mushiroda¹, Surakameth Mahasirimongkol⁵, Sasisopin Kiertiburanakul³, Somnuek Sungkanuparph³, Weerawat Manosuthi⁶, Woraphot Tantisiriwat⁷, Angkana Charoenyingwattana⁴, Thanyachai Sura³, Wasun Chantratita² and Yusuke Nakamura¹. ¹Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of ²Pathology, ³Medicine, Faculty of Medicine, ⁴Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; ⁵Center for International Cooperation, Department of Medical Sciences, ⁶Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; ⁷Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1 and HLA-DPB1 by a sequence-based HLA typing method in the first set of samples, which consisted of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele revealed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10^(…)] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10^(…)). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani¹,², Sukanya Wattanapokayakit³, Hidenori Ochi⁴,⁵, Takahisa Kawaguchi⁴, Atsushi Takahashi⁴, Naoya Hosono⁴, Michiaki Kubo⁴, Tatsuhiko Tsunoda⁴, Naoyuki Kamatani⁴, Hiromitsu Kumada⁶, Aekkachai Puseenam⁷, Thanyachai Sura⁷, Yataro Daigo², Kazuaki Chayama⁴,⁵, Wasun Chantratita⁸, Yusuke Nakamura¹,⁴ and Koichi Matsuda¹. ¹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ²Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; ³Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; ⁴Center for Genomic Medicine, RIKEN; ⁵Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; ⁶Department of Hepatology, Toranomon Hospital; ⁷Department of Medicine, Faculty of Medicine and ⁸Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors Taisei Mushiroda1 Sukanya Wattana-pokayakit2 Atsushi Takahashi3 ToshihiroNukiwa4 Shoji Kudoh5 Takashi Ogura6 Hi-royuki Taniguchi7 Michiaki Kubo8 NaoyukiKamatani3 Yusuke Nakamura19 and the Pir-fenidone Clinical Study Group4 1Laboratoryfor Pharmacogenetics Institute of Physical andChemical Research (RIKEN) 2Laboratory forCardiovascular Diseases Institute of Physicaland Chemical Research (RIKEN) 3Laboratoryof Statistical Analysis Institute of Physical andChemical Research (RIKEN) 4Department ofRespiratory Oncology and Molecular MedicineInstitute of Development Aging and CancerTohoku University 5Fourth Department of In-ternal Medicine Nippon Medical School 6De-partment of Respiratory Medicine KanagawaCardiovascular and Respiratory Center 7De-partment of Respiratory Medicine and AllergyTosei General Hospital Aichi 8Laboratory forgenotyping Institute of Physical and ChemicalResearch (RIKEN) 9Laboratory of MolecularMedicine Institute of Medical Science Univer-sity of Tokyo

In order to identify a gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a P value of 2.9 × 10^-8 (GWA, 2.8 × 10^-6; replication, 3.6 × 10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9, and Draga I. Toncheva5; 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; 7Department of Psychiatry, Dr Georgi Kisiov Hospital, Radnevo, Bulgaria; 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many of these studies have produced controversial and inconclusive results. However, it has been determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that had previously been implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P = 0.0010, odds ratio = 1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y, and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu, and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y, and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y, and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y, and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu ET-K, Wang LD, J-L, Law S, Tsao W, and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y, and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y, and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y, and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y, and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q, and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y, and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H, and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H, and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D, and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y, and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y, and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y, and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y, and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility of type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF, and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y, and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T, and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K, and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y, and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y, and Nakagawa H. Identification of C2orf18, termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y, and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K, Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We first decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design, characterize each problem, and propose solutions to the individual problems; we also attempt to improve the interpretation of GWA data as a whole.

a. Test statistic correction for data from a structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, which means that their sample sizes should be in the thousands or more; this in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false-positive results. One method to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
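For orientation, the conventional genomic control correction for 1-df chi-square statistics can be sketched as follows. This is only the standard chi-square version, not the modification for Fisher's exact test described above; the function name and the lower bound of 1.0 on the inflation factor are illustrative assumptions.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (inflation factor, corrected statistics).

    chi2_stats: 1-df chi-square statistics from many (ideally null) markers.
    The inflation factor lambda is the observed median divided by the
    expected median; each statistic is then divided by lambda.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # Assumption: never inflate statistics when lambda falls below 1.
    lam = max(lam, 1.0)
    return lam, [x / lam for x in chi2_stats]


lam, corrected = genomic_control([0.2, 0.5, 0.91, 3.0, 10.0])
```

In practice the median would be taken over thousands of markers across the genome; the five values here are only a toy input.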

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリー・マーク・ラスロップ
准教授 医学博士 山田 亮

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in geometric terms, characterize the various tests for 2×3 tables, and define tests suited to biological models by interpreting the tables geometrically.
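As a concrete illustration of "various tests for 2×3 tables", the sketch below computes two standard asymptotic tests on the same genotype table: the 2-df Pearson chi-square test and the 1-df Cochran-Armitage trend test. These are textbook tests chosen for illustration, not the geometric framework itself; the function names and the additive weights (0, 1, 2) are assumptions, and every genotype column is assumed to have a non-zero total.

```python
import math


def genotype_chi2(cases, controls):
    """Pearson chi-square (2 df) for a 2x3 table of genotype counts (AA, Aa, aa)."""
    n = sum(cases) + sum(controls)
    col_totals = [c + d for c, d in zip(cases, controls)]
    stat = 0.0
    for row in (cases, controls):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    # The chi-square survival function with 2 df is exp(-x/2).
    return stat, math.exp(-stat / 2.0)


def trend_chi2(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test (1 df); default weights encode an additive model."""
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    col_totals = [c + d for c, d in zip(cases, controls)]
    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * t for w, t in zip(weights, col_totals))
    sum_w2n = sum(w * w * t for w, t in zip(weights, col_totals))
    numerator = n * (n * sum_wr - n_cases * sum_wn) ** 2
    denominator = n_cases * n_controls * (n * sum_w2n - sum_wn ** 2)
    stat = numerator / denominator
    # The chi-square survival function with 1 df is erfc(sqrt(x/2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


g_stat, g_p = genotype_chi2([10, 20, 30], [30, 20, 10])
t_stat, t_p = trend_chi2([10, 20, 30], [30, 20, 10])
```

For this toy table the case fraction rises linearly with genotype, so both statistics coincide; the trend test then yields the smaller P value because it spends only one degree of freedom. Changing `weights` to (0, 1, 1) or (0, 0, 1) gives the dominant and recessive versions of the same test.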

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population and LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in support of various studies based on genetic heterogeneity.
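The conventional pairwise measures that this work aims to go beyond can be sketched as follows. The code computes D, |D'| and r² for two diallelic markers from the four haplotype counts; it illustrates only the standard pairwise definitions, not the multi-marker method, and the function and argument names are hypothetical.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, |D'|, r^2) from counts of the four two-marker haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n   # frequency of allele A at the first marker
    p_B = (n_AB + n_aB) / n   # frequency of allele B at the second marker
    d = p_AB - p_A * p_B      # deviation from independence
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


d, d_prime, r2 = pairwise_ld(40, 10, 10, 40)
```

Both |D'| and r² equal 1 when only two of the four haplotypes are present and 0 when the markers are independent; a monomorphic marker is reported as 0 here by convention.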

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorics, and we investigated the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web site URL: http://func-gen.hgc.jp/

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.

Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita, and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly across tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita, and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008 it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井 謙太
准教授 理学博士 木下 賢吾

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here we focus not only on the presence of TFBSs but also on their orientation and positioning with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them to make a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe² and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
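The G+C and CpG richness described above can be quantified per window around a TSS. The following is a minimal sketch, assuming the commonly used observed/expected CpG ratio, (number of CpG × window length) / (number of C × number of G). The report does not state which definition or window size was actually used, so the function names `gc_content`, `cpg_obs_exp` and `window_profile` and the window parameters are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of G or C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0  # ratio is undefined without both C and G; report 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


def window_profile(seq: str, window: int, step: int):
    """Return (start, G+C fraction, CpG o/e) for each full window.

    The window and step sizes are illustrative assumptions only.
    """
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        profile.append((start, gc_content(chunk), cpg_obs_exp(chunk)))
    return profile
```

Plotting such a profile against the distance from the TSS would show how wide the CpG-rich range is for a given promoter set.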

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments have greatly contributed to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent the blueprint from distinguishing direct from indirect regulation. In this study, we are computationally detecting direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The predicted regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make the building blocks of the body are included.
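The scanning step described above can be sketched as follows. This is an illustrative example only: it assumes a log-odds matrix stored as one dictionary per position, scans the forward strand only, and takes the threshold as a given input. The ORI-based threshold optimization is not shown because the report does not define the index, and the names `scan_promoter` and `predicted_targets` are hypothetical.

```python
from typing import Dict, List, Tuple

PWM = List[Dict[str, float]]  # one {base: score} dictionary per position


def scan_promoter(seq: str, pwm: PWM, threshold: float) -> List[Tuple[int, float]]:
    """Return (start, score) for every window scoring at least threshold."""
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def predicted_targets(promoters: Dict[str, str], pwm: PWM, threshold: float) -> List[str]:
    """Genes whose promoter contains at least one site above threshold."""
    return sorted(gene for gene, seq in promoters.items()
                  if scan_promoter(seq, pwm, threshold))


# Toy matrix preferring the motif TGA (scores are made up for illustration).
toy_pwm = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
]
```

In practice both strands would normally be scanned, and the threshold would be chosen per matrix rather than fixed by hand.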

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
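To make the role of the pseudocount concrete, the sketch below builds a log-odds PWM from per-position base counts, distributing the pseudocount over the four bases in proportion to the background composition as described above. It is a minimal illustration under stated assumptions: base-2 logarithms, a background supplied by the caller, and the hypothetical function name `pwm_from_counts`. The default of 0.8 is the value suggested in the text.

```python
import math


def pwm_from_counts(counts, background, pseudocount=0.8):
    """Convert per-position base counts into log2 likelihood ratios.

    counts: list of {base: count} dictionaries, one per position.
    background: {base: frequency} dictionary summing to 1.
    The pseudocount is split among bases according to the background.
    """
    pwm = []
    for column in counts:
        n_sites = sum(column.values())
        scores = {}
        for base, q in background.items():
            # Smoothed frequency: never zero, so the logarithm is defined.
            p = (column.get(base, 0) + pseudocount * q) / (n_sites + pseudocount)
            scores[base] = math.log2(p / q)
        pwm.append(scores)
    return pwm


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
example = pwm_from_counts([{"A": 4, "C": 0, "G": 0, "T": 0}], uniform)
```

Without the pseudocount, the unobserved bases C, G and T would receive a score of minus infinity from only four sites; with it, they receive a finite penalty.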

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹ and Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for the 1st and 2nd promoters was 28.3%. On the other hand, we observed a 9.6% difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
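The definition of 1st and 2nd promoters by tag count can be sketched as below. This is a hypothetical illustration: the input format (gene, promoter id, tag count) and the function names are assumptions. The report does not say how the "discrepancy" was computed, so `discrepancy_rate` uses one plausible definition, the fraction of shared genes whose (1st, 2nd) assignment differs between two samples.

```python
from collections import defaultdict


def rank_promoters(tag_counts):
    """Map each gene to its (1st promoter, 2nd promoter) by tag count.

    tag_counts is an iterable of (gene, promoter_id, n_tags) tuples.
    The 2nd promoter is None for genes with a single promoter.
    """
    by_gene = defaultdict(list)
    for gene, promoter, n_tags in tag_counts:
        by_gene[gene].append((n_tags, promoter))
    ranked = {}
    for gene, promoters in by_gene.items():
        # Most tags first; ties broken by promoter id for reproducibility.
        promoters.sort(key=lambda item: (-item[0], item[1]))
        first = promoters[0][1]
        second = promoters[1][1] if len(promoters) > 1 else None
        ranked[gene] = (first, second)
    return ranked


def discrepancy_rate(ranked_a, ranked_b):
    """Fraction of shared genes whose (1st, 2nd) assignment differs.

    This definition is an assumption made for illustration.
    """
    shared = [gene for gene in ranked_a if gene in ranked_b]
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if ranked_a[gene] != ranked_b[gene])
    return differing / len(shared)


cell_a = [("g1", "p1", 10), ("g1", "p2", 3), ("g2", "p3", 5)]
cell_b = [("g1", "p1", 2), ("g1", "p2", 8), ("g2", "p3", 1)]
ranked_a, ranked_b = rank_promoters(cell_a), rank_promoters(cell_b)
```

In this toy input, gene g1 swaps its 1st and 2nd promoters between the two samples while g2 does not, so half of the shared genes differ.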

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common among 16 eukaryotes, but that two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
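A periodicity of the kind discussed above can be detected by evaluating the Fourier power of a dinucleotide indicator signal at a chosen period. The sketch below is a minimal illustration, not the analysis pipeline used in the study: the choice of AA/TT/TA as the dinucleotide set, the normalization by signal length, and the function names are all assumptions.

```python
import cmath
import math


def dinucleotide_indicator(seq, dinucleotides=("AA", "TT", "TA")):
    """1.0 at each position where one of the dinucleotides starts, else 0.0.

    The default dinucleotide set is an assumption for illustration.
    """
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] in dinucleotides else 0.0
            for i in range(len(seq) - 1)]


def power_at_period(signal, period):
    """Fourier power of the mean-centred signal at the given period (bp)."""
    n = len(signal)
    if n == 0:
        return 0.0
    mean = sum(signal) / n
    total = sum((x - mean) * cmath.exp(-2j * math.pi * k / period)
                for k, x in enumerate(signal))
    return abs(total) ** 2 / n


# Synthetic sequence with an AA dinucleotide exactly every 10 bp.
periodic_seq = "AACGCGCGCG" * 20
periodic_signal = dinucleotide_indicator(periodic_seq)
```

Scanning `power_at_period` over a range of periods, before and after masking repeat elements, would show which peaks depend on those elements.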

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell containing only necessary and sufficient components has been estimated mostly by the reduction of the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, like living cells. Therefore, the organism-specific pathway maps are detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue to predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. Through systematic comparison with some existing descriptors, we showed that our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting crystal structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately, as protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita and Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activities, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activities, which had not been considered at all. In addition, we created a similar hole in barnase, a ribonuclease that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.
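The first network type, highly coexpressed genes for every gene, can be illustrated with a small ranking example. The sketch below assumes Pearson's correlation coefficient between expression profiles as the coexpression measure; the text does not state which measure COXPRESdb uses, so this choice, the toy data and the function names `pearson` and `top_coexpressed` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant profile
    return sxy / math.sqrt(sxx * syy)


def top_coexpressed(expression, query, k=3):
    """Return the k genes most correlated with query as (gene, r) pairs."""
    scored = [(pearson(expression[query], profile), gene)
              for gene, profile in expression.items() if gene != query]
    # Highest correlation first; ties broken alphabetically by gene name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(gene, r) for r, gene in scored[:k]]


# Toy expression matrix: gene -> expression values across four samples.
toy_expression = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1, 1, 1, 1],
}
neighbours = top_coexpressed(toy_expression, "g1", k=2)
```

Linking each gene to its top-ranked neighbours in this way yields a coexpression network that can be drawn as a static picture when it is small.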

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K. and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R. and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K. and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T. and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K. and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y. and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T. and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M. and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H. and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H. and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M. and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R. and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T. and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H. and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H. and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T. and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S. and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H. and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉, 木下賢吾, 伊藤暢聡. ペプチジルプロリルイソメラーゼの構造機能相関 [Structure-function relationships of peptidyl-prolyl isomerases]. 蛋白質核酸酵素 54: 167-172, 2009.

木下賢吾. 立体構造からのタンパク質機能予測: 現状と展望 [Protein function prediction from three-dimensional structures: current status and prospects]. 遺伝子医学MOOK 14号, in press.

中井謙太, ポール・ホートン. 第3章 3 アミノ酸配列に基づくタンパク質の細胞内局在予測 [Chapter 3-3. Prediction of protein subcellular localization based on amino acid sequences]. 実験医学増刊 vol. 26: 1106-1112, 2008.

中井謙太. タンパク質のシステム生物学 [Systems biology of proteins]. 猪飼, 伏見, 卜部, 上野川, 中村, 浜窪 編, タンパク質の事典, 朝倉書店, 575-578, 2008.


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study on research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory differences between South Korea and Japan. After the scandal over Hwang Woo-suk's fabricated stem cell cloning and unethical human egg collection, the bioethics law has been revised and the government seeks stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing in terms of regulations in East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, the IMSUT has established the Office of Research Ethics (ORE) for supporting research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey on past ethical reviews and a comparative study on research ethics review systems in the US, the UK and South Korea, we examined our current problems, which tend to stall a smooth research review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review of research ethics for all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤 香織
特任助教 学術博士 洪 賢秀
特任助教 法学修士 神里 彩子

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs for obtaining free and fully informed consent from participants. We conducted our questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain, encourage and show appreciation to participants each time, and participants recall their will to contribute.

To show appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to the sighted with good eyesight, we are making efforts toward personalized information security to accommodate the disabilities of participants.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim at sharing public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular communication with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in the IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics. 146A (13): 696-706, 2008.

2. 洪賢秀. 韓国社会における子どもの「性保護」と性犯罪防止対策. 比較法研究. 70号, 2009, 印刷中.

3. 神里彩子, 成澤光編著. 生殖補助医療 生命倫理と法―基本資料集3. 信山社. 21-123, 262-308, 2008.

4. 張瓊方. 諸外国における生殖補助医療の規制状況と実施状況(台湾). 生殖補助医療 生命倫理と法―基本資料集3. 神里彩子, 成澤光編. 信山社. 323-334, 2008.

5. 大上泰弘, 神里彩子, 城山英明. イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆. 社会技術研究論文集. 5号, 132-142, 2008.

6. 大上泰弘, 成廣孝, 神里彩子, 城山英明, 打越綾子. 日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析. ヒトと動物の関係学会誌. 20号, 66-73, 2008.

7. 大上泰弘, 神里彩子, 城山英明. イギリスにおける動物の実験規制を支えている思考様式. 科学技術社会論研究. 5号, 84-92, 2008.

8. 渡部麻衣子, 上田昌文. 人の必要を充足する科学技術 福祉工学における開発現場の分析. 科学技術社会研究. 138-151, 2008.

9. 武藤香織. 「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―. 日米の医療―制度と倫理. 杉田米行編. 大阪大学出版会. 203-224, 2008.

10. 武藤香織. DNA親子鑑定は「ふしだらな」女性にとっての救済策か. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま. 舘かおる編. 作品社. 238-264, 2008.

11. 洪賢秀. 研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま. 舘かおる編. 作品社. 196-214, 2008.

12. 張瓊方. 生殖技術と台湾社会. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま. 舘かおる編. 作品社. 215-222, 2008.

13. 三村恭子, 小門穂, 武藤香織, 張瓊方, 洪賢秀, 柘植あづみ. 女性にやさしい機械のつくられ方―内診台を例にして. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま. 舘かおる編. 作品社. 223-240, 2008.

14. 神里彩子. 生殖補助医療をめぐる議論―その回顧と展望―. 家永登編『生殖技術と家族』. 早稲田大学出版部. 42-71, 2008.

15. 渡部麻衣子, 上田昌文編訳. エンハンスメント論争 身体精神の増強と先端科学技術. 社会評論社. 2008.



h. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data

Kaname Kojima, André Fujita, Teppei Shimamura, Seiya Imoto, Satoru Miyano

Recently, a nonlinear vector autoregressive (NVAR) model based on Granger causality was proposed to infer nonlinear gene regulatory networks from time series gene expression data. Since NVAR requires a large number of parameters due to the basis expansion, the length of time series microarray data is insufficient for accurate parameter estimation, and the size of the gene set must be strongly limited. To address this limitation, we employed an L1 regularization technique to estimate NVAR. Under L1 regularization, direct parents of each gene can be selected efficiently even when the number of parameters exceeds the number of data samples. We can thus estimate larger gene regulatory networks more accurately than with existing methods. Through a simulation study, we verified the effectiveness of the proposed method by comparing its limitation in the number of genes to that of the existing NVAR. The proposed method was also applied to time series microarray data of the human HeLa cell cycle.
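As a rough illustration of how an L1 penalty selects the direct parents of a gene, the following sketch fits a lasso regression by coordinate descent on a toy time series. It is a simplification and not the authors' implementation: it uses a linear first-order autoregression rather than the basis-expanded NVAR model, and the names `lasso_cd`, `select_parents`, and `simulate_series`, the penalty value, and the simulated data are all assumptions made for this example. In an NVAR setting, the columns of `X` would instead hold basis-function values of each candidate parent.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    resid = list(y)  # residual y - Xw for the initial w = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that leaves out w[j].
            rho = sum(X[i][j] * (resid[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def select_parents(series, target, lam):
    """Regress gene `target` at time t+1 on all genes at time t; return nonzero parents."""
    X = series[:-1]
    y = [row[target] for row in series[1:]]
    w = lasso_cd(X, y, lam)
    return [j for j, wj in enumerate(w) if abs(wj) > 1e-8]


def simulate_series(n_steps, seed=0):
    """Toy data (hypothetical): gene 0 follows gene 1 with a one-step lag; gene 2 is unrelated."""
    rng = random.Random(seed)
    series = [[0.0, rng.uniform(-1, 1), rng.uniform(-1, 1)]]
    for _ in range(n_steps - 1):
        prev = series[-1]
        series.append([0.8 * prev[1], rng.uniform(-1, 1), rng.uniform(-1, 1)])
    return series
```

Calling `select_parents(simulate_series(80), target=0, lam=0.05)` regresses gene 0 on the previous time point of all three genes; the soft-thresholding step is what sets the coefficients of irrelevant genes to exactly zero.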

i. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates

André Fujita, Luciana Rodrigues Gomes⁵, João Ricardo Sato⁶, Rui Yamaguchi, Carlos Eduardo Thomaz⁷, Mari Cleide Sogayar⁵, Satoru Miyano: ⁶Universidade Federal do ABC, ⁷Centro Universitário da FEI

Principal Component Analysis (PCA) combined with the Maximum-entropy Linear Discriminant Analysis (MLDA) was applied in order to identify genes with the most discriminative information between normal and tumoral prostatic tissues. Data analysis was carried out using three different approaches, namely: (i) differences in gene expression levels between normal and tumoral conditions from a univariate point of view; (ii) in a multivariate fashion using MLDA; and (iii) with a dependence network approach. Our results show that malignant transformation in the prostatic tissue is more related to functional connectivity changes in their dependence networks than to differential gene expression. The MYLK, KLK2, KLK3, HAN11, LTF, CSRP1 and TGM4 genes presented significant changes in their functional connectivity between normal and tumoral conditions and were also classified as the top seven most informative genes for the prostate cancer genesis process by our discriminant analysis. Moreover, among the identified genes we found classically known biomarkers and genes which are closely related to tumoral prostate, such as KLK3 and KLK2, and several other potential ones. We have demonstrated that changes in functional connectivity may be implicit in the biological process which renders some genes more informative to discriminate between normal and tumoral conditions. Using the proposed method, namely MLDA, to analyze the multivariate characteristic of genes, it was possible to capture the changes in dependence networks which are related to cell transformation.

j. Rule-based reasoning for system dynamics in cell systems

Euna Jeong, Masao Nagasaki, Satoru Miyano

A system-dynamics-centered ontology called the Cell System Ontology (CSO) has been developed for representation of diverse biological pathways. Many of the pathway data based on the ontology have been created from databases via data conversion or curated by expert biologists. It is essential to validate the pathway data, which may cause unexpected issues such as semantic inconsistency and incompleteness. This paper discusses three criteria for validating the pathway data based on CSO, as follows: (1) structurally correct models in terms of Petri nets, (2) biologically correct models to capture biological meaning, and (3) systematically correct models to reflect biological behaviors. Simultaneously, we have investigated how logic-based rules can be used for the ontology to extend its expressiveness and to complement the ontology by reasoning, which aims at qualifying pathway knowledge. Finally, we show how the proposed approach helps in exploring dynamic modeling and simulation tasks without prior knowledge.

k. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human

Yosuke Hatanaka, Masao Nagasaki, Rui Yamaguchi, Takeshi Obayashi, Kazuyuki Numata, André Fujita, Teppei Shimamura, Yoshinori Tamada, Seiya Imoto, Kengo Kinoshita, Kenta Nakai, Satoru Miyano

We reported various transcription factor binding sites (TFBSs) conserved among co-expressed genes in human promoter regions, using expression and genomic data. Assuming that similar promoter structure induces similar transcriptional regulation, and hence a similar expression profile, we compared promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start site (TSS) given by DBTSS, and promoter similarity searches were conducted among coexpressed gene data provided by the newly developed COXPRESdb. By combining Position Weight Matrix (PWM) motif prediction with the bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket method analysis to seek combinatorial activities of those conserved TFBSs.
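The PWM motif prediction step mentioned above can be sketched in general terms as follows. This is a minimal, generic illustration and not the pipeline used in the study: the example binding sites, the pseudocount, the uniform background frequency, and the score threshold are all hypothetical, and the bootstrap significance test is omitted. Sequences are assumed to contain only the upper-case letters A, C, G and T.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log2 ratio of the observed base frequency to the background frequency.
        pwm.append({b: math.log2(counts[b] / total / background) for b in "ACGT"})
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window whose summed log-odds reaches threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical aligned sites for a TATA-like motif and a toy promoter sequence.
example_sites = ["TATAAA", "TATAAA", "TATATA", "TATAAA"]
example_promoter = "GGCGTATAAAGGC"
example_hits = scan(example_promoter, build_pwm(example_sites), threshold=5.0)
```

In this toy case only the window starting at offset 4 (`TATAAA`) scores above the threshold. Applied to real promoters, each predicted site would then need a significance assessment, such as the bootstrap procedure the text refers to.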

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou⁸, Yuto Ikegami⁸, Hiroshi Matsuno⁸, Satoru Miyano, Shin-ichi T. Inouye⁸: ⁸Yamaguchi University

Circadian rhythms of living organisms are 24-hr oscillations found in behavior, biochemistry and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of a computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observation that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando⁹, Sadanori Konishi¹⁰, Seiya Imoto: ⁹Keio University, ¹⁰Kyushu University

The problem of constructing nonlinear regression models is investigated to analyze data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlap between basis functions and incorporates the information of the input and response variables. By using these radial basis functions, we constructed nonlinear regression models with the help of the regularization technique. Crucial issues in the model building process are the choices of the hyperparameter, the number of basis functions, and the smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification, both for distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained by the use of the hyperparameter in the basis functions.
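To make the ingredients of such a model concrete, the sketch below fits a one-dimensional regularized radial basis function regression. It is only an illustration under stated assumptions, not the authors' estimator: the Gaussian form exp(-(x - c)^2 / (2 * nu * width^2)), with `nu` playing the role of the overlap hyperparameter, is assumed here, a plain ridge penalty `lam` stands in for the smoothing parameter, and the centers, widths, and information criteria for choosing them are not taken from the source.

```python
import math


def rbf_design(xs, centers, width, nu):
    """Design matrix: a constant column plus one Gaussian basis function per center."""
    return [[1.0] + [math.exp(-(x - c) ** 2 / (2.0 * nu * width ** 2)) for c in centers]
            for x in xs]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_rbf(xs, ys, centers, width, nu, lam):
    """Ridge-regularized least squares: solve (Phi^T Phi + n*lam*I) w = Phi^T y."""
    Phi = rbf_design(xs, centers, width, nu)
    n, m = len(Phi), len(Phi[0])
    A = [[sum(Phi[i][a] * Phi[i][b] for i in range(n)) + (n * lam if a == b else 0.0)
          for b in range(m)] for a in range(m)]
    rhs = [sum(Phi[i][a] * ys[i] for i in range(n)) for a in range(m)]
    return solve(A, rhs)


def predict(x, w, centers, width, nu):
    """Evaluate the fitted model at a single input x."""
    row = rbf_design([x], centers, width, nu)[0]
    return sum(wi * ri for wi, ri in zip(w, row))
```

A larger `nu` widens every basis function and so increases their overlap, while a larger `lam` shrinks the weights toward zero; the three tuning choices named in the text (hyperparameter, number of basis functions, smoothing parameter) correspond here to `nu`, `len(centers)`, and `lam`.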

b. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome

Jin Hwan Do, Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for the characterization of fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most of the prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, the homology-based approach has an intrinsic limitation for unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profiles of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile; that is, the DNA regions including secondary metabolic signature genes, such as polyketide synthase, nonribosomal peptide synthase, and/or dimethylallyl tryptophan synthase, consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by a transcription factor, LaeA. Our result could be used for identification of other fungal secondary metabolite gene clusters, especially for secondary metabolite gene clusters that are severely regulated by LaeA or other proteins with a function similar to LaeA.
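The kind of profile analysis described above, window averaging followed by a search for long stretches below a cutoff, can be sketched as follows. The example is generic and hypothetical: it computes a window-averaged GC profile because the curvature model used in the study is not given here, and the window size, threshold, and minimum length are toy values rather than the 0.18 and 20 kb reported in the text. A per-base curvature profile could be passed to `window_average` in the same way.

```python
def window_average(values, window):
    """Sliding mean using a running sum; returns len(values) - window + 1 averages."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    averages = [total / window]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        averages.append(total / window)
    return averages


def gc_indicator(sequence):
    """Per-base indicator: 1.0 for G or C, 0.0 otherwise."""
    return [1.0 if base in "GCgc" else 0.0 for base in sequence]


def regions_below(profile, threshold, min_length):
    """Half-open (start, end) runs of consecutive windows whose value is below threshold."""
    regions = []
    start = None
    # A sentinel above any threshold closes a run that reaches the end of the profile.
    for i, value in enumerate(list(profile) + [float("inf")]):
        if value < threshold and start is None:
            start = i
        elif value >= threshold and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    return regions


# Toy sequence: a GC-rich start, an AT-rich middle, and a GC-rich end.
toy_profile = window_average(gc_indicator("GGCCATATATGC"), window=4)
toy_regions = regions_below(toy_profile, threshold=0.3, min_length=3)
```

Here `toy_regions` contains the single run of windows covering the AT-rich middle of the sequence. The running sum keeps the averaging step linear in the sequence length, which matters for genome-scale profiles.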

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida¹, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step in finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than we can handle and analyze on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data to detect transcripts that have significantly different splicing patterns in two cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis of more than 300,000 transcripts in exon array data, which has the potential to reveal aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require any installation of software other than an internet browser. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze the full dataset of exon array data within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.

Publications

1. Ando, T., Konishi, S., Imoto, S. Nonlinear regression modeling via regularized radial basis function networks. Journal of Statistical Planning and Inference. 138 (11): 3616-3633, 2008.

2. Brazma, A., Miyano, S., Akutsu, T. Proceedings of the 6th Asia-Pacific Bioinformatics Conference (APBC 2008). Imperial College Press. 2008.

3. Do, J.H., Miyano, S. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome. Applied Microbiology and Biotechnology. 80 (5): 841-847, 2008.

4. Fujita, A., Gomes, L.R., Sato, J.R., Yamaguchi, R., Thomaz, C.E., Sogayar, M.C., Miyano, S. Multivariate gene expression analysis reveals functional connectivity changes between normal/tumoral prostates. BMC Systems Biology. 2: 106, 2008.

5. Fujita, A., Sato, J.R., Garay-Malpartida, H.M., Sogayar, M.C., Ferreira, C.E., Miyano, S. Modeling nonlinear gene regulatory networks from time series gene expression data. J. Bioinformatics and Computational Biology. 6 (5): 961-979, 2008.

6. Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Fujita, A., Shimamura, T., Tamada, Y., Imoto, S., Kinoshita, K., Nakai, K., Miyano, S. A novel strategy to search conserved transcription factor binding sites among coexpressing genes in human. Genome Informatics. 20: 212-221, 2008.

7. Hirose, O., Yoshida, R., Imoto, S., Yamaguchi, R., Higuchi, T., Charnock-Jones, D.S., Print, C., Miyano, S. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics. 24 (7): 932-942, 2008.

8. Hirose, O., Yoshida, R., Yamaguchi, R., Imoto, S., Higuchi, T., Miyano, S. Analyzing time course gene expression data with biological and technical replicates to estimate gene networks by state space models. Proc. 2nd Asia International Conference on Modelling & Simulation. 940-946, 2008. (AMS 2008, Refereed conference)

9. Jeong, E., Nagasaki, M., Miyano, S. Rule-based reasoning for system dynamics in cell systems. Genome Informatics. 20: 25-36, 2008.

10. Kitakaze, H., Kanda, M., Nakatsuka, H., Ikeda, N., Matsuno, H., Miyano, S. Prediction of fragile points for robustness checking of cell systems. IEICE TRANSACTIONS on Information and Systems D. J91-D (9): 2404-2417, 2008.

11. Knapp, E.-W., Benson, G., Holzhutter, H.-G., Kanehisa, M., Miyano, S. (Eds.) Genome Informatics 20. 2008.

12. Kojima, K., Fujita, A., Shimamura, T., Imoto, S., Miyano, S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics. 20: 37-51, 2008.

13. Kojima, K., Nagasaki, M., Miyano, S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics. 24 (12): 1426-1432, 2008.

14. Mito, N., Ikegami, Y., Matsuno, H., Miyano, S., Inouye, S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics. 21: 212-223, 2008.

15. Nagasaki, M., Saito, A., Chen, L., Jeong, E., Miyano, S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology. 2: 53, 2008.

16. Niida, A., Smith, A.D., Imoto, S., Tsutsumi, S., Aburatani, H., Zhang, M.Q., Akiyama, T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics. 9: 404, 2008.

17. Numata, K., Yoshida, R., Nagasaki, M., Saito, A., Imoto, S., Miyano, S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics. 9: 494, 2008.

18. Numata, K., Imoto, S., Miyano, S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering. IEEE Computer Society. 357-360, 2008. (BIBM 2008, Refereed conference)

19. Perrier, E., Imoto, S., Miyano, S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research. 9: 2251-2286, 2008.

20. Yamaguchi, R., Imoto, S., Yamauchi, M., Nagasaki, M., Yoshida, R., Shimamura, T., Hatanaka, Y., Ueno, K., Higuchi, T., Gotoh, N., Miyano, S. Predicting differences in gene regulatory systems by state space models. Genome Informatics. 21: 101-113, 2008.

21. Yoshida, R., Nagasaki, M., Yamaguchi, R., Imoto, S., Miyano, S., Higuchi, T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics. 24 (22): 2592-2601, 2008.


The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes, and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P=0.0053) and poorer prognosis of non-small cell lung cancer patients (P=0.0045). It was also shown to be an independent prognostic factor (P=0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data imply that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

Human Genome Center

Laboratory of Molecular Medicine
Laboratory of Genome Technology
ゲノムシークエンス解析分野
シークエンス技術開発分野

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

教授 医学博士 中村祐輔
准教授 医学博士 片桐豊雅
准教授 医学博士 醍醐弥太郎
助教 理学博士 浜本隆二
助教 医学博士 松田浩一
助教 医学博士 前佛均

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) using a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with NSCLC (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The ECT2 cancer-testis antigen is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells; nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein became highest at the G(1)/S phases, whereas its phosphorylation level was enhanced during the mitotic phase. Treatment of the breast cancer cells T47D and HBC4 with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrated the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in the growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in growth of RCC and that inhibition of TMEM22/RAB37 expression or of their interaction should be a novel therapeutic approach for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)


We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex as well as activation of the Rac1-JNK cascade in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and that it could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been indicated to play some roles in apoptosis. These findings suggest that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show a very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it should be a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 were reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression level significantly attenuated the viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac (α2,6) Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated that there is a higher frequency of serum apoC-III with loss of α2,6-linkage Neu5Ac residues in lung cancer patients compared to controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylations, and imply its possible applicability to studies of other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences, and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
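
The evaluation criterion described above (the share of patients whose predicted dose falls within 20% of the actual stable dose) can be illustrated with a minimal Python sketch. The function name and the four dose pairs are hypothetical values chosen for illustration; they are not data or code from the consortium study.

```python
def fraction_within_20_percent(predicted, actual, tolerance=0.20):
    """Return the fraction of patients whose predicted weekly dose lies
    within +/- tolerance (20% by default) of the actual stable dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)


# Hypothetical weekly doses in mg (illustrative values, not study data).
actual_dose = [28.0, 30.0, 49.0, 35.0]
predicted_dose = [30.0, 20.0, 50.0, 35.0]

# Three of the four predictions are within 20% of the actual dose.
print(fraction_within_20_percent(predicted_dose, actual_dose))
```

Computing this fraction once for each algorithm on the same validation cohort gives the kind of side-by-side comparison reported in the text.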

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5 and Hitoshi Zembutsu5: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Department of Surgery, Tokushima Breast Care Clinic, 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School, 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgical operation on breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele revealed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to be dependent on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4 and Yusuke Nakamura1,3: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant for limiting the use of the drug. To identify a genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by the scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia as compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
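
The odds ratio, its confidence interval, and the two classification rates quoted above are linked through a single 2x2 table. The Python sketch below shows that arithmetic. The cell counts are an assumption: they are back-calculated from the combined sample sizes in the text (28 + 11 = 39 cases, 56 + 18 = 74 controls) and the reported classification rates, and are not stated in the report itself. The interval uses the standard log (Woolf) approximation, which may differ from the method the authors used.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) 95% CI for a 2x2 table.

    a: cases with score 1 or 2      b: cases with score 0
    c: controls with score 1 or 2   d: controls with score 0
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Assumed counts, reconstructed from the reported percentages (not given
# explicitly in the text): 27 of 39 cases and 18 of 74 controls score 1 or 2.
cases_high, cases_low = 27, 12
controls_high, controls_low = 18, 56

or_value, ci_low, ci_high = odds_ratio_ci(
    cases_high, cases_low, controls_high, controls_low
)
sensitivity = cases_high / (cases_high + cases_low)
specificity = controls_low / (controls_high + controls_low)

print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
```

Under these assumed counts the sketch gives values that agree with the reported figures, which suggests the reconstruction is at least consistent with the text.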

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2 and Yusuke Nakamura1: 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine, Departments of 2Pathology, 3Medicine, Faculty of Medicine, 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand, 5Center for International Cooperation, Department of Medical Sciences, 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health, 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped, by a sequence-based HLA typing method, the HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1 and HLA-DPB1 loci in the first set of samples, which consisted of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele revealed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash compared with only 1.1% observed in NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10^[…]] and 0.7% in the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10^[…]). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
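
The quoted specificity follows directly from the carrier frequencies reported for the combined data set. The short Python sketch below makes the relationship explicit, treating allele carriage as a binary test for rash; the function name is hypothetical and the framing as a screening test is an illustrative assumption.

```python
def carrier_test_metrics(carrier_freq_cases, carrier_freq_tolerant):
    """Treat 'carries the allele' as a positive test for NVP-induced rash.

    sensitivity = share of rash cases who carry the allele
    specificity = share of tolerant patients who do not carry it
    """
    sensitivity = carrier_freq_cases
    specificity = 1.0 - carrier_freq_tolerant
    return sensitivity, specificity


# Carrier frequencies taken from the combined data set described above.
sens, spec = carrier_test_metrics(0.175, 0.011)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

The low sensitivity alongside the high specificity is why the allele can only help avoid a subset of rash cases: most patients who develop rash do not carry it.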

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4 and Koichi Matsuda1: 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand, 4Center for Genomic Medicine, RIKEN, 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University, 6Department of Hepatology, Toranomon Hospital, 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese and one Thai cohorts consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrated that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9 and the Pirfenidone Clinical Study Group4: 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN), 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University, 5Fourth Department of Internal Medicine, Nippon Medical School, 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center, 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi, 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN), 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify a gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a P value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9 and Draga I. Toncheva5: 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria, 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria, 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria, 8Department of Psychological Medicine, Cardiff University, School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK, 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. A great number of studies have yielded controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on the susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that were previously implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only 1 SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010; odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y, and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu, and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y, and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y, and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y, and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L Liu E T-K Wang LD J-LS Law Tsao W, and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y, and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y, and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y, and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y, and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y, and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q, and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y, and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H, and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H, and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D, and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y, and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y, and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y, and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y, and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility of type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF, and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y, and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T, and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K, and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y, and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y, and Nakagawa H. Identification of C2orf18, termed ANT2BP (ANT2-binding protein), as one of the key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y, and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K, Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret the heterogeneity and to map the disease-associated loci, and we collaborate with research groups on data-mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among the individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We therefore decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and we also attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies on complex genetic traits target relatively weak factors, which means that their sample size should be in the thousands or more; this in turn makes idealistic random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false-positive results. One of the methods to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can also correct Fisher’s exact test statistics.
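The standard genomic control correction mentioned above can be sketched as follows. This is a minimal illustration of the conventional chi-square version only, not the modified method for Fisher’s exact test developed in this project; the function name `genomic_control` and the sample statistics are hypothetical.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom (a standard constant).
CHI2_1DF_MEDIAN = 0.4549364

def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for a list of 1-df chi-square values.

    lambda is the inflation factor: the observed median statistic divided by
    the median expected under the null hypothesis.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # By convention the statistics are only deflated, never inflated.
    lam = max(lam, 1.0)
    return lam, [x / lam for x in chi2_stats]

# Hypothetical, inflated test statistics from a structured population.
observed = [0.2, 0.5, 0.9, 1.4, 2.0, 3.1, 12.5]
lam, corrected = genomic_control(observed)
print(f"lambda = {lam:.3f}")
```

After correction, the median of the statistics equals the null median, so a marker stands out only if it exceeds the genome-wide inflation.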

b. Characterization of the exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center
Laboratory of Functional Genomics (ゲノム機能解析分野)
Visiting Professor Gregory Mark Lathrop, Ph.D. (客員教授 理学博士 グレゴリーマークラスロップ)
Associate Professor Ryo Yamada, M.D., Ph.D. (准教授 医学博士 山田 亮)

c. Geometric evaluation of SNP contingency table tests

The 2×3 SNP contingency table tests are described in the context of geometry. We characterize various tests for 2×3 tables and define tests that fit biological models by interpreting the tables geometrically.
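As a point of reference for the 2×3 table tests discussed in this section, the asymptotic Pearson chi-square test can be sketched as below. This is only the textbook asymptotic test, not the exact test or the geometric formulation studied here; the function name `chi2_2x3` and the genotype counts are hypothetical, and the sketch assumes that every genotype column has a non-zero total.

```python
import math

def chi2_2x3(table):
    """Pearson chi-square test for a 2x3 table.

    table = [[cases AA, cases Aa, cases aa],
             [controls AA, controls Aa, controls aa]]
    Returns (statistic, asymptotic p-value with 2 degrees of freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Hypothetical genotype counts for 100 cases and 100 controls.
stat, p_value = chi2_2x3([[30, 50, 20], [20, 50, 30]])
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

For small expected counts this asymptotic p-value can differ noticeably from the exact one, which is the discrepancy examined in subsection b.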

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposite forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population and LD mapping is a very promising approach for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pair-wise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in support of various research based on genetic heterogeneity.
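For context, the conventional pair-wise LD measures that the paragraph above contrasts with the multi-marker approach can be computed as in the following sketch. It shows the standard D, D′ and r² for two diallelic markers, not the new multi-marker method; the function name `pairwise_ld` and the haplotype counts are hypothetical.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) from the four haplotype counts of two diallelic markers."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n   # frequency of allele A at the first marker
    p_B = (n_AB + n_aB) / n   # frequency of allele B at the second marker
    d = p_AB - p_A * p_B      # deviation from independence
    # D' scales D by its largest possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Complete LD: only two of the four possible haplotypes are present.
print(pairwise_ld(50, 0, 0, 50))
# Linkage equilibrium: all four haplotypes are equally frequent.
print(pairwise_ld(25, 25, 25, 25))
```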

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorial theory and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and The Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp/

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.

Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1273-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.

The mission of our laboratory is to conduct computational (“in silico”) studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role, represented by the networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 253 in “gonad” to 1953 in “endostyle”, and during development they increased from 168 at the “egg” stage to 755 at the “juvenile” stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in “gonad”. To further investigate this phenomenon, we are currently analyzing a larger set of short 5’-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

Human Genome Center
Laboratory of Functional Analysis In Silico (機能解析インシリコ分野)
Professor Kenta Nakai, Ph.D. (教授 理学博士 中井 謙太)
Associate Professor Kengo Kinoshita, Ph.D. (准教授 理学博士 木下 賢吾)

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and positioning with regard to the transcription start site and relative to each other within pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe², and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental result with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
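The two quantities usually used to describe G+C- and CpG-richness, the G+C fraction and the observed/expected CpG ratio, can be computed as in the sketch below. This follows the widely used vertebrate-style definition and is not the ascidian-specific definition derived in this study; the function name `cpg_stats` and the example sequences are hypothetical.

```python
def cpg_stats(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Expected CpG count if C and G occurred independently: c * g / n.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp

# Hypothetical window around a TSS and a hypothetical AT-rich background window.
print(cpg_stats("CGCGCGCG"))
print(cpg_stats("ATATATAT"))
```

Sliding such a function over windows around each TSS gives a profile from which the width of the CpG-rich range can be read off.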

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³, and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from showing whether a given regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we use positional weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The predicted regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make up the building blocks of the body.
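The scanning step described above, scoring every window of a promoter against a weight matrix and keeping windows at or above a threshold, can be sketched as follows. The ORI-based threshold optimization is not reproduced here; the function name `scan`, the toy matrix and the sequence are hypothetical, and the sketch assumes sequences containing only A, C, G and T.

```python
def scan(seq, pwm, threshold):
    """Return (position, score) for every window scoring at least `threshold`.

    pwm is a list with one dict per motif position, mapping each base to its
    log-odds score.
    """
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Toy 3-bp matrix favouring the site "TGA" (hypothetical scores).
toy_pwm = [
    {"A": -1, "C": -1, "G": -1, "T": 2},
    {"A": -1, "C": -1, "G": 2, "T": -1},
    {"A": 2, "C": -1, "G": -1, "T": -1},
]
promoter = "CCTGACCTGT"
print(scan(promoter, toy_pwm, threshold=5))
print(scan(promoter, toy_pwm, threshold=3))
```

Lowering the threshold admits the weaker site "TGT" as well, which illustrates why the choice of threshold has to be optimized rather than fixed arbitrarily.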

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴, and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it, proportional to the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
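The way a pseudocount enters a PWM, as described above, can be sketched as follows. The default of 0.8 follows the suggestion in the text; the function name `pwm_with_pseudocount`, the uniform background and the example counts are assumptions made for illustration.

```python
import math

BASES = "ACGT"

def pwm_with_pseudocount(counts, pseudocount=0.8, background=None):
    """Convert per-position base counts into log2 likelihood ratios.

    counts is a list with one dict per motif position (base -> count).
    The pseudocount is split among the bases in proportion to the
    background composition and added to the observed counts.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}  # assumed uniform background
    pwm = []
    for column in counts:
        n = sum(column.get(b, 0) for b in BASES)
        scores = {}
        for b in BASES:
            freq = (column.get(b, 0) + pseudocount * background[b]) / (n + pseudocount)
            scores[b] = math.log2(freq / background[b])
        pwm.append(scores)
    return pwm

# A fully conserved position observed in 10 hypothetical binding sites.
pwm = pwm_with_pseudocount([{"A": 10}])
print({b: round(s, 3) for b, s in pwm[0].items()})
```

Without the pseudocount, the unobserved bases would receive a score of minus infinity; with it, they receive a finite penalty that shrinks as the pseudocount grows.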

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, and Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS; http://dbtss.hgc.jp/), which includes a number of 5’-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the ‘1st promoter’ and the promoter with the second largest number as the ‘2nd promoter’. Between different cell types, the average percentage of the discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed 96 of difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in downstream regions of 1st promoters.
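The definition of 1st and 2nd promoters by tag count can be sketched as below. How the discrepancy between samples was actually computed is not stated in the text, so the `discrepancy` function is only one plausible reading (the fraction of shared genes whose 1st/2nd assignment differs); all names and tag counts are hypothetical, and every gene is assumed to have at least one promoter.

```python
def rank_promoters(tag_counts):
    """Map each gene to (1st promoter, 2nd promoter or None) by tag count.

    tag_counts has the form {gene: {promoter_id: number of tags}}.
    """
    ranked = {}
    for gene, promoters in tag_counts.items():
        # Descending tag count; ties are broken by promoter id for determinism.
        order = sorted(promoters, key=lambda p: (-promoters[p], p))
        ranked[gene] = (order[0], order[1] if len(order) > 1 else None)
    return ranked

def discrepancy(rank_a, rank_b):
    """Fraction of shared genes whose (1st, 2nd) assignment differs (assumed definition)."""
    shared = rank_a.keys() & rank_b.keys()
    if not shared:
        return 0.0
    return sum(1 for g in shared if rank_a[g] != rank_b[g]) / len(shared)

# Hypothetical tag counts in two cell types.
cell_type_1 = {"geneA": {"p1": 120, "p2": 30}, "geneB": {"p1": 5, "p2": 50, "p3": 9}}
cell_type_2 = {"geneA": {"p1": 80, "p2": 95}, "geneB": {"p1": 2, "p2": 40, "p3": 10}}
r1 = rank_promoters(cell_type_1)
r2 = rank_promoters(cell_type_2)
print(r1, r2, discrepancy(r1, r2))
```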

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita, and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp periodicity is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
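The kind of Fourier analysis referred to above can be illustrated by measuring the spectral power of a dinucleotide indicator signal at a chosen period. This is a minimal sketch and not the pipeline used in the study; the function name `periodicity_power`, the choice of dinucleotide and the synthetic sequence are assumptions.

```python
import cmath

def periodicity_power(seq, dinucleotides, period):
    """Power of the dinucleotide indicator signal at the given period (in bp)."""
    # 1.0 where a dinucleotide of interest starts, 0.0 elsewhere.
    signal = [1.0 if seq[i:i + 2] in dinucleotides else 0.0
              for i in range(len(seq) - 1)]
    mean = sum(signal) / len(signal)
    # Single Fourier coefficient of the mean-centred signal at frequency 1/period.
    coefficient = sum((x - mean) * cmath.exp(-2j * cmath.pi * i / period)
                      for i, x in enumerate(signal))
    return abs(coefficient) ** 2 / len(signal)

# Synthetic sequence with an "AA" dinucleotide every 10 bp.
synthetic = ("AA" + "C" * 8) * 20
power_10 = periodicity_power(synthetic, {"AA"}, 10)
power_7 = periodicity_power(synthetic, {"AA"}, 7)
print(f"power at 10 bp: {power_10:.3f}, power at 7 bp: {power_7:.3f}")
```

Evaluating the same function over a range of periods, before and after masking repeats, would show which peaks depend on repeat elements such as Alu.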

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell containing only necessary and sufficient components has been estimated mostly by reducing the genome of a living cell. However, the “minimal gene set” obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those that, among all pathway maps, contain more orthologous genes, and they are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells are. The organism-specific pathway maps are then detected as those that can synthesize, from nutrients, the compounds required for the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue to predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida, and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷, and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities of hydrophobicity, electrostatic potential, and shape on the protein surfaces and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida, and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that has ribonuclease activity but no PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database; http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.
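The first network type, highly coexpressed genes for a given gene, can be illustrated with a plain Pearson correlation over expression profiles. This is a generic sketch and is not claimed to be the coexpression measure used by COXPRESdb; the function names, gene names and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no coexpression information
    return sxy / math.sqrt(sxx * syy)

def coexpressed(expression, query, top=2):
    """Return the `top` genes most correlated with `query` as (gene, r) pairs."""
    scores = [(pearson(expression[query], profile), gene)
              for gene, profile in expression.items() if gene != query]
    scores.sort(reverse=True)
    return [(gene, r) for r, gene in scores[:top]]

# Hypothetical expression profiles across four samples.
expression = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [2.0, 4.1, 5.9, 8.0],
    "gene3": [4.0, 3.0, 2.0, 1.0],
    "gene4": [1.0, 3.0, 2.0, 4.0],
}
neighbours = coexpressed(expression, "gene1")
print(neighbours)
```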

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.



Human Genome Center

Department of Public Policy

Associate Professor Kaori Muto, Ph.D. (Health Sciences)
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, LL.M.

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research, regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We've interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After the fabrication in Hwang Woo-suk's stem cell cloning research and the unethical collection of human eggs, the bioethics law was revised, and the government seeks stricter regulation of life science and healthcare. We've found some correlations among East Asian countries in political options on stem cell research and genetic testing in terms of regulations.

2. Establishment of the Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we examined the current problems that tend to stall a smooth research review process, so as to secure quality assurance of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project, led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we've conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We've found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project doesn't have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain, encourage and thank participants each time, and participants recall their will to contribute.

To acknowledge participants' and MCs' contribution to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current BioBank newsletters are accessible only to sighted people with good eyesight, we are making efforts toward personalized provision of information that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim to share public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who don't have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.



sion and genomic data. Assuming that similar promoter structure induces similar transcriptional regulation, and hence similar expression profiles, we compared the promoter structure similarities between co-expressed genes. Comprehensive TF binding site predictions for all human genes were conducted for 19,777 promoter regions around the transcription start site (TSS) given by DBTSS, and promoter similarity searches were conducted among the coexpressed gene data provided by the newly developed COXPRESdb. Using a combination of position weight matrix (PWM) motif prediction and the bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket method analysis to seek combinatorial activities of those conserved TFBSs.
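For readers unfamiliar with PWM-based motif prediction, the following minimal sketch builds a log-odds position weight matrix from a few aligned sites and scans a sequence for its best-scoring window. The pseudocount, the uniform background, the example sites and all function names are illustrative assumptions; the bootstrap significance test mentioned above is not reproduced here.

```python
from math import log2

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned binding sites."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        total = len(column) + pseudocount * len(BASES)
        pwm.append({
            base: log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm

def scan(sequence, pwm):
    """Return (offset, score) for every window of the sequence."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]

def best_hit(sequence, pwm):
    """Highest-scoring window, i.e. the strongest candidate site."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])

# Hypothetical aligned sites for an imaginary transcription factor.
example_sites = ["TATAAA", "TATAAT", "TATATA", "TACAAA"]
example_pwm = build_pwm(example_sites)
offset, score = best_hit("GGCGTATAAAGGC", example_pwm)
```

In a genome-wide setting the same scan would be run over each promoter region, and a score threshold or resampling procedure would decide which hits count as significant.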

l. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm

Natumi Mitou8, Yuto Ikegami8, Hiroshi Matsuno8, Satoru Miyano, Shin-ichi T. Inouye8: 8Yamaguchi University

Circadian rhythms of living organisms are 24-hr oscillations found in behavior, biochemistry and physiology. Under constant conditions, the rhythms continue with their intrinsic period length, which is rarely exactly 24 hr. In this paper, we examine the effects of light on the phase of the gene expression rhythms derived from the interacting feedback network of a few clock genes, taking advantage of computer simulation with Cell Illustrator. The simulation results suggested that the interacting circadian feedback network at the molecular level is essential for the phase dependence of the light effects observed in mammalian behavior. Furthermore, the simulation reproduced the biological observation that the range of entrainment to light-dark cycles shorter or longer than 24 hr is limited, centering around 24 hr. Application of our model to inter-time-zone flight successfully demonstrated that 6 to 7 days are required to recover from jet lag when traveling from Tokyo to New York.

2. Statistical and Computational Knowledge Discovery

a. Nonlinear regression modeling via regularized radial basis function networks

Tomohiro Ando9, Sadanori Konishi10, Seiya Imoto: 9Keio University, 10Kyushu University

The problem of constructing nonlinear regression models is investigated for analyzing data with complex structure. We introduced radial basis functions with a hyperparameter that adjusts the amount of overlap between basis functions and incorporates the information of the input and response variables. Using these radial basis functions, we constructed nonlinear regression models with the help of the technique of regularization. Crucial issues in the model building process are the choices of the hyperparameter, the number of basis functions and a smoothing parameter. We present information-theoretic criteria for evaluating statistical models under model misspecification, both for distributional and structural assumptions. We used real data examples and Monte Carlo simulations to investigate the properties of the proposed nonlinear regression modeling techniques. The simulation results showed that our nonlinear modeling performs well in various situations, and clear improvements were obtained through the use of the hyperparameter in the basis functions.
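The general idea of a regularized radial basis function model can be sketched as follows, assuming one-dimensional inputs, Gaussian basis functions whose width is scaled by a hyperparameter `nu`, and a simple ridge penalty `lam`. These choices, the fixed centers and the toy data are assumptions for illustration only; the information-theoretic criteria used to select the hyperparameter, the number of basis functions and the smoothing parameter are not shown.

```python
from math import exp, pi, sin

def design_matrix(xs, centers, width, nu):
    """Gaussian basis functions; nu scales the width and hence the overlap."""
    return [
        [exp(-((x - c) ** 2) / (2.0 * nu * width ** 2)) for c in centers]
        for x in xs
    ]

def solve(a, b):
    """Solve the linear system a w = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (aug[i][n] - tail) / aug[i][i]
    return w

def fit_rbf(xs, ys, centers, width, nu, lam):
    """Ridge-regularized least squares: (P'P + lam I) w = P'y."""
    phi = design_matrix(xs, centers, width, nu)
    m = len(centers)
    gram = [
        [sum(row[i] * row[j] for row in phi) + (lam if i == j else 0.0)
         for j in range(m)]
        for i in range(m)
    ]
    rhs = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(m)]
    return solve(gram, rhs)

def predict(x, centers, width, nu, weights):
    """Evaluate the fitted model at a single input value."""
    basis = design_matrix([x], centers, width, nu)[0]
    return sum(w * b for w, b in zip(weights, basis))

# Hypothetical toy problem: recover one period of a sine curve.
toy_x = [i / 10 for i in range(21)]
toy_y = [sin(pi * x) for x in toy_x]
toy_centers = [0.0, 0.5, 1.0, 1.5, 2.0]
toy_weights = fit_rbf(toy_x, toy_y, toy_centers, width=0.5, nu=1.0, lam=1e-3)
toy_mse = sum(
    (predict(x, toy_centers, 0.5, 1.0, toy_weights) - y) ** 2
    for x, y in zip(toy_x, toy_y)
) / len(toy_x)
```

Larger values of `nu` make neighbouring basis functions overlap more and give smoother fits, while `lam` shrinks the weights; in practice both would be chosen by a model selection criterion rather than fixed by hand.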

b. The GC and window-averaged DNA curvature profile of secondary metabolite gene cluster in Aspergillus fumigatus genome

Jin Hwan Do, Satoru Miyano

An immense variety of complex secondary metabolites is produced by filamentous fungi, including Aspergillus fumigatus, a main inducer of invasive aspergillosis. The identification of fungal secondary metabolite gene clusters is essential for the characterization of fungal secondary metabolism in terms of genetics and biochemistry through recombinant technologies such as gene disruption and cloning. Most prediction methods for secondary metabolite gene clusters depend heavily on homology searches. However, homology-based approaches have an intrinsic limitation for unknown or novel gene clusters. We analyzed the GC and window-averaged DNA curvature profiles of 26 secondary metabolite gene clusters in the A. fumigatus genome to find potential conserved features of secondary metabolite gene clusters. Fifteen secondary metabolite gene clusters showed a conserved pattern in the window-averaged DNA curvature profile; that is, the DNA regions including secondary metabolic signature genes, such as polyketide synthase, nonribosomal peptide synthase and/or dimethylallyl tryptophan synthase, consisted of window-averaged DNA curvature values lower than 0.18, and these DNA regions were at least 20 kb long. Forty percent of the secondary metabolite gene clusters with this conserved pattern were related to severe regulation by a transcription factor, LaeA. Our result could be used for identification of other fungal secondary metabolite gene clusters, especially for secondary metabolite gene clusters that are severely regulated by LaeA or other proteins with a function similar to LaeA.
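The windowing step behind such profiles can be illustrated with GC content, which needs no structural model. The sketch below computes a window-averaged GC profile and merges consecutive windows that fall below a threshold into candidate regions. DNA curvature itself depends on a curvature model that is not described here, so it is not computed; the window size, step, threshold and toy sequence are illustrative assumptions.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def windowed_gc(seq, window, step):
    """GC fraction for each window of `window` bases, advancing by `step`."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

def low_regions(profile, threshold):
    """Merge consecutive windows below `threshold` into (first, last) start runs."""
    runs, current = [], None
    for start, value in profile:
        if value < threshold:
            current = (current[0], start) if current else (start, start)
        elif current:
            runs.append(current)
            current = None
    if current:
        runs.append(current)
    return runs

# Hypothetical toy sequence: GC-rich flanks around an AT-rich core.
toy_seq = "GCGCGCGC" + "ATATATAT" + "GCGCGCGC"
toy_profile = windowed_gc(toy_seq, window=4, step=4)
toy_runs = low_regions(toy_profile, threshold=0.5)
```

The same run-merging logic would apply to a curvature profile: windows whose averaged value stays below a cutoff are joined, and only runs of sufficient length are kept as candidate clusters.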

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida1, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step toward finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than we can handle and analyze on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data, to detect transcripts that have significantly different splicing patterns in two cells, e.g. normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis for more than 300,000 transcripts in exon array data, which has the potential to reveal aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require installation of any software except an internet browser. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full dataset of exon array data within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.
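One way a two-way ANOVA can flag a transcript with a changed splicing pattern is through the exon-by-condition interaction: if all exons shift together between conditions the interaction is small, whereas a single skipped exon makes it large. The sketch below computes the interaction F statistic for a balanced design. The factor layout, the function name and the toy intensities are assumptions for illustration and are not taken from ExonMiner itself.

```python
def interaction_f(data):
    """F statistic for the exon-by-condition interaction in a balanced design.

    data[i][j] is the list of replicate intensities for exon i in condition j.
    """
    a = len(data)        # number of exons
    b = len(data[0])     # number of conditions
    n = len(data[0][0])  # replicates per cell (assumed equal everywhere)

    cell = [[sum(reps) / n for reps in row] for row in data]
    row_mean = [sum(cell[i]) / b for i in range(a)]
    col_mean = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]
    grand = sum(row_mean) / a

    # Variation that neither the exon effect nor the condition effect explains.
    ss_inter = n * sum(
        (cell[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    # Replicate-to-replicate noise within each exon/condition cell.
    ss_error = sum(
        (x - cell[i][j]) ** 2
        for i in range(a) for j in range(b) for x in data[i][j]
    )
    df_inter = (a - 1) * (b - 1)
    df_error = a * b * (n - 1)
    return (ss_inter / df_inter) / (ss_error / df_error)

# Hypothetical log intensities: three exons, two conditions, three replicates.
parallel = [
    [[5.0, 5.2, 4.8], [6.0, 6.2, 5.8]],
    [[6.0, 6.2, 5.8], [7.0, 7.2, 6.8]],
    [[7.0, 7.2, 6.8], [8.0, 8.2, 7.8]],
]
skipped = [
    [[5.0, 5.2, 4.8], [6.0, 6.2, 5.8]],
    [[6.0, 6.2, 5.8], [2.0, 2.2, 1.8]],  # exon 2 drops in the second condition
    [[7.0, 7.2, 6.8], [8.0, 8.2, 7.8]],
]

f_parallel = interaction_f(parallel)
f_skipped = interaction_f(skipped)
```

A p-value would follow from comparing the statistic with an F distribution on `(a - 1) * (b - 1)` and `a * b * (n - 1)` degrees of freedom, and a genome-wide analysis would also need a multiple-testing correction.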



Human Genome Center

Laboratory of Molecular Medicine
Laboratory of Genome Technology

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and those causing or predisposing to various diseases, as well as those related to drug efficacy and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes, and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P=0.0053) and poorer prognosis of non-small cell lung cancer patients (P=0.0045). It was also shown to be an independent prognostic factor (P=0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data implied that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) by a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with NSCLC (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The ECT2 cancer-testis antigen is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and in all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells: nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein became highest at G(1)/S phases, whereas its phosphorylation level was enhanced during the mitotic phase. Treatment of breast cancer cells, T47D and HBC4, with small interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrate the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in the growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of the RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in the growth of RCC, and that inhibition of TMEM22/RAB37 expression or of their interaction should be a novel therapeutic target for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)

We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex, as well as activation of the Rac1-JNK cascade, in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused destruction of the actin cytoskeleton structure, probably through downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation causes activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and the development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining the viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for the development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining the viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and that it could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been implicated in apoptosis. These findings suggest that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage, some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show a very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it should be a potential molecular target for the development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes, in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 were reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression levels significantly attenuated the viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes, in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in the sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac (α2,6) Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated a higher frequency of serum apoC-III with loss of α2,6-linked Neu5Ac residues in lung cancer patients than in controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylation, and they suggest that it may be applicable to studies of other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality of surgery for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.
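The text does not give the formula behind the numerical prediction score. As a minimal sketch, assuming a weighted-vote scheme in which each predictive gene votes for the group whose mean expression it lies closer to, such a score might be computed as below. The function name, the vote weighting, and the two-gene toy expression values are all illustrative assumptions, not the published method or data.

```python
from statistics import mean

def prediction_score(sample, responders, non_responders):
    """Weighted-vote score in [-100, 100]; positive values favour 'responder'.

    sample: expression values of the predictive genes for one patient.
    responders / non_responders: training samples with the same gene order.
    """
    votes_r = votes_n = 0.0
    for g, x in enumerate(sample):
        mu_r = mean(s[g] for s in responders)
        mu_n = mean(s[g] for s in non_responders)
        # Vote weight: distance of the sample from the midpoint of the two group means.
        vote = abs(x - (mu_r + mu_n) / 2)
        if abs(x - mu_r) < abs(x - mu_n):
            votes_r += vote
        else:
            votes_n += vote
    total = votes_r + votes_n
    return 0.0 if total == 0 else 100 * (votes_r - votes_n) / total

# Hypothetical training data for two genes (made-up numbers, illustration only).
responders = [[2.0, 0.5], [2.2, 0.7]]
non_responders = [[0.4, 1.9], [0.6, 2.1]]

score_clear = prediction_score([2.0, 0.8], responders, non_responders)
score_mixed = prediction_score([2.0, 1.9], responders, non_responders)
print(score_clear, score_mixed)
```

In this sketch a sample whose genes all resemble the responder group scores near +100, while a sample with conflicting genes scores closer to 0, which is one way a score can separate the two groups with a margin.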

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week, and of those who required 49 mg or more per week, to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
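The evaluation metric described above, the percentage of patients whose predicted dose falls within 20% of the actual stable dose, together with the low-dose and high-dose groups, can be sketched as follows. The function names and the four example doses are hypothetical and serve only to show the arithmetic; they are not consortium data or code.

```python
def percent_within(predicted, actual, tolerance=0.20):
    """Percentage of patients whose predicted dose is within +/- tolerance of the actual dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two equal-length, non-empty dose lists")
    hits = sum(abs(p - a) <= tolerance * a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)

def dose_group(weekly_mg):
    """Dose groups used in the text: <=21 mg/week, >=49 mg/week, otherwise intermediate."""
    if weekly_mg <= 21:
        return "low"
    if weekly_mg >= 49:
        return "high"
    return "intermediate"

# Made-up weekly doses in mg, for illustration only.
actual = [21.0, 35.0, 49.0, 28.0]
predicted = [24.0, 30.0, 60.0, 28.0]

print(percent_within(predicted, actual))   # 3 of 4 predictions fall within 20%
print([dose_group(a) for a in actual])
```

The tolerance is applied relative to the actual dose, so the same absolute error counts as a miss for a low-dose patient more readily than for a high-dose patient.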

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5 and Hitoshi Zembutsu5. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Department of Surgery, Tokushima Breast Care Clinic; 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgery for breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to depend on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4 and Yusuke Nakamura1,3. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients: 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia than those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
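The reported odds ratio and confidence interval can be checked against the classification rates with a short calculation. The sketch below assumes that the 2x2 table can be recovered by applying the reported percentages (69.2% and 75.7%) to the combined group sizes (28 + 11 cases, 56 + 18 controls) and that the interval is a standard Wald interval on the log odds ratio; neither assumption is stated in the text, and the function name is illustrative.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: cases with score >= 1      b: cases with score 0
    c: controls with score >= 1   d: controls with score 0
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Group sizes from the text: 28 + 11 cases, 56 + 18 controls.
n_cases, n_controls = 28 + 11, 56 + 18
# Assumption: counts recovered by rounding the reported percentages.
a = round(0.692 * n_cases)       # cases correctly classified (score 1 or 2)
d = round(0.757 * n_controls)    # controls correctly classified (score 0)
b, c = n_cases - a, n_controls - d

odds_ratio, lower, upper = odds_ratio_wald(a, b, c, d)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

Under these assumptions the table is 27/12 cases and 18/56 controls, and the calculation reproduces the odds ratio of 7.00 with a 95% CI of 2.95-16.59 given in the text.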

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2 and Yusuke Nakamura1. 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of 2Pathology and 3Medicine, Faculty of Medicine, and 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; 5Center for International Cooperation, Department of Medical Sciences, and 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified the HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele revealed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44, Pc=4.6×10^-6] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86, Pc=2.6×10^-5). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41, P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
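An odds ratio, its confidence interval, and the accompanying P value are linked on the log scale, so one can be checked against the others. The sketch below assumes the logistic regression interval is a Wald interval (symmetric around the log odds ratio) and recovers the standard error, z statistic, and two-sided P value from the reported OR of 49.15 (95% CI 6.45-374.41). The function name is illustrative, and the specificity line assumes that specificity is the proportion of NVP-tolerant patients who do not carry the allele.

```python
import math

def wald_p_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover SE(log OR), the z statistic, and a two-sided p-value from an OR and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return se, z, p

se, z, p = wald_p_from_ci(49.15, 6.45, 374.41)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.5f}")

# Specificity as the share of NVP-tolerant patients without the allele
# (carrier frequency in tolerant patients is given as 1.1%).
specificity = round(100 - 1.1, 1)
print(specificity)
```

Under the Wald assumption the recovered P value is close to the reported P=0.00017, and the specificity computed from the 1.1% carrier frequency matches the 98.9% quoted above.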

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4 and Koichi Matsuda1. 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; 4Center for Genomic Medicine, RIKEN; 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; 6Department of Hepatology, Toranomon Hospital; 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9 and the Pirfenidone Clinical Study Group4. 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; 5Fourth Department of Internal Medicine, Nippon Medical School; 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a p value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9 and Draga I. Toncheva5. 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. A great number of studies have produced controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that had previously been implicated in disease susceptibility, and we genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010, odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono, N., Kubo, M., Tsuchiya, Y., Sato, H., Kitamoto, T., Saito, S., Ohnishi, Y. and Nakamura, Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi, Y., Gunji, T., Burns, J.C., Shimizu, C., Newburger, J.W., Yashiro, M., Nakamura, Yo., Yanagawa, H., Wakui, K., Fukushima, Y., Kishi, F., Hamamoto, K., Terai, M., Sato, Y., Ouchi, K., Saji, T., Nariai, A., Kaburagi, Y., Yoshikawa, T., Suzuki, K., Tanaka, T., Nagai, T., Cho, H., Fujino, A., Sekine, A., Nakamichi, R., Tsunoda, T., Kawasaki, T., Nakamura, Yu. and Hata, A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva, F.P., Hamamoto, R., Kunizaki, M., Tsuge, M., Nakamura, Y. and Furukawa, Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama, K., Satoh, S., Hamamoto, R., Sakai, Y., Nakamura, Y. and Furukawa, Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato, M., Miya, F., Kanemura, Y., Tanaka, T., Nakamura, Y. and Tsunoda, T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung, A.A.C., Wong, V.C.L., Yang, L.C., Chan, P.L., Daigo, Y., Nakamura, Y., Qi, R.Z., Miller, L., Liu, E.T.-K., Wang, L.D., Li, J.-L., Law, S., Tsao, S.W. and Lung, M.L. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo, A., Tanikawa, C., Nishidate, T., Matsuda, K., Lin, M.-L., Park, J.-H., Ohta, T., Hirata, K., Fukuda, M., Nakamura, Y. and Katagiri, T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura, M., Tamura, K., Chung, S., Honma, S., Okuyama, A., Nakamura, Y. and Nakagawa, H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani, Y., Matsuda, K., Ohishi, T., Ohtsubo, S., Yamazaki, K., Iida, A., Hosono, N., Kubo, M., Yumura, W., Nitta, K., Katagiri, T., Kawaguchi, Y., Kamatani, N. and Nakamura, Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa, C., Hanaoka, H., Nagayama, S., Tsunoda, T., Toguchida, J., Endo, K., Nakamura, Y. and Katagiri, T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet, J., Pfaff, A.W., Abidi, A., Unoki, M., Nakamura, Y., Guinard, M., Klein, J.-P., Candolfi, E. and Mousli, M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato, N., Miyata, T., Tabara, Y., Katsuya, T., Yanai, K., Hanada, H., Kamide, K., Nakura, J., Kohara, K., Takeuchi, F., Mano, H., Yasunami, M., Kimura, A., Kita, Y., Ueshima, H., Nakayama, T., Soma, M., Hata, A., Fujioka, A., Kawano, Y., Nakao, K., Sekine, A., Yoshida, T., Nakamura, Y., Saruta, T., Ogihara, T., Sugano, S., Miki, T. and Tomoike, H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi, T., Iida, A., Otsubo, S., Kamatani, Y., Usami, M., Takei, T., Uchida, K., Tsuchiya, K., Saito, S., Ohnishi, Y., Tokunaga, K., Nitta, K., Kawaguchi, Y., Kamatani, N., Kochi, Y., Shimane, K., Yamamoto, K., Nakamura, Y., Yumura, W. and Matsuda, K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo, Y. and Nakamura, Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani, K., Mushiroda, T., Kubo, M., Zembutsu, H., Sugiyama, Y. and Nakamura, Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani, K., Mushiroda, T., Sasa, M., Bando, Y., Sumitomo, I., Hosono, N., Kubo, M., Nakamura, Y. and Zembutsu, H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato, T., Sato, N., Takano, A., Miyamoto, M., Nishimura, H., Tsuchiya, E., Kondo, S., Nakamura, Y. and Daigo, Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa, A., Farrington, S.M., Prendergast, J.G., Porteous, M.E., Walker, M., Haq, N., Barnetson, R.A., Theodoratou, E., Cetnarskyj, R., Cartwright, N., Semple, C., Clark, A.J., Reid, F.J., Smith, L.A., Kavoussanakis, K., Koessler, T., Pharoah, P.D., Buch, S., Schafmayer, C., Tepel, J., Schreiber, S., Völzke, H., Schmidt, C.O., Hampe, J., Chang-Claude, J., Hoffmeister, M., Brenner, H., Wilkening, S., Canzian, F., Capella, G., Moreno, V., Deary, I.J., Starr, J.M., Tomlinson, I.P., Kemp, Z., Howarth, K., Carvajal-Carmona, L., Webb, E., Broderick, P., Vijayakrishnan, J., Houlston, R.S., Rennert, G., Ballinger, D., Rozek, L., Gruber, S.B., Matsuda, K., Kidokoro, T., Nakamura, Y., Zanke, B.W., Greenwood, C.M., Rangrej, J., Kustra, R., Montpetit, A., Hudson, T.J., Gallinger, S., Campbell, H. and Dunlop, M.G. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani, H., Iida, A., Nakajima, M., Furuichi, T., Miyamoto, Y., Tsunoda, T., Sudo, A., Kotani, A., Uchida, K., Ozaki, K., Tanaka, Y., Nakamura, Y., Tanaka, T., Notoya, K. and Ikegawa, S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami, Y., Kono, K., Daigo, Y., Takano, A., Tsunoda, T., Kawaguchi, Y., Nakamura, Y. and Fujii, H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda, T., Wattanapokayakit, S., Takahashi, A., Nukiwa, T., Kudoh, S., Ogura, T., Taniguchi, H., Pirfenidone Clinical Study Group, Kubo, M., Kamatani, N. and Nakamura, Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa, M., Kashiwaya, K., Furihara, M., Eguchi, H., Ohigashi, H., Ishikawa, O., Shinomura, Y., Imai, K., Nakamura, Y. and Nakagawa, H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23 Study Group of Millennium Genome Projectfor Cancer Sakamoto H Yoshimura KSaeki N Katai H Shimoda T MatsunoY Saito D Sugimura H Tanioka FKato S Matsukura N Matsuda N Naka-mura T Hyodo I Nishina T Yasui WHirose H Hayashi M Toshiro EOhnami S Sekine A Sato Y Totsuka HAndo M Takemura R Takahashi Y Oh-daira M Aoki K Honmyo I Chiku SAoyagi K Sasaki H Ohnami S Yanagi-hara K Yoon KA Kook MC Lee YSPark SR Kim CG Choi IJ Yoshida TNakamura Y and Hirohashi S Geneticvariation in PSCA is associated with suscep-tibility to diffuse-type gastric cancer NatGenet 40 730-740 2008

24 Ueki T Nishidate T Park JH Lin MLShimo A Hirata K Nakamura Y andKatagiri T Involvement of elevated expres-sion of multiple cell-cycle regulator DTLRAMP (denticlelessRA-regulated nuclearmatrix associated protein) in the growth ofbreast cancer cells Oncogene 27 5672-56832008

25 Miyamoto Y Shi D Nakajima M OzakiK Sudo A Kotani A Uchida A TanakaT Fukui N Tsunoda T Takahashi ANakamura Y Jiang Q and Ikegawa SCommon variants in DVWA on chromo-some 3p243 are associated with susceptibil-ity to knee osteoarthritis Nat Genet 40 994-998 2008

26 Unoki H Takahashi A Kawaguchi THara K Horikoshi M Andersen G NgDP Holmkvist J Borch-Johnsen KJorgensen T Sandbaek A Lauritzen THansen T Nurbaya S Tsunoda T KuboM Babazono T Hirose H Hayashi MIwamoto Y Kashiwagi A Kaku KKawamori R Tai ES Pedersen O Ka-matani N Kadowaki T Kikkawa RNakamura Y and Maeda S SNPs inKCNQ1 are associated with susceptibility totype 2 diabetes in East Asian and Europeanpopulations Nat Genet 40 1098-1102 2008

27 Harao M Hirata S Irie A Senju SNakatsura T Komori H Ikuta Y Yok-omine K Imai K Inoue M Harada KMori T Tsunoda T Nakatsuru S DaigoY Nomori H Nakamura Y Baba H andNishimura Y HLA-A2-restricted CTL epi-topes of a novel lung cancer-associated can-

cer testis antigen cell division cycle associ-ated 1 can induce tumor-reactive CTL IntJ Cancer 123 2616-2625 2008

28 Imai K Hirata S Irie A Senju S IkutaY Yokomine K Harao M Inoue MTsunoda T Nakatsuru S Nakagawa HNakamura Y Baba H and Nishimura YIdentification of a novel tumor-associatedantigen cadherin 3P-cadherin as a possibletarget for immunotherapy of pancreatic gas-tric and colorectal cancers Clin Cancer Res14 6487-6495 2008

29 Nikolova DN Zembutsu H Sechanov TVidinov K Kee LS Ivanova R BechevaE Kocova M Toncheva D and Naka-mura Y Identification of molecular targetsfor treatment of thyroid carcinoma OncolRep 20 105-121 2008

30 Nakamura Y Pharmacogenomics and drugtoxicity (Editorial) New Eng J Med 359856-858 2008

31 Arita K Ariyoshi M Tochio H Naka-mura Y and Shirakawa M Hemi-methylated DNA recognition by the SRAprotein Np95 via a base flipping mecha-nism Nature 455 818-821 2008

32 Inoue H Iga M Nabeta H Yokoo TSuehiro Y Okano S Inoue M Kinoh HKatagiri T Takayama K Yonemitsu YHasegawa M Nakamura Y Nakanishi Yand Tani K Non-transmissible SeV encod-ing GM-CSF is a novel and potent vectorsystem to produce autologous tumor vac-cines Cancer Sci 99 2315-2326 2008

33 Konda R Sugimura J Sohma F Katagiri TNakamura Y Fujioka T Over expression ofhypoxia-inducible protein 2 hypoxia-inducible factor-1αand nuclear factor κBis putatively involved in acquired renal cystformation and subsequent tumor transfor-mation in patients with end stage renal fail-ure J Urol 180 481-485 2008

34 Hotta K Nakata Y Matsuo T KamoharaS Kotani K Komatsu R Itoh N MineoI Wada J Masuzaki H Yoneda MNakajima A Miyazaki S Tokunaga KKawamoto M Funahashi T HamaguchiK Yamada K Hanafusa T Oikawa SYoshimatsu H Nakao K Sakata T Mat-suzawa Y Tanaka K Kamatani N andNakamura Y Variations in the FTO gene areassociated with severe obesity in the Japa-nese J Hum Genet 53 546-553 2008

35 Kato M Nakamura Y and Tsunoda T Analgorithm for inferring complex haplotypesin a region of copy-number variation Am JHum Genet 83 157-169 2008

36 Kato M Nakamura Y and Tsunoda TMOCSphaser a haplotype inference tool

138

from a mixture of copy number variationand single nucleotide polymorphism dataBioinformatics 24 1645-1646 2008

37 Yasuda K Miyake K Horikawa Y HaraK Osawa H Furuta H Hirota Y MoriH Jonsson A Sato Y Yamagata K Hi-nokio Y Wang HY Tanahashi T Naka-mura N Oka Y Iwasaki N Iwamoto YYamada Y Seino Y Maegawa H Kashi-wagi A Takeda J Maeda E Shin HDCho YM Park KS Lee HK Ng MCMa RC So WY Chan JC Lyssenko VTuomi T Nilsson P Groop L KamataniN Sekine A Nakamura Y Yamamoto KYoshida T Tokunaga K Itakura M Mak-ino H Nanjo K Kadowaki T and KasugaM Variants in KCNQ1 are associated withsusceptibility to type 2 diabetes mellitusNat Genet 40 1092-1097 2008

38 Yamaguchi-Kabata Y Nakazono K Taka-hashi A Saito S Hosono N Kubo MNakamura Y and Kamatani N Japanesepopulation structure based on SNP geno-types from 7003 individuals compared toother ethnic groups Effects on population-based association studies Am J HumGenet 83 445-456 2008

39 Okada Y Mori M Yamada R Suzuki AKobayashi K Kubo M Nakamura Y andYamamoto K SLC22A4 polymorphism andrheumatoid arthritis susceptibility A replica-tion study in a Japanese population and ametaanalysis J Rheumatol 35 1723-17282008

40 Omori S Tanaka Y Takahashi A HiroseH Kashiwagi A Kaku K Kawamori RNakamura Y and Maeda S Association ofCDKAL1 IGF2BP2 CDKN2AB HHEXSLC30A8 and KCNJ11 with susceptibility oftype 2 diabetes in a Japanese populationDiabetes 57 791-795 2008

41 Misawa K Fujii S Yamazaki T Taka-hashi A Takasaki J Yanagisawa M Oh-nishi Y Nakamura Y and Kamatani NNew correction algorithms for multiple com-parisons in case-control multilocus associa-tion studies based on haplotypes and diplo-type configurations J Hum Genet 53 789-801 2008

42 Chantarangsu S Mushiroda T Mahasiri-mongkol S Kiertiburanakul S Sungkanu-parph S Manosuthi W Tantisiriwat WCharoenyingwattana A Sura T Chan-tratita W and Nakamura Y HLA-B 3505allele is a strong predictor for nevirapine-induced skin adverse drug reactions in ThaiHIV-infected patients Pharmacogenet Genomics 19 139-146 2009

43 Suzuki A Yamada R Kochi Y Sawada

T Okada Y Matsuda K Kamatani YMori M Shimane K Hirabayashi YTakahashi A Tsunoda T Miyatake AKubo M Kamatani N Nakamura Y andYamamoto K Functional SNPs in CD244 in-crease the risk of rheumatoid arthritis in aJapanese population Nat Genet 40 1224-1229 2008

44 Yamazaki K Takahashi A Takazoe MKubo M Onouchi Y Fujino A KamataniN Nakamura Y and Hata A Positive asso-ciation of genetic variants in the upstreamregion of NXT2-3 with Crohnrsquos disease inJapanese patients Gut 58 228-232 2009

45 Nikolova DN Doganov N Dimitrov RAngelov K Kee LS Dimova I TonchevaD Nakamura Y and Zembutsu HGenome-wide gene expression profiles ofovarian carcinoma identification of molecu-lar targets for treatment of ovarian carci-noma Mol Med Rep in press 2008

46 Hotta K Nakamura M Nakata Y Mat-suo T Kamohara S Kotani K KomatsuR Itoh N Mineo I Wada J MasuzakiH Yoneda M Nakajima A Miyazaki STokunaga K Kawamoto M Funahashi THamaguchi K Yamada K Hanafusa TOikawa S Yoshimatsu H Nakao KSakata T Matsuzawa Y Tanaka K Ka-matani N and Nakamura Y INSIG2 geners7566605 polymorphism is associated withsevere obesity in Japanese J Hum Genet53 857-862 2008

47 Iwahori K Osaki T Serada S FujimotoM Suzuki H Kishi Y Yokoyama A Ha-mada H Fujii Y Yamaguchi KHirashima T Matsui K Tachibana INakamura Y Kawase I and Naka TMegakaryocyte potentiating factor as a tu-mor maker of malignant pleural mesothe-lioma Evaluation in comparison with meso-thelin Lung Cancer 62 45-54 2008

48 Hirota T Harada M Sakashita M DoiS Miyatake A Fujita K Enomoto TEbisawa M Yoshihara S Noguchi ESaito H Nakamura Y and Tamari M Ge-netic polymorphism regulating ORM1-like 3(Saccharomyces cerevisiae) expression is as-sociated with childhood atopic asthma in aJapanese population J Allergy Clin Immu-nol 121 769-770 2008

49 Harada M Hirota T Jodo AI Doi SKameda M Fujita K Miyatake A Eno-moto T Noguchi E Yoshihara SEbisawa M Saito H Matsumoto KNakamura Y Ziegler SF and Tamari MFunctional analysis of the Thymic StromalLymphopoietin Variants in Human Bron-chial Epithelial Cells Am J Respir Cell

139

Mol Biol 40 368-374 200950 Sakashita M Yoshimoto T Hirota T Ha-

rada M Okubo K Osawa Y Fujieda SNakamura Y Yasuda K Nakanishi Kand Tamari M Association of serum IL-33level and the IL-33 genetic variant withJapanese cedar pollinosis Clin Exp Allergy38 1875-1881 2008

51 Hirata D Yamabuki T Miki D Ito TTsuchiya E Fujita M Hosokawa MChayama K Nakamura Y and Daigo YInvolvement of epithelial cell transformingsequence-2 oncoantigen in lung and esopha-geal cancer progression Clin Cancer Res15 256-266 2009

52 Dobashi S Katagiri T Hirota E AshidaS Daigo Y Shuin T Fujioka T Miki Tand Nakamura Y Involvement of TMEM22overexpression in the growth of renal cellcarcinoma cells Oncol Rep 21 305-3122009

53 Zembutsu H Suzuki Y Sasaki ATsunoda T Okazaki M Yoshimoto MHasegawa T Hirata K and Nakamura YPredicting response to Docetaxel neoadju-vant chemotherapy for advanced breast can-cers through genome-wide gene expressionprofiling Int J Oncol 34 361-370 2009

54 Nakamura Y DNA variations in humanand medical genetics 25 years of my experi-ence (review) J Hum Genet 54 1-8 2009

55 Ozaki K Sato H Inoue K Tsunoda TSakata Y Mizuno H Lin T-H Mi-yamoto Y Aoki A Onouchi Y Sheu S-H Ikegawa S Odashiro K NobuyoshiM Juo S-H H Hori M Nakamura Yand Tanaka TA functional variation inBRAP confers risk of myocardial infarctionin Asian populations Nat Genet in press2009

56 Kashiwaya K Hosokawa M Eguchi HOhigashi H Ishikawa O Shinomura YNakamura Y and Nakagawa H Identifica-tion of C2orf18 Termed ANT2BP (ANT2-binding protein) as one of key molecules in-volved in pancreatic carcinogenesis CancerSci 100 457-464 2009

57 Nagayama S Yamada E Kohno YAoyama T Fukukawa C Kubo HWatanabe G Katagiri T Nakamura YSakai Y and Toguchida J Inverse correla-tion of the upregulation of FZD10 expres-sion and the activation of β-catenin in syn-chronous colorectal tumors Cancer Sci inpress 2009

58 Ueda K Fukase Y Katagiri T IshikawaN Irie S Sato T Ito H Nakayama HMiyagi Y Tsuchiya E Kohno N ShiwaM Nakamura Y and Daigo Y Targeted

glycoproteomics for the discovery of lungcancer-associated glycosylation disorders us-ing lectin-coupled ProteinChip arrays Pro-teomocs in press 2009

59 The International Warfarin Pharmacogenet-ics Consortium Improved warfarin dosingwith a global pharmacogenetic algorithm NEngl J Med 360 753-764 2009

60 Betcheva ET Mushiroda T Takahashi AKubo M Karachanak SK Zaharieva ITVazharova RV Dimova II Milanova VK Tolev T Kirov G Owenm MJOrsquoDonovanm MC Kamatanim N Naka-mura Y and Toncheva DI Case-control as-sociation study of 59 candidate genes re-veals the DRD2 SNP rs6277 (C957T) as theonly susceptibility factor for schizophreniain Bulgarian population J Hum Genet 5498-107 2009

61 Fukukawa C Nagayama S Tsunoda TToguchida J Nakamura Y and Katagiri TActivation of non-canonical Dvl-Rac1-JNKpathway by Frizzled-homologue 10 (FZD10)in human synovial sarcoma Oncogene inpress 2009

62 Yosifova A Mushiroda T Stoianov DVazharova R Dimova I Karachanak SZaharieva I Milanova V Madjirova NGerdjikov I Tolev T Velkova S KirovG Owen MJ OrsquoDonovan MC TonchevaD and Nakamura Y Case-control associa-tion study of 65 candidate genes revealed apossible association of a SNP of HTR5A tobe a factor susceptible to bipolar disease inBulgarian population J Affective Disordersin press 2009

63 Kamatani Y Wattanapokayakit S OchiH Kawaguchi T Takahashi A HosonoN Kubo M Tsunoda T Kamatani NKumada H Puseenam A Sura T DaigoY Chayama K Chantratita W Naka-mura Y and Matsuda K Identification ofassociation of genetic variations in HLA-DPlocus with chronic hepatitis B in Asianpopulation through genome-wide associa-tion study Nat Genet in press 2009

64 Tamura K Furihata M Chung S Ue-mura M Yoshioka H Iiyama T AshidaS Nasu Y Fujioka T Shuin T Naka-mura Y and Nakagawa H Stanniocalcin 2( STC 2 ) over-expression in castration-resistant prostate cancer and aggressiveprostate cancer Cancer Sci in press 2009

65 Tsukada H Ochi H Maekawa T AbeH Fujimoto Y Tsuge M Takahashi HKumada H Kamatani N Nakamura Yand Chayama K Hiroshima Liver StudyGroup Toranomon Hospital A Polymor-phism in MAPKAPK3 affects response to in-

140

terferon therapy for chronic hepatitis C Gas-troenterology in press 2009

66 Dunleavy EM Roche D Tagami H La-coste N Ray-Gallet D Nakamura YDaigo Y Nakatani Y and Almouzni-

Pettinotti G HJURP a key CENP-A-partnerfor maintenance and deposition of CENP-Aat centromeres at late telophaseG1 Cell inpress 2009

141

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and then attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data from a structured population

Because genetic epidemiology studies of complex genetic traits target relatively weak factors, their sample sizes need to be in the thousands or more, which makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false-positive results. One method to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can also correct Fisher's exact test statistics.
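The conventional genomic control correction mentioned above can be sketched as follows. This is only an illustration of the standard chi-square version (dividing 1-df statistics by an inflation factor estimated from their median), not the modified method for Fisher's exact test that this laboratory develops; the function name and the example values are assumptions made for the sketch.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for a list of 1-df chi-square statistics."""
    # Inflation factor: observed median divided by the expected median.
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # Common convention: never inflate the statistics when lambda < 1.
    lam = max(lam, 1.0)
    return lam, [x / lam for x in chi2_stats]

# Hypothetical statistics from five markers, for illustration only.
stats = [0.2, 0.5, 0.91, 1.3, 6.0]
lam, corrected = genomic_control(stats)
```

After correction, the median of the statistics matches the median expected under the null hypothesis, which is the property the method relies on.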

b. Characterization of the exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリー・マーク・ラスロップ
准教授 医学博士 山田 亮

c. Geometric evaluation of SNP contingency table tests

The 2×3 SNP contingency table tests are described in the context of geometry. We characterize various tests for 2×3 tables and define tests that fit biological models by interpreting the tables geometrically.
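As a point of reference for the 2×3 tables discussed in this section, the asymptotic (Pearson chi-square) test can be sketched as below. The table layout (rows: cases and controls; columns: the three SNP genotypes) and the function name are assumptions for illustration; the exact test and the geometric formulation studied by the laboratory are not reproduced here.

```python
import math

def chi2_2x3(table):
    """Pearson chi-square statistic and asymptotic p-value (2 df) for a 2x3 table.

    Assumes every row and column total is positive.
    """
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_tot[i] * col_tot[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)

# Hypothetical genotype counts (cases, controls), for illustration only.
stat, p_value = chi2_2x3([[30, 20, 10], [10, 20, 30]])
```

The asymptotic p-value is reliable only when expected counts are reasonably large, which is why the discrepancy from the exact test matters for rare genotypes.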

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population and LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is used to quantify genetic heterogeneity and in genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, so that it can support various studies based on genetic heterogeneity.
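For orientation, the conventional pairwise measures that the paragraph above contrasts with multi-marker quantitation can be sketched as follows. The sketch computes D, |D'| and r² for two diallelic markers from haplotype counts; the function and argument names are assumptions, and the laboratory's multi-marker method is not shown.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, |D'|, r^2) for two diallelic markers from four haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n          # frequency of allele A at marker 1
    p_B = (n_AB + n_aB) / n          # frequency of allele B at marker 2
    d = n_AB / n - p_A * p_B         # deviation from independence
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    # The maximum attainable |D| depends on the sign of D.
    if d > 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d, d_prime, r2

# Two markers in complete LD (only AB and ab haplotypes observed).
d, d_prime, r2 = pairwise_ld(50, 0, 0, 50)
```

These measures describe only one pair of markers at a time, which is the limitation that motivates a multi-marker measure.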

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorial theory and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp/


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To investigate this phenomenon further, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi, obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井 謙太
准教授 理学博士 木下 賢吾

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and positioning, both with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are developing our model further, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe² and Kenta Nakai

While CpG islands are often linked to promoters in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
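The two quantities that such analyses typically rest on, G+C content and the observed/expected CpG ratio, can be computed as in the sketch below. The formula for the ratio (number of CpG dinucleotides × length, divided by the product of the C and G counts) is the commonly used one; whether the study above used exactly this definition, and which thresholds or window sizes it applied, is not stated in the text, so none are assumed here.

```python
def cpg_stats(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence of neighboring bases is c * g / n.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp

# Toy sequences for illustration only.
rich = cpg_stats("CGCGCGCG")
poor = cpg_stats("ATATAT")
```

Applying the function to sliding windows around a TSS would give the positional profile of CpG richness, which is the kind of view in which a narrower CpG-rich range becomes visible.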

5. Computational verification of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from distinguishing direct from indirect regulation. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) to find the optimum threshold. For trans-acting factors whose binding sites are unknown but that have orthologues with known binding sites, we predict the sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data from a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then compared the generated matrix, with an added pseudocount value, to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
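The way a pseudocount enters a PWM, as described above, can be sketched as follows. The sketch adds a total pseudocount of 0.8 per position, split according to the background composition, before taking the log ratio. The use of base-2 logarithms, the uniform background, and the function name are assumptions made for illustration.

```python
import math

BASES = "ACGT"

def pwm_from_counts(counts, pseudocount=0.8, background=None):
    """Build a log2-odds PWM from per-position base counts.

    counts: list of dicts mapping base -> observed count at that position.
    pseudocount: total value added per position, split by background frequency.
    """
    background = background or {b: 0.25 for b in BASES}
    pwm = []
    for column in counts:
        n = sum(column.get(b, 0) for b in BASES)
        row = {}
        for b in BASES:
            # Smoothed frequency: never zero, so the log ratio stays finite.
            freq = (column.get(b, 0) + pseudocount * background[b]) / (n + pseudocount)
            row[b] = math.log2(freq / background[b])
        pwm.append(row)
    return pwm

# Ten hypothetical sites: position 1 is always A, position 2 is A or C.
pwm = pwm_from_counts([{"A": 10}, {"A": 5, "C": 5}])
```

Without the pseudocount, bases never observed at a position would receive a score of minus infinity, so a single mismatch would exclude an otherwise good site.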

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, Kenta Nakai

In order to support transcriptional studies, we have constructed the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a large number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in genic regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the one with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a 9.6% difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
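The ranking of promoters by tag count described above can be sketched as follows. The data layout (a dictionary of genes, each mapping promoter identifiers to tag counts) and both function names are assumptions. The text does not define how the discrepancy between samples was calculated, so the second function uses one plausible reading (the fraction of shared genes whose 1st promoter differs) purely for illustration.

```python
def rank_promoters(tag_counts):
    """Map each gene to (1st promoter, 2nd promoter or None) by tag count.

    tag_counts: {gene: {promoter_id: number_of_tags}}, each gene non-empty.
    """
    ranked = {}
    for gene, promoters in tag_counts.items():
        order = sorted(promoters, key=promoters.get, reverse=True)
        ranked[gene] = (order[0], order[1] if len(order) > 1 else None)
    return ranked

def first_promoter_discrepancy(sample_a, sample_b):
    """Fraction of shared genes whose 1st promoter differs between two samples."""
    ranks_a, ranks_b = rank_promoters(sample_a), rank_promoters(sample_b)
    shared = ranks_a.keys() & ranks_b.keys()
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if ranks_a[gene][0] != ranks_b[gene][0])
    return differing / len(shared)

# Hypothetical tag counts for two cell types.
cell_a = {"g1": {"p1": 10, "p2": 3}, "g2": {"p1": 5}}
cell_b = {"g1": {"p1": 2, "p2": 8}, "g2": {"p1": 7}}
ranks = rank_promoters(cell_a)
fraction = first_promoter_discrepancy(cell_a, cell_b)
```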

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has previously been reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common to the 16 eukaryotes examined, but two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp periodicity is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element, and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the overall chromatin composition of primate genomes.
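How a dinucleotide periodicity can be detected by Fourier analysis is sketched below. The sketch turns a sequence into a 0/1 indicator signal for one dinucleotide and evaluates the spectral power at a chosen period. The exact transform, normalization, and fragment handling used in the study are not given in the text, so the choices here (mean-centered signal, power divided by signal length, the function name) are assumptions.

```python
import cmath

def periodicity_power(seq, dinucleotide, period):
    """Spectral power of a dinucleotide indicator signal at a given period (in bp)."""
    # 1.0 where the dinucleotide starts, 0.0 elsewhere.
    signal = [1.0 if seq[i:i + 2] == dinucleotide else 0.0
              for i in range(len(seq) - 1)]
    n = len(signal)
    mean = sum(signal) / n
    # Single Fourier coefficient at frequency 1 / period, after removing the mean.
    coeff = sum((x - mean) * cmath.exp(-2j * cmath.pi * k / period)
                for k, x in enumerate(signal))
    return abs(coeff) ** 2 / n

# Synthetic sequence with "AA" placed every 10 bp, for illustration only.
seq = ("AA" + "C" * 8) * 20
power_10 = periodicity_power(seq, "AA", 10)
power_7 = periodicity_power(seq, "AA", 7)
```

Scanning a range of periods and comparing the spectrum before and after masking repeats would show which peaks are attributable to the masked elements.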

9 Estimation and Comparison of minimalcellular function sets for bacteria and eu-karyotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by the reduction of the genome of a living cell. But the “minimal gene set” obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular function instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes in all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells do. The organism-specific pathway maps are then detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.
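The selection of organism-specific pathway maps can be pictured as a search for pathways that supply, from nutrients, the compounds the conserved pathways need. The sketch below is a deliberately simplified toy, not the method used in the study: each pathway is reduced to a hypothetical (inputs, outputs) pair, and the pathway and compound names are invented for illustration.

```python
def select_supporting_pathways(required, candidates, nutrients):
    """Pick candidate pathways that supply `required` compounds starting from nutrients.

    `candidates` maps a pathway name to (input compounds, output compounds).
    """
    resolved = set(nutrients)
    pending = sorted(c for c in required if c not in resolved)
    selected = []
    while pending:
        compound = pending.pop()
        if compound in resolved:
            continue
        producers = sorted(name for name, (_, outs) in candidates.items()
                           if compound in outs)
        if not producers:
            raise ValueError(f"no pathway produces {compound}")
        name = producers[0]  # arbitrary but deterministic choice among producers
        if name not in selected:
            selected.append(name)
        inputs, outputs = candidates[name]
        resolved |= outputs
        # The chosen pathway's own inputs must in turn be supplied.
        pending.extend(c for c in inputs if c not in resolved)
    return sorted(selected)

# Hypothetical toy data.
nutrients = {"glucose", "ammonia"}
required_by_conserved = {"ATP", "amino_acid"}
candidates = {
    "glycolysis": ({"glucose"}, {"pyruvate", "ATP"}),
    "aa_synthesis": ({"pyruvate", "ammonia"}, {"amino_acid"}),
    "photosynthesis": ({"light"}, {"glucose"}),
}
chosen = select_supporting_pathways(required_by_conserved, candidates, nutrients)
print(chosen)
```

Here `photosynthesis` is not selected because glucose is already available as a nutrient. A realistic version would work on KEGG compound and reaction identifiers and would need a principled rule for choosing among alternative producers.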

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue to predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.
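The abstract does not name the new coexpression measure. As a sketch only, the code below assumes a rank-based measure: the geometric mean of the two directional correlation ranks between a pair of genes, sometimes called a mutual rank. The gene names and expression values are hypothetical, and the helper functions are written for this example rather than taken from ATTED-II.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_rank(expr, gene, partner):
    """Rank of `partner` among all other genes by correlation with `gene` (1 = highest)."""
    others = [g for g in expr if g != gene]
    ordered = sorted(others, key=lambda g: pearson(expr[gene], expr[g]), reverse=True)
    return ordered.index(partner) + 1

def mutual_rank(expr, a, b):
    """Geometric mean of the rank of b seen from a and the rank of a seen from b."""
    return math.sqrt(correlation_rank(expr, a, b) * correlation_rank(expr, b, a))

# Hypothetical expression profiles over five samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.0, 2.9, 4.2, 5.1],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
    "geneD": [2.0, 2.1, 1.9, 2.0, 2.2],
}
mr_ab = mutual_rank(expr, "geneA", "geneB")
mr_ac = mutual_rank(expr, "geneA", "geneC")
print(mr_ab, mr_ac)
```

A smaller value means a tighter coexpression relationship. One motivation for a rank-based measure of this kind is that it is symmetric in the two genes while being less sensitive than a raw correlation to how many strongly correlated partners each gene happens to have.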

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.
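The idea of mapping binding sites from several complexes onto one protein can be sketched as a per-residue merge. This is an illustrative toy, not PiSite's data model: the record layout, the protein and partner labels, and the residue numbers are all hypothetical.

```python
from collections import defaultdict

def merge_binding_sites(complexes, protein):
    """Map each interface residue of `protein` to the set of partners that contact it."""
    sites = defaultdict(set)
    for entry in complexes:
        if entry["protein"] == protein:
            for residue in entry["interface_residues"]:
                sites[residue].add(entry["partner"])
    return dict(sites)

# Hypothetical complexes observed for the same protein with different partners.
complexes = [
    {"protein": "P1", "partner": "A", "interface_residues": [10, 11, 12]},
    {"protein": "P1", "partner": "B", "interface_residues": [12, 40, 41]},
    {"protein": "P2", "partner": "A", "interface_residues": [5]},
]
sites = merge_binding_sites(complexes, "P1")
shared = sorted(r for r, partners in sites.items() if len(partners) > 1)
print(sites, shared)
```

Residues contacted by more than one partner (residue 12 in this toy case) are the kind of information that a single complex cannot reveal.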

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR fell below 0.3 Å⁻¹ (approximately where protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.
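To make the SVR definition concrete, the short sketch below computes the ratio for an idealized sphere, for which it reduces to 3 divided by the radius. A sphere is only a geometric stand-in: real proteins have irregular, probe-accessible surfaces, so these numbers illustrate the trend of SVR falling with size and are not meant to reproduce the 132-residue threshold. The function names are chosen for this example.

```python
import math

def surface_to_volume_ratio(surface_area, volume):
    """SVR in 1/Å, given a surface area in Å^2 and a volume in Å^3."""
    return surface_area / volume

def sphere_svr(radius):
    """SVR of an idealized sphere; algebraically this equals 3 / radius."""
    area = 4.0 * math.pi * radius ** 2
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return surface_to_volume_ratio(area, volume)

# SVR shrinks as the (hypothetical) spherical body grows.
svr_by_radius = {r: sphere_svr(r) for r in (8.0, 10.0, 15.0, 20.0)}
for radius, svr in svr_by_radius.items():
    print(f"radius {radius:5.1f} Å  ->  SVR {svr:.3f} 1/Å")
```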

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that catalyzes ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns, to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin’s hydrophobic region. On the other hand, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba H Yamashita R Kinoshita K andNakai K Weak correlation between sequenceconservation in promoter regions and inprotein-coding regions of human-mouseorthologous gene pairs BMC Genomics 9 1522008

Genome Information Integration Project and H-invitational 2 Consortium The H-InvitationalDatabase (H-InvDB) a comprehensive annota-tion resource for human genes and tran-scripts Nucl Acids Res 36 D793-D799 2008

Hatada I Morita S Kimura M Horii TYamashita R and Nakai K Genome-widedemethylation during neural differentiation ofP19 embryonal carcinoma cells J HumanGenet 53 (2) 185-191 2008

Hatanaka Y Nagasaki M Yamaguchi RObayashi T Numata K Imoto S Shima-mura T Kinoshita K Nakai K and Miy-ano S A novel strategy to search concertedtranscription factor activities using gene ex-pression profile and genomic data Genome In-formatics 20 212-221 2008

Higurashi M Ishida T and Kinoshita KPiSite a database of protein interaction sitesusing multiple binding states in the PDB Nu-cleic Acids Res 37 D360-364 2009

Ikura T Kinoshita K and Ito N A cavity withan appropriate size is the basis of the PPIaseactivity Protein Eng Des Sel 21 83-89 2008

Ishida T and Kinoshita K Prediction of disor-dered protein regions based on meta-approach Bioinformatics 24 1344-1348 2008

Maeda M and Kinoshita K Development ofnew indices to evaluate protein-protein inter-faces Assembling space volume assembling

space distance and global shape descriptor JMol Graph Mod 27 706-711 2009

Miura K Toh H Hirakawa H Sugii M Mu-rata M Nakai K Tashiro K Kuhara SAzuma Y and Shirai M Genome-wideanalysis of Chlamydophila pneumoniae gene ex-pression at the late stage of infection DNARes 15 (2) 83-91 2008

Murakami K Imanishi T Gojobori T andNakai K Two different classes of co-occurring motif pairs found by a novel visu-alization method in human promoter regionsBMC Genomics 9 (1) 112 2008

Nishida K Frith M and Nakai K Pseudo-counts for transcription factor binding sitesNucl Acids Res 37 939-944 2009 publishedonline on December 23 2008

Obayashi T Hayashi S Shibaoka M SaekiM Ohta H and Kinoshita K COXPRESdb adatabase of coexpressed gene networks inmammals Nucleic Acids Res 36 D77-82 2008

Obayashi T Hayashi S Saeki M Ohta Hand Kinoshita K ATTED-II provides coex-pressed gene networks for Arabidopsis Nu-cleic Acids Res 37 D987-991 2009

Okamura K and Nakai K Retrotranspositionas a source of new promoters Mol Biol Evol 25 (6) 1231-1238 2008

Sierro N Makita Y de Hoon M and NakaiK DBTBS a database of transcriptional regu-lation in Bacillus subtilis containing upstreamintergenic conservation information Nucl Ac-ids Res 36 D93-D96 2008

Sierro N Li S Suzuki Y Yamashita R andNakai K Spatial and temporal preferences fortrans-splicing in Ciona intestinalis revealed by


EST-based gene expression analysis Gene430 44-49 2009 available online on October21 2008

Shirota M Ishida T and Kinoshita K Effectsof surface-to-volume ratio of proteins on hy-drophilic residues decrease in occurrence andincrease in buried fraction Protein Sci 171596-1602 2008

Tsuchihara K Suzuki Y Wakaguri H IrieT Tanimoto K Hashimoto S MatsushimaK Mizushima-Sugano J Yamashita RNakai K Bentley D Esumi H and SuganoS Massive transcriptional start site analysis ofhuman genes in hypoxia cells Nucl Acids Resin press

Tsuchiya Y Nakamura H and Kinoshita KDiscrimination between biological interfacesand crystal-packing contacts Compt Biol Chem 1 99-113 2008

Vandenbon A Miyamoto Y Takimoto NKusakabe T and Nakai K Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction DNARes 15 (1) 3-11 2008

Vandenbon A and Nakai K Using simplerules on presence and positioning of motifsfor promoter structure modeling and tissuespecific expression prediction Genome Infor-matics Edited by Arthur J and Ng S-K (Im-

perial College Press London) vol 21 pp 188-199 2008

Wakaguri H Yamashita R Suzuki YSugano S and Nakai K DBTSS DataBase ofTranscription Start Sites progress report 2008Nucl Acids Res 36 D97-D101 2008

Yamashita R Suzuki Y Takeuchi N Wak-aguri H Ueda T Sugano S and Nakai KComprehensive detection of human terminaloligo-pyrimidine (TOP) gene and analysis oftheir characteristics Nucl Acids Res 36 (11)3707-3715 2008

Kinoshita K Kono H and Yura K Predictionof molecular interactions from 3D-structuresfrom small ligands to large protein complexesEdited by Bujnicki J (Wiley and Sons USA)in printing 2009伊倉貞吉木下賢吾伊藤暢聡ペプチジルプロリルイソメラーゼの構造機能相関蛋白質核酸酵素54167―1722009木下賢吾立体構造からのタンパク質機能予測現状と展望遺伝子医学MOOK14号in press中井謙太ポールホートン第3章 3アミノ酸配列に基づくタンパク質の細胞内局在予測実験医学増刊 vol261106―11122008中井謙太タンパク質のシステム生物学猪飼伏見卜部上野川中村浜窪編タンパク質の事典朝倉書店575―5782008


Human Genome Center

Department of Public Policy (公共政策研究分野)

Associate Professor Kaori Muto, Ph.D. (Health Sciences)
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, Master of Laws

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and “minority-centered” scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the “BioBank Japan” project from ethical, legal, and social standpoints, and completed the first questionnaire survey. We held the SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study on research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We have interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory differences between South Korea and Japan. After the fabrication in Hwang Woo-suk’s stem cell cloning research and the unethical human egg collection, the bioethics law has been revised, and the government seeks stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing, in terms of regulations, in East Asia.

2. Establishment of the Office of Research Ethics (ORE)

Under the Dean’s courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK, and South Korea, we examined the current problems that tend to stall a smooth research review process, so as to secure quality assurance of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for “bench consulting” regarding consent, research protocols, and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for the “BioBank Japan” project

To support the “BioBank Japan” project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted our questionnaire survey of participants in the BioBank Japan project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants’ understanding of the overview of the research, and may be unethical rather than ethical. If we long for “personalized medicine”, we should think further about the construction of a “personalized consent process”, and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate the MCs’ attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We have found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even though participants may forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To appreciate participants’ and MCs’ contributions to the project, we issued “BioBank newsletters” three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to the sighted with good eyesight, we are making efforts toward personalized information security to accommodate participants’ disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim for the sharing of public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called “SciArt Café”, twice in 2008. Our original intent for “SciArt Café” is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called “Rhythm generated by network”, was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called “Doing science, doing art”, was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1 Ishiyama I Nagai A Muto K Tamakoshi AKokado M Mimura K Tanzawa T Yama-

gata Z Relationship between Public Atti-tudes toward Genomic Studies Related to


Medicine and Their Level of Genomic Liter-acy in Japan American Journal of MedicalGenetics 146A (13) 696-706 2008

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝

学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008



for identification of other fungal secondary metabolite gene clusters, especially for secondary metabolite gene clusters that are severely regulated by LaeA or other proteins with a function similar to that of LaeA.

c. ExonMiner: Web service for analysis of GeneChip exon array data

Kazuyuki Numata, Ryo Yoshida¹, Masao Nagasaki, Ayumu Saito, Seiya Imoto, Satoru Miyano

Some splicing isoform-specific transcriptional regulations are related to disease. Therefore, detection of disease-specific splice variations is the first step in finding disease-specific transcriptional regulations. The Affymetrix Human Exon 1.0 ST Array can measure exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, the exon array produces massive datasets that are more than we can handle and analyze on a personal computer. We have developed ExonMiner, the first all-in-one web service for analysis of exon array data, to detect transcripts that have significantly different splicing patterns in two cells, e.g., normal and cancer cells. ExonMiner can perform the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) finding transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on the supercomputer system of the Human Genome Center in order to perform genome-wide analysis for more than 300,000 transcripts in exon array data, which has the potential to reveal aberrant splice variations in cancer cells as exon-level biomarkers. ExonMiner is well suited for analysis of exon array data and does not require installation of any software other than an internet browser. The URL of ExonMiner is http://ae.hgc.jp/exonminer. Users can analyze a full exon array dataset within hours by high-level statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.
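The two-way ANOVA step can be illustrated with a small sketch. It assumes, as one plausible reading of the description above, that the two factors are exon and cell condition, so that a transcript with a different splicing pattern shows up as a large exon-by-condition interaction. The function below computes only that interaction F statistic for a balanced design; it is not ExonMiner's implementation, and the expression values are hypothetical.

```python
def interaction_f(data):
    """F statistic for the condition x exon interaction in a balanced two-way ANOVA.

    `data[condition][exon]` is a list of replicate expression values.
    """
    conds = sorted(data)
    exons = sorted(data[conds[0]])
    n = len(data[conds[0]][exons[0]])  # replicates per cell (balanced design assumed)
    a, b = len(conds), len(exons)
    cell = {(c, e): sum(data[c][e]) / n for c in conds for e in exons}
    grand = sum(cell.values()) / (a * b)
    cond_mean = {c: sum(cell[c, e] for e in exons) / b for c in conds}
    exon_mean = {e: sum(cell[c, e] for c in conds) / a for e in exons}
    # Interaction: what remains of each cell mean after removing both main effects.
    ss_int = n * sum((cell[c, e] - cond_mean[c] - exon_mean[e] + grand) ** 2
                     for c in conds for e in exons)
    ss_err = sum((x - cell[c, e]) ** 2
                 for c in conds for e in exons for x in data[c][e])
    df_int = (a - 1) * (b - 1)
    df_err = a * b * (n - 1)
    return (ss_int / df_int) / (ss_err / df_err)

normal = {"e1": [8.0, 8.2, 7.9], "e2": [6.0, 6.1, 5.9], "e3": [7.0, 7.2, 6.9]}
# Transcript 1: every exon shifts by the same amount in cancer (no splicing change).
parallel = {"normal": normal,
            "cancer": {e: [x + 1.0 for x in v] for e, v in normal.items()}}
# Transcript 2: exon e2 is strongly reduced in cancer only (candidate exon skipping).
skipped = {"normal": normal,
           "cancer": {"e1": [9.0, 9.2, 8.9], "e2": [3.0, 3.1, 2.9], "e3": [8.0, 8.2, 7.9]}}
f_parallel = interaction_f(parallel)
f_skipped = interaction_f(skipped)
print(f_parallel, f_skipped)
```

The uniformly shifted transcript gives an interaction F close to zero, while the transcript with one exon selectively reduced gives a very large one. Turning F into a P value would require the F distribution with `df_int` and `df_err` degrees of freedom, and a genome-wide scan would also need multiple-testing correction.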

Publications

1 Ando T Konishi S Imoto S Nonlinear re-gression modeling via regularized radial ba-sis function networks Journal of StatisticalPlanning and Inference 138 (11) 3616-36332008

2 Brazma A Miyano S Akutsu T Proceed-ings of the 6th Asia-Pacific BioinformaticsConference (APBC 2008) Imperial CollegePress 2008

3 Do JH Miyano S The GC and window-averaged DNA curvature profile of secon-dary metabolite gene cluster in Aspergillusfumigatus genome Applied Microbiologyand Biotechnology 80 (5) 841-847 2008

4 Fujita A Gomes LR Sato JR Yama-guchi R Thomaz CE Sogayar MC Miy-ano S Multivariate gene expression analysisreveals functional connectivity changes be-tween normaltumoral prostates BMC Sys-tems Biology 2 106 2008

5 Fujita A Sato JR Garay-Malpartida HM Sogayar MC Ferreira CE Miyano SModeling nonlinear gene regulatory net-works from time series gene expressiondata J Bioinformatics and ComputationalBiology 6 (5) 961-979 2008

6 Hatanaka Y Nagasaki M Yamaguchi RObayashi T Numata K Fujita A Shima-mura T Tamada Y Imoto S KinoshitaK Nakai K Miyano S A novel strategy tosearch conserved transcription factor bind-

ing sites among coexpressing genes in hu-man Genome Informatics 20 212-221 2008

7 Hirose O Yoshida R Imoto S Yama-guchi R Higuchi T Charnock-Jones DSPrint C Miyano S Statistical inference oftranscriptional module-based gene networksfrom time course gene expression profiles byusing state space models Bioinformatics 24(7) 932-942 2008

8 Hirose O Yoshida R Yamaguchi RImoto S Higuchi T Miyano S Analyzingtime course gene expression data with bio-logical and technical replicates to estimategene networks by state space models Proc2nd Asia International Conference on Mod-elling amp Simulation 940-946 2008 (AMS2008 Refereed conference)

9 Jeong E Nagasaki M Miyano S Rule-based reasoning for system dynamics in cellsystems Genome Informatics 20 25-362008

10 Kitakaze H Kanda M Nakatsuka HIkeda N Matsuno H Miyano S Predic-tion of fragile points for robustness checkingof cell systems IEICE TRANSACTIONS onInformation and Systems D J91-D (9) 2404-2417 2008

11 Knapp E-W Benson G Holzhutter H-GKanehisa M Miyano S (Eds) Genome In-formatics 20 2008

12 Kojima K Fujita A Shimamura T Imoto


S Miyano S Estimation of nonlinear generegulatory networks via L1 regularizedNVAR from time series gene expressiondata Genome Informatics 20 37-51 2008

13 Kojima K Nagasaki M Miyano S Fastgrid layout algorithm for biological net-works with sweep calculation Bioinformat-ics 24 (12) 1426-1432 2008

14 Mito N Ikegami Y Matsuno H MiyanoS Inouye S Simulation analysis for the ef-fect of light-dark cycle on the entrainment incircadian rhythm Genome Informatics 21212-223 2008

15 Nagasaki M Saito A Chen L Jeong EMiyano S Systematic reconstruction ofTRANSPATH data into Cell System MarkupLanguage BMC Systems Biology 2 532008

16 Niida A Smith AD Imoto S TsutsumiS Aburatani H Zhang MQ Akiyama TIntegrative bioinformatics analysis of tran-scriptional regulatory programs in breastcancer cells BMC Bioinformatics 9 4042008

17 Numata K Yoshida R Nagasaki M

Saito S Imoto S Miyano S ExonMinerWeb service for analysis of GeneChip exonarray data BMC Bioinformatics 9 494 2008

18 Numata K Imoto S Miyano S Partialorder-based Bayesian network learning algo-rithm for estimating gene networks ProcIEEE 8th International Symposium on Bioin-formatics amp Bioengineering IEEE ComputerSociety 357-360 2008 (BIBM 2008 Refereedconference)

19 Perrier E Imoto S Miyano S Finding op-timal Bayesian network given a super-structure J Machine Learning Research 92251-2286 2008

20 Yamaguchi R Imoto S Yamauchi M Na-gasaki M Yoshida R Shimamura THatanaka Y Ueno K Higuchi T GotohN Miyano S Predicting differences in generegulatory systems by state space modelsGenome Informatics 21 101-113 2008

21 Yoshida R Nagasaki M Yamaguchi RImoto S Miyano S Higuchi T Bayesianlearning of biological pathways on genomicdata assimilation Bioinformatics 24(22)2592-2601 2008


Human Genome Center

Laboratory of Molecular Medicine (ゲノムシークエンス解析分野)
Laboratory of Genome Technology (シークエンス技術開発分野)

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes, and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P=0.0053) and poorer prognosis of non-small cell lung cancer patients (P=0.0045). It was also shown to be an independent prognostic factor (P=0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data imply that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) using a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with non-small cell lung cancer (NSCLC) (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. ECT2, a cancer-testis antigen, is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. We here report characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and in all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells: nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein became highest at G1/S phases, whereas its phosphorylation level was enhanced during the mitotic phase. Treatment of the breast cancer cell lines T47D and HBC4 with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G2/M cells, resulting in growth inhibition of cancer cells. We further demonstrated the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in the growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in growth of RCC and that inhibition of TMEM22/RAB37 expression, or of their interaction, may be a novel therapeutic strategy for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)


We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl) 2/Dvl3 complex, as well as activation of the Rac1-JNK cascade, in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been implicated in apoptosis. These findings indicate that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it is a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes, in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 had been reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression significantly attenuated the viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes, in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) had previously been reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac (α2,6) Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated a higher frequency of serum apoC-III with loss of α2,6-linked Neu5Ac residues in lung cancer patients compared to controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylation, and suggest that it may be applicable to studies of other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week, and of those who required 49 mg or more per week, to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week, or 49 mg or more per week, for therapeutic anticoagulation.
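The evaluation criterion described above (the share of patients whose predicted dose falls within 20% of the actual stable dose) can be illustrated with a short Python sketch. The function name and the dose values are hypothetical and chosen only for illustration; they are not data from the study, and the sketch does not reproduce either dosing algorithm.

```python
def fraction_within_tolerance(predicted, actual, tolerance=0.20):
    """Fraction of patients whose predicted dose lies within
    `tolerance` (a proportion of the actual dose) of the actual dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)


# Hypothetical weekly doses in mg (illustrative values only, not study data).
actual_doses = [21.0, 35.0, 49.0, 28.0]
predicted_doses = [24.0, 30.0, 38.0, 45.0]

share = fraction_within_tolerance(predicted_doses, actual_doses)
print(f"{share:.0%} of predicted doses are within 20% of the actual dose")
```

Computing this share separately for each algorithm on the same validation cohort gives the kind of head-to-head comparison reported in the paragraph.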

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Mitsunori Sasa², Yoshimi Bando³, Ikuko Sumitomo², Naoya Hosono⁴, Michiaki Kubo⁴, Yusuke Nakamura¹,⁵ and Hitoshi Zembutsu⁵. ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Department of Surgery, Tokushima Breast Care Clinic; ³Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgical operation for breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to depend on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Michiaki Kubo², Hitoshi Zembutsu³, Yuichi Sugiyama⁴ and Yusuke Nakamura¹,³. ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ⁴Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important factor limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia as compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% confidence interval [CI], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
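As a worked illustration of how an odds ratio, its confidence interval, and the two classification rates relate to a 2×2 table, the Python sketch below uses cell counts back-calculated from the figures in the paragraph above (39 cases and 74 controls in the combined set, of whom 69.2% and 75.7% respectively were correctly classified). These counts are inferred, not stated in the text, and the Woolf (log-scale) interval is an assumption about how the interval was derived; the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(case_pos, case_neg, ctrl_pos, ctrl_neg, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval for a 2x2 table."""
    if min(case_pos, case_neg, ctrl_pos, ctrl_neg) <= 0:
        raise ValueError("all four cells must be positive for this formula")
    odds_ratio = (case_pos * ctrl_neg) / (case_neg * ctrl_pos)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / case_pos + 1 / case_neg + 1 / ctrl_pos + 1 / ctrl_neg
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Inferred counts (assumption): score 1 or 2 = "high", score 0 = "low".
cases_high, cases_low = 27, 12        # 39 patients with grade 3/4 toxicity
controls_high, controls_low = 18, 56  # 74 patients with no toxicity

or_value, ci_lower, ci_upper = odds_ratio_with_ci(
    cases_high, cases_low, controls_high, controls_low
)
sensitivity = cases_high / (cases_high + cases_low)
specificity = controls_low / (controls_high + controls_low)

print(f"OR = {or_value:.2f}, 95% CI = {ci_lower:.2f}-{ci_upper:.2f}")
print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
```

Under these inferred counts the sketch gives an odds ratio of 7.00 with an interval of about 2.95 to 16.59, which agrees with the values quoted in the paragraph.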

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu¹,², Taisei Mushiroda¹, Surakameth Mahasirimongkol⁵, Sasisopin Kiertiburanakul³, Somnuek Sungkanuparph³, Weerawat Manosuthi⁶, Woraphot Tantisiriwat⁷, Angkana Charoenyingwattana⁴, Thanyachai Sura³, Wasun Chantratita² and Yusuke Nakamura¹. ¹Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of ²Pathology and ³Medicine, Faculty of Medicine, and ⁴Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; ⁵Center for International Cooperation, Department of Medical Sciences, and ⁶Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; ⁷Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in the first set of samples, which consisted of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele showed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10^[…]] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10^[…]). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani¹,², Sukanya Wattanapokayakit³, Hidenori Ochi⁴,⁵, Takahisa Kawaguchi⁴, Atsushi Takahashi⁴, Naoya Hosono⁴, Michiaki Kubo⁴, Tatsuhiko Tsunoda⁴, Naoyuki Kamatani⁴, Hiromitsu Kumada⁶, Aekkachai Puseenam⁷, Thanyachai Sura⁷, Yataro Daigo², Kazuaki Chayama⁴,⁵, Wasun Chantratita⁸, Yusuke Nakamura¹,⁴ and Koichi Matsuda¹. ¹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ²Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; ³Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; ⁴Center for Genomic Medicine, RIKEN; ⁵Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; ⁶Department of Hepatology, Toranomon Hospital; ⁷Department of Medicine, Faculty of Medicine, and ⁸Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10⁻³⁹ and 2.31×10⁻³⁸; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda¹, Sukanya Wattanapokayakit², Atsushi Takahashi³, Toshihiro Nukiwa⁴, Shoji Kudoh⁵, Takashi Ogura⁶, Hiroyuki Taniguchi⁷, Michiaki Kubo⁸, Naoyuki Kamatani³, Yusuke Nakamura¹,⁹ and the Pirfenidone Clinical Study Group⁴. ¹Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); ⁴Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; ⁵Fourth Department of Internal Medicine, Nippon Medical School; ⁶Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; ⁷Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; ⁸Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); ⁹Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a P value of 2.9×10⁻⁸ (GWA, 2.8×10⁻⁶; replication, 3.6×10⁻³). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva¹, Taisei Mushiroda², Atsushi Takahashi³, Michiaki Kubo⁴, Sena K. Karachanak⁵, Irina T. Zaharieva⁶, Radoslava V. Vazharova⁵, Ivanka I. Dimova⁵, Vihra K. Milanova⁶, Todor Tolev⁷, George Kirov⁸, Michael J. Owen⁸, Michael C. O'Donovan⁸, Naoyuki Kamatani³, Yusuke Nakamura⁹ and Draga I. Toncheva⁵. ¹Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; ⁶Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; ⁷Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; ⁸Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; ⁹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. A great number of studies have produced controversial and inconclusive results. However, it has been determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia or schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that were previously implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010; odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono, N., Kubo, M., Tsuchiya, Y., Sato, H., Kitamoto, T., Saito, S., Ohnishi, Y., and Nakamura, Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum. Mutation. 29: 182-189, 2008.

2. Onouchi, Y., Gunji, T., Burns, J.C., Shimizu, C., Newburger, J.W., Yashiro, M., Nakamura, Yo., Yanagawa, H., Wakui, K., Fukushima, Y., Kishi, F., Hamamoto, K., Terai, M., Sato, Y., Ouchi, K., Saji, T., Nariai, A., Kaburagi, Y., Yoshikawa, T., Suzuki, K., Tanaka, T., Nagai, T., Cho, H., Fujino, A., Sekine, A., Nakamichi, R., Tsunoda, T., Kawasaki, T., Nakamura, Yu., and Hata, A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat. Genet. 40: 35-42, 2008.

3. Silva, F.P., Hamamoto, R., Kunizaki, M., Tsuge, M., Nakamura, Y., and Furukawa, Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene. 27: 2686-2692, 2008.

4. Obama, K., Satoh, S., Hamamoto, R., Sakai, Y., Nakamura, Y., and Furukawa, Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin. Cancer Res. 14: 1333-1339, 2008.

5. Kato, M., Miya, F., Kanemura, Y., Tanaka, T., Nakamura, Y., and Tsunoda, T. Recombination rates of genes expressed in human tissues. Hum. Mol. Genet. 17: 577-586, 2008.

6 Leung AAC Wong VCL Yang LCChan PL Daigo Y Nakamura Y Qi RZ Miller L Liu E T-K Wang LD J-LS Law Tsao W and Lung ML Frequentdecreased expression of candidate tumorsuppressor gene DEC1 and its anchorage-independent growth properties and impacton global gene expression in esophageal car-cinoma Int J Cancer 122 587-594 2008

7 Shimo A Tanikawa C Nishidate T Mat-suda K Lin M-L Park J-H Ohta THirata K Fukuda M Nakamura Y andKatagiri T Involvement of KIF2CMCAKoverexpression in mammary carcinogenesisCancer Sci 99 62-70 2008

8 Uemura M Tamura K Chung S HonmaS Okuyama A Nakamura Y and Naka-gawa HA novel 5-steroid reductase (SRD5A3 type-3) is overexpressed in hormone-

136

refractory prostate cancer Cancer Sci 99 81-86 2008

9 Kamatani Y Matsuda K Ohishi T Oht-subo S Yamazaki K Iida A Hosono NKubo M Yumura W Nitta K KatagiriT Kawaguchi Y Kamatani N and Naka-mura Y Identification of a significant asso-ciation of an SNP in TNXB with SLE inJapanese population J Hum Genet 53 64-73 2008

10 Fukukawa C Hanaoka H Nagayama STsunoda T Toguchida J Endo K Naka-mura Y and Katagiri T Radioimmunother-apy of human synovial sarcoma using amonoclonal antibody against FZD10 CancerSci 99 432-440 2008

11 Brunet J Pfaff AW Abidi A Unoki MNakamura Y Guinard M Klein J-PCandolfi E and Mousli M Toxoplasmagondii exploits UHRF1 and induces host cellcycle arrest at G2 to enable its proliferationCell Microbiol 10 908-920 2008

12 Kato N Miyata T Tabara Y Katsuya TYanai K Hanada H Kamide K NakuraJ Kohara K Takeuchi F Mano H Yasu-nami M Kimura A Kita Y Ueshima HNakayama T Soma M Hata A FujiokaA Kawano Y Nakao K Sekine AYoshida T Nakamura Y Saruta T Ogi-hara T Sugano S Miki T and TomoikeH High-Density Association Study andNomination of Susceptibility Genes for Hy-pertension in the Japanese National ProjectHum Mol Genet 17 617-627 2008

13 Oishi T Iida A Otsubo S Kamatani YUsami M Takei T Uchida K TsuchiyaK Saito S Ohnishi Y Tokunaga KNitta K Kawaguchi Y Kamatani N Ko-chi Y Shimane K Yamamoto K Naka-mura Y Yumura W and Matsuda KAfunctional SNP in the NKX25-binding siteof ITPR3 promoter is associated with sus-ceptibility to Systemic Lupus Erythematosusin Japanese population J Hum Genet 53151-162 2008

14 Daigo Y and Nakamura Y From cancergenomics to thoracic oncology discovery ofnew biomarkers and therapeutic targets forlung and esophageal carcinoma (ReviewArticle) General Thoracic and Cardiovascu-lar Surgery 56 43-53 2008

15 Kiyotani K Mushiroda T Kubo M Zem-butsu H Sugiyama Y and Nakamura YAssociation of genetic polymorphisms inSLCO1B3 and ABCC2 with docetaxel-induced leukopenia Cancer Sci 99 967-9722008

16 Kiyotani K Mushiroda T Sasa M BandoY Sumitomo I Hosono N Kubo M

Nakamura Y and Zembutsu H Impact ofCYP2D610 on recurrence-free survival inbreast cancer patients receiving adjuvant ta-moxifen therapy Cancer Sci 99 995-9992008

17 Kato T Sato N Takano A MiyamotoM Nishimura H Tsuchiya E Kondo SNakamura Y and Daigo Y Activation ofPlacenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcomein Primary Lung Cancer Patients Clin Can-cer Res 14 2363-2370 2008

18 Tenesa A Farrington SM Prendergast JG Porteous ME Walker M Haq N Bar-netson RA Theodoratou E CetnarskyjR Cartwright N Semple C Clark AJReid FJ Smith LA Kavoussanakis KKoessler T Pharoah PD Buch S Schaf-mayer C Tepel J Schreiber S Voumllzke HSchmidt CO Hampe J Chang-Claude JHoffmeister M Brenner H Wilkening SCanzian F Capella G Moreno V DearyIJ Starr JM Tomlinson IP Kemp ZHowarth K Carvajal-Carmona L WebbE Broderick P Vijayakrishnan J Houl-ston RS Rennert G Ballinger D RozekL Gruber SB Matsuda K Kidokoro TNakamura Y Zanke BW Greenwood CM Rangrej J Kustra R Montpetit AHudson TJ Gallinger S Campbell H andDunlop MG Genome-wide association scanidentifies a colorectal cancer susceptibilitylocus on 11q23 and replicates risk loci at 8q24 and 18q21 Nat Genet 40 631-637 2008

19 Mototani H Iida A Nakajima M Fu-ruichi T Miyamoto Y Tsunoda T SudoA Kotani A Uchida K Ozaki KTanaka Y Nakamura Y Tanaka T No-toya K and Ikegawa SA functional SNP inEDG2 increases susceptibility to knee os-teoarthritis in Japanese Hum Mol Genet17 1790-1797 2008

20 Mizukami Y Kono K Daigo Y TakanoA Tsunoda T Kawaguchi Y NakamuraY and Fujii H Detection of novel Cancer-Testis antigen-specific T-cell responses inTIL regional lymph nodes and PBL in pa-tients with esophageal squamous cell carci-noma Cancer Sci 99 1448-1454 2008

21 Mushiroda T Wattanapokayakit S Taka-hashi A Nukiwa T Kudoh S Ogura TTaniguchi H Pirfenidone Clinical StudyGroup Kubo M Kamatani N and Naka-mura YA genome-wide association studyidentifies an association of a common vari-ant in TERT with susceptibility to idiopathicpulmonary fibrosis J Med Genet 45 654-656 2008

22 Hosokawa M Kashiwaya K Furihara M

137

Eguchi H Ohigashi H Ishikawa O Shi-nomura Y Imai K Nakamura Y andNakagawa H Overexpression of cysteineproteinase inhibitor cystatin 6 promotes pan-creatic cancer growth Cancer Sci 99 1626-1632 2008

23 Study Group of Millennium Genome Projectfor Cancer Sakamoto H Yoshimura KSaeki N Katai H Shimoda T MatsunoY Saito D Sugimura H Tanioka FKato S Matsukura N Matsuda N Naka-mura T Hyodo I Nishina T Yasui WHirose H Hayashi M Toshiro EOhnami S Sekine A Sato Y Totsuka HAndo M Takemura R Takahashi Y Oh-daira M Aoki K Honmyo I Chiku SAoyagi K Sasaki H Ohnami S Yanagi-hara K Yoon KA Kook MC Lee YSPark SR Kim CG Choi IJ Yoshida TNakamura Y and Hirohashi S Geneticvariation in PSCA is associated with suscep-tibility to diffuse-type gastric cancer NatGenet 40 730-740 2008

24 Ueki T Nishidate T Park JH Lin MLShimo A Hirata K Nakamura Y andKatagiri T Involvement of elevated expres-sion of multiple cell-cycle regulator DTLRAMP (denticlelessRA-regulated nuclearmatrix associated protein) in the growth ofbreast cancer cells Oncogene 27 5672-56832008

25 Miyamoto Y Shi D Nakajima M OzakiK Sudo A Kotani A Uchida A TanakaT Fukui N Tsunoda T Takahashi ANakamura Y Jiang Q and Ikegawa SCommon variants in DVWA on chromo-some 3p243 are associated with susceptibil-ity to knee osteoarthritis Nat Genet 40 994-998 2008

26 Unoki H Takahashi A Kawaguchi THara K Horikoshi M Andersen G NgDP Holmkvist J Borch-Johnsen KJorgensen T Sandbaek A Lauritzen THansen T Nurbaya S Tsunoda T KuboM Babazono T Hirose H Hayashi MIwamoto Y Kashiwagi A Kaku KKawamori R Tai ES Pedersen O Ka-matani N Kadowaki T Kikkawa RNakamura Y and Maeda S SNPs inKCNQ1 are associated with susceptibility totype 2 diabetes in East Asian and Europeanpopulations Nat Genet 40 1098-1102 2008

27 Harao M Hirata S Irie A Senju SNakatsura T Komori H Ikuta Y Yok-omine K Imai K Inoue M Harada KMori T Tsunoda T Nakatsuru S DaigoY Nomori H Nakamura Y Baba H andNishimura Y HLA-A2-restricted CTL epi-topes of a novel lung cancer-associated can-

cer testis antigen cell division cycle associ-ated 1 can induce tumor-reactive CTL IntJ Cancer 123 2616-2625 2008

28 Imai K Hirata S Irie A Senju S IkutaY Yokomine K Harao M Inoue MTsunoda T Nakatsuru S Nakagawa HNakamura Y Baba H and Nishimura YIdentification of a novel tumor-associatedantigen cadherin 3P-cadherin as a possibletarget for immunotherapy of pancreatic gas-tric and colorectal cancers Clin Cancer Res14 6487-6495 2008

29 Nikolova DN Zembutsu H Sechanov TVidinov K Kee LS Ivanova R BechevaE Kocova M Toncheva D and Naka-mura Y Identification of molecular targetsfor treatment of thyroid carcinoma OncolRep 20 105-121 2008

30 Nakamura Y Pharmacogenomics and drugtoxicity (Editorial) New Eng J Med 359856-858 2008

31 Arita K Ariyoshi M Tochio H Naka-mura Y and Shirakawa M Hemi-methylated DNA recognition by the SRAprotein Np95 via a base flipping mecha-nism Nature 455 818-821 2008

32 Inoue H Iga M Nabeta H Yokoo TSuehiro Y Okano S Inoue M Kinoh HKatagiri T Takayama K Yonemitsu YHasegawa M Nakamura Y Nakanishi Yand Tani K Non-transmissible SeV encod-ing GM-CSF is a novel and potent vectorsystem to produce autologous tumor vac-cines Cancer Sci 99 2315-2326 2008

33 Konda R Sugimura J Sohma F Katagiri TNakamura Y Fujioka T Over expression ofhypoxia-inducible protein 2 hypoxia-inducible factor-1αand nuclear factor κBis putatively involved in acquired renal cystformation and subsequent tumor transfor-mation in patients with end stage renal fail-ure J Urol 180 481-485 2008

34 Hotta K Nakata Y Matsuo T KamoharaS Kotani K Komatsu R Itoh N MineoI Wada J Masuzaki H Yoneda MNakajima A Miyazaki S Tokunaga KKawamoto M Funahashi T HamaguchiK Yamada K Hanafusa T Oikawa SYoshimatsu H Nakao K Sakata T Mat-suzawa Y Tanaka K Kamatani N andNakamura Y Variations in the FTO gene areassociated with severe obesity in the Japa-nese J Hum Genet 53 546-553 2008

35 Kato M Nakamura Y and Tsunoda T Analgorithm for inferring complex haplotypesin a region of copy-number variation Am JHum Genet 83 157-169 2008

36 Kato M Nakamura Y and Tsunoda TMOCSphaser a haplotype inference tool

138

from a mixture of copy number variationand single nucleotide polymorphism dataBioinformatics 24 1645-1646 2008

37 Yasuda K Miyake K Horikawa Y HaraK Osawa H Furuta H Hirota Y MoriH Jonsson A Sato Y Yamagata K Hi-nokio Y Wang HY Tanahashi T Naka-mura N Oka Y Iwasaki N Iwamoto YYamada Y Seino Y Maegawa H Kashi-wagi A Takeda J Maeda E Shin HDCho YM Park KS Lee HK Ng MCMa RC So WY Chan JC Lyssenko VTuomi T Nilsson P Groop L KamataniN Sekine A Nakamura Y Yamamoto KYoshida T Tokunaga K Itakura M Mak-ino H Nanjo K Kadowaki T and KasugaM Variants in KCNQ1 are associated withsusceptibility to type 2 diabetes mellitusNat Genet 40 1092-1097 2008

38 Yamaguchi-Kabata Y Nakazono K Taka-hashi A Saito S Hosono N Kubo MNakamura Y and Kamatani N Japanesepopulation structure based on SNP geno-types from 7003 individuals compared toother ethnic groups Effects on population-based association studies Am J HumGenet 83 445-456 2008

39 Okada Y Mori M Yamada R Suzuki AKobayashi K Kubo M Nakamura Y andYamamoto K SLC22A4 polymorphism andrheumatoid arthritis susceptibility A replica-tion study in a Japanese population and ametaanalysis J Rheumatol 35 1723-17282008

40 Omori S Tanaka Y Takahashi A HiroseH Kashiwagi A Kaku K Kawamori RNakamura Y and Maeda S Association ofCDKAL1 IGF2BP2 CDKN2AB HHEXSLC30A8 and KCNJ11 with susceptibility oftype 2 diabetes in a Japanese populationDiabetes 57 791-795 2008

41 Misawa K Fujii S Yamazaki T Taka-hashi A Takasaki J Yanagisawa M Oh-nishi Y Nakamura Y and Kamatani NNew correction algorithms for multiple com-parisons in case-control multilocus associa-tion studies based on haplotypes and diplo-type configurations J Hum Genet 53 789-801 2008

42 Chantarangsu S Mushiroda T Mahasiri-mongkol S Kiertiburanakul S Sungkanu-parph S Manosuthi W Tantisiriwat WCharoenyingwattana A Sura T Chan-tratita W and Nakamura Y HLA-B 3505allele is a strong predictor for nevirapine-induced skin adverse drug reactions in ThaiHIV-infected patients Pharmacogenet Genomics 19 139-146 2009

43 Suzuki A Yamada R Kochi Y Sawada

T Okada Y Matsuda K Kamatani YMori M Shimane K Hirabayashi YTakahashi A Tsunoda T Miyatake AKubo M Kamatani N Nakamura Y andYamamoto K Functional SNPs in CD244 in-crease the risk of rheumatoid arthritis in aJapanese population Nat Genet 40 1224-1229 2008

44 Yamazaki K Takahashi A Takazoe MKubo M Onouchi Y Fujino A KamataniN Nakamura Y and Hata A Positive asso-ciation of genetic variants in the upstreamregion of NXT2-3 with Crohnrsquos disease inJapanese patients Gut 58 228-232 2009

45 Nikolova DN Doganov N Dimitrov RAngelov K Kee LS Dimova I TonchevaD Nakamura Y and Zembutsu HGenome-wide gene expression profiles ofovarian carcinoma identification of molecu-lar targets for treatment of ovarian carci-noma Mol Med Rep in press 2008

46 Hotta K Nakamura M Nakata Y Mat-suo T Kamohara S Kotani K KomatsuR Itoh N Mineo I Wada J MasuzakiH Yoneda M Nakajima A Miyazaki STokunaga K Kawamoto M Funahashi THamaguchi K Yamada K Hanafusa TOikawa S Yoshimatsu H Nakao KSakata T Matsuzawa Y Tanaka K Ka-matani N and Nakamura Y INSIG2 geners7566605 polymorphism is associated withsevere obesity in Japanese J Hum Genet53 857-862 2008

47 Iwahori K Osaki T Serada S FujimotoM Suzuki H Kishi Y Yokoyama A Ha-mada H Fujii Y Yamaguchi KHirashima T Matsui K Tachibana INakamura Y Kawase I and Naka TMegakaryocyte potentiating factor as a tu-mor maker of malignant pleural mesothe-lioma Evaluation in comparison with meso-thelin Lung Cancer 62 45-54 2008

48 Hirota T Harada M Sakashita M DoiS Miyatake A Fujita K Enomoto TEbisawa M Yoshihara S Noguchi ESaito H Nakamura Y and Tamari M Ge-netic polymorphism regulating ORM1-like 3(Saccharomyces cerevisiae) expression is as-sociated with childhood atopic asthma in aJapanese population J Allergy Clin Immu-nol 121 769-770 2008

49 Harada M Hirota T Jodo AI Doi SKameda M Fujita K Miyatake A Eno-moto T Noguchi E Yoshihara SEbisawa M Saito H Matsumoto KNakamura Y Ziegler SF and Tamari MFunctional analysis of the Thymic StromalLymphopoietin Variants in Human Bron-chial Epithelial Cells Am J Respir Cell

139

Mol Biol 40 368-374 200950 Sakashita M Yoshimoto T Hirota T Ha-

rada M Okubo K Osawa Y Fujieda SNakamura Y Yasuda K Nakanishi Kand Tamari M Association of serum IL-33level and the IL-33 genetic variant withJapanese cedar pollinosis Clin Exp Allergy38 1875-1881 2008

51 Hirata D Yamabuki T Miki D Ito TTsuchiya E Fujita M Hosokawa MChayama K Nakamura Y and Daigo YInvolvement of epithelial cell transformingsequence-2 oncoantigen in lung and esopha-geal cancer progression Clin Cancer Res15 256-266 2009

52 Dobashi S Katagiri T Hirota E AshidaS Daigo Y Shuin T Fujioka T Miki Tand Nakamura Y Involvement of TMEM22overexpression in the growth of renal cellcarcinoma cells Oncol Rep 21 305-3122009

53 Zembutsu H Suzuki Y Sasaki ATsunoda T Okazaki M Yoshimoto MHasegawa T Hirata K and Nakamura YPredicting response to Docetaxel neoadju-vant chemotherapy for advanced breast can-cers through genome-wide gene expressionprofiling Int J Oncol 34 361-370 2009

54 Nakamura Y DNA variations in humanand medical genetics 25 years of my experi-ence (review) J Hum Genet 54 1-8 2009

55 Ozaki K Sato H Inoue K Tsunoda TSakata Y Mizuno H Lin T-H Mi-yamoto Y Aoki A Onouchi Y Sheu S-H Ikegawa S Odashiro K NobuyoshiM Juo S-H H Hori M Nakamura Yand Tanaka TA functional variation inBRAP confers risk of myocardial infarctionin Asian populations Nat Genet in press2009

56 Kashiwaya K Hosokawa M Eguchi HOhigashi H Ishikawa O Shinomura YNakamura Y and Nakagawa H Identifica-tion of C2orf18 Termed ANT2BP (ANT2-binding protein) as one of key molecules in-volved in pancreatic carcinogenesis CancerSci 100 457-464 2009

57 Nagayama S Yamada E Kohno YAoyama T Fukukawa C Kubo HWatanabe G Katagiri T Nakamura YSakai Y and Toguchida J Inverse correla-tion of the upregulation of FZD10 expres-sion and the activation of β-catenin in syn-chronous colorectal tumors Cancer Sci inpress 2009

58 Ueda K Fukase Y Katagiri T IshikawaN Irie S Sato T Ito H Nakayama HMiyagi Y Tsuchiya E Kohno N ShiwaM Nakamura Y and Daigo Y Targeted

glycoproteomics for the discovery of lungcancer-associated glycosylation disorders us-ing lectin-coupled ProteinChip arrays Pro-teomocs in press 2009

59 The International Warfarin Pharmacogenet-ics Consortium Improved warfarin dosingwith a global pharmacogenetic algorithm NEngl J Med 360 753-764 2009

60 Betcheva ET Mushiroda T Takahashi AKubo M Karachanak SK Zaharieva ITVazharova RV Dimova II Milanova VK Tolev T Kirov G Owenm MJOrsquoDonovanm MC Kamatanim N Naka-mura Y and Toncheva DI Case-control as-sociation study of 59 candidate genes re-veals the DRD2 SNP rs6277 (C957T) as theonly susceptibility factor for schizophreniain Bulgarian population J Hum Genet 5498-107 2009

61 Fukukawa C Nagayama S Tsunoda TToguchida J Nakamura Y and Katagiri TActivation of non-canonical Dvl-Rac1-JNKpathway by Frizzled-homologue 10 (FZD10)in human synovial sarcoma Oncogene inpress 2009

62 Yosifova A Mushiroda T Stoianov DVazharova R Dimova I Karachanak SZaharieva I Milanova V Madjirova NGerdjikov I Tolev T Velkova S KirovG Owen MJ OrsquoDonovan MC TonchevaD and Nakamura Y Case-control associa-tion study of 65 candidate genes revealed apossible association of a SNP of HTR5A tobe a factor susceptible to bipolar disease inBulgarian population J Affective Disordersin press 2009

63 Kamatani Y Wattanapokayakit S OchiH Kawaguchi T Takahashi A HosonoN Kubo M Tsunoda T Kamatani NKumada H Puseenam A Sura T DaigoY Chayama K Chantratita W Naka-mura Y and Matsuda K Identification ofassociation of genetic variations in HLA-DPlocus with chronic hepatitis B in Asianpopulation through genome-wide associa-tion study Nat Genet in press 2009

64 Tamura K Furihata M Chung S Ue-mura M Yoshioka H Iiyama T AshidaS Nasu Y Fujioka T Shuin T Naka-mura Y and Nakagawa H Stanniocalcin 2( STC 2 ) over-expression in castration-resistant prostate cancer and aggressiveprostate cancer Cancer Sci in press 2009

65 Tsukada H Ochi H Maekawa T AbeH Fujimoto Y Tsuge M Takahashi HKumada H Kamatani N Nakamura Yand Chayama K Hiroshima Liver StudyGroup Toranomon Hospital A Polymor-phism in MAPKAPK3 affects response to in-

140

terferon therapy for chronic hepatitis C Gas-troenterology in press 2009

66 Dunleavy EM Roche D Tagami H La-coste N Ray-Gallet D Nakamura YDaigo Y Nakatani Y and Almouzni-

Pettinotti G HJURP a key CENP-A-partnerfor maintenance and deposition of CENP-Aat centromeres at late telophaseG1 Cell inpress 2009

141

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose solutions to the individual problems, and we also attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data from structured populations

Because genetic epidemiology studies of complex genetic traits target relatively weak factors, their sample sizes must run to thousands or more, which makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false-positive results. One method to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
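As background, the standard genomic control correction for 1-df chi-square statistics can be sketched as follows. This is only the conventional method, not the modification for Fisher's exact test described above; the function names and the example statistics are hypothetical.

```python
import math
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_control(chi2_stats):
    """Return (lambda_gc, corrected statistics) for 1-df chi-square tests."""
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # common convention: never inflate the statistics
    return lam, [x / lam for x in chi2_stats]

def chi2_1df_pvalue(x):
    """Upper-tail p-value of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical, inflated statistics from a structured sample.
observed = [0.2, 0.5, 0.9, 1.4, 2.1, 3.0, 12.0]
lam, corrected = genomic_control(observed)
print(lam, [chi2_1df_pvalue(x) for x in corrected])
```

The inflation factor is estimated from the median because the median is robust to the small number of truly associated markers in the upper tail.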

b. Characterization of the exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

c. Geometric evaluation of SNP contingency table tests

The 2×3 SNP contingency table tests are described in the context of geometry. We characterize various tests for 2×3 tables and define tests that fit biological models by interpreting the tables geometrically.
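The gap between the asymptotic and exact 2×3 tests can be seen on a small table. The sketch below assumes the Pearson chi-square test (2 degrees of freedom) as the asymptotic test and the Fisher-Freeman-Halton test, computed by enumerating all tables with the observed margins, as the exact test. These choices, the function names, and the genotype counts are illustrative assumptions and may differ from the tests characterized in this work.

```python
import math

def pearson_chi2_2x3(table):
    """Pearson chi-square statistic and asymptotic p-value (2 df) for a 2x3 table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = rows[i] * cols[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)

def exact_2x3(table):
    """Exact p-value: total probability of tables no more likely than the observed one."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    denom = math.comb(sum(rows), rows[0])

    def prob(a, b, c):
        # Multivariate hypergeometric probability of first row (a, b, c).
        return (math.comb(cols[0], a) * math.comb(cols[1], b)
                * math.comb(cols[2], c) / denom)

    p_obs = prob(*table[0])
    p_value = 0.0
    for a in range(min(cols[0], rows[0]) + 1):
        for b in range(min(cols[1], rows[0] - a) + 1):
            c = rows[0] - a - b
            if c <= cols[2]:
                p = prob(a, b, c)
                if p <= p_obs * (1 + 1e-9):  # tolerance for floating-point ties
                    p_value += p
    return p_value

# Hypothetical genotype counts: rows are cases/controls, columns are AA/Aa/aa.
table = [[3, 0, 0], [0, 3, 3]]
print(pearson_chi2_2x3(table), exact_2x3(table))
```

With sparse tables such as this one the two p-values differ noticeably, which is the kind of discrepancy at issue when sample sizes or minor allele frequencies are small.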

2. The development of new methods to interpret genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements of reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population together with LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is used for quantifying genetic heterogeneity and in genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, so that it can serve various studies based on genetic heterogeneity.
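For reference, the conventional pairwise LD measures mentioned above (D, D' and r²) can be computed from haplotype frequencies as in the sketch below. It covers only the standard two-marker case, not the multi-marker method under development; the function names, the allele coding ("A"/"a" and "B"/"b"), and the haplotype counts are hypothetical.

```python
def frequencies(haplotypes):
    """Frequencies of haplotype 'AB' and of alleles 'A' and 'B' in two-marker haplotypes."""
    n = len(haplotypes)
    p_ab = sum(h == "AB" for h in haplotypes) / n
    p_a = sum(h[0] == "A" for h in haplotypes) / n
    p_b = sum(h[1] == "B" for h in haplotypes) / n
    return p_ab, p_a, p_b

def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r^2); assumes both markers are polymorphic."""
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2

# Hypothetical sample of 100 phased haplotypes.
sample = ["AB"] * 40 + ["Ab"] * 10 + ["aB"] * 10 + ["ab"] * 40
print(pairwise_ld(*frequencies(sample)))
```

Because each of these measures summarizes a single pair of markers, describing dependence among three or more markers requires either many pairwise values or a different kind of measure.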

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We treat haplotype data from the standpoint of combinatorial theory and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. There are far more genetic epidemiology study groups than genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web site URL: http://func-gen.hgc.jp/


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 2.53 in "gonad" to 19.53 in "endostyle", and during development they increased from 1.68 at the "egg" stage to 7.55 at the "juvenile" stage. We hypothesize that this enrichment of trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To investigate this phenomenon further, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008 it was extended to include gene expression reporter constructs as well as a new genome browser providing all whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here we focus not only on the presence of TFBSs but also on their orientation and on their positioning, both relative to the transcription start site and relative to each other in pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues on the basis of the structure of their regulatory regions. We are developing our model further, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe², and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.

5. Computational verification of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³, and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from telling us whether a given regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The predicted regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data from a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make up the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith4 and Kenta Nakai: 4CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then compared the generated matrix, with an added pseudocount value, to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
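
The pseudocount scheme described above can be written out as follows. This is a sketch under stated assumptions rather than the authors' code: the function name `log_odds_pwm`, the uniform default background and the base-2 logarithm are choices made for this example, and the default pseudocount follows the value suggested above. For each position, the pseudocount is split among the four bases in proportion to the background composition before the log ratio is taken.

```python
import math


def log_odds_pwm(sites, pseudocount=0.8, background=None):
    """Build a log-odds PWM from aligned binding sites of equal length.

    sites       -- list of equal-length strings over A, C, G, T
    pseudocount -- total pseudocount allocated to each position
    background  -- dict of background base frequencies (uniform if None)
    """
    if background is None:
        background = {base: 0.25 for base in "ACGT"}
    n_sites = len(sites)
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        column = {}
        for base in "ACGT":
            count = sum(1 for site in sites if site[pos] == base)
            # Each base receives its background share of the pseudocount.
            freq = (count + pseudocount * background[base]) / (n_sites + pseudocount)
            column[base] = math.log2(freq / background[base])
        pwm.append(column)
    return pwm
```

With a pseudocount of zero, any base that never occurs at a position would produce a logarithm of zero and fail; a positive pseudocount keeps every element finite, which is the practical reason the value matters most for matrices built from few sites.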

7. Definition and analysis of alternative promoters using a huge number of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a large number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters using these data. We obtained 75,918 promoters, which could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
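
The ranking of promoters by tag count, and a comparison of that ranking between two samples, can be sketched as below. This is an illustration only: the report does not spell out how the discrepancy percentage was computed, so the definition used here (the share of multi-promoter genes whose ordered pair of 1st and 2nd promoters differs between two samples) is an assumption, as are the function names and the nested-dictionary input format.

```python
def ranked_promoters(tag_counts):
    """Order the promoters of one gene by descending tag count.

    tag_counts -- dict mapping promoter id to number of tags.
    Ties are broken by promoter id so that the order is deterministic.
    """
    return sorted(tag_counts, key=lambda p: (-tag_counts[p], p))


def discrepancy(sample_a, sample_b):
    """Fraction of shared multi-promoter genes whose (1st, 2nd) promoter
    pair differs between two samples.

    sample_a, sample_b -- dicts mapping gene id to {promoter id: tag count}.
    """
    shared = [
        gene for gene in sample_a
        if gene in sample_b
        and len(sample_a[gene]) > 1
        and len(sample_b[gene]) > 1
    ]
    if not shared:
        return 0.0
    differing = sum(
        1 for gene in shared
        if ranked_promoters(sample_a[gene])[:2] != ranked_promoters(sample_b[gene])[:2]
    )
    return differing / len(shared)
```

Averaging `discrepancy` over pairs of cell types, and separately over pairs of conditions within one cell type, would give two figures comparable in spirit to those quoted above.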

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common among 16 eukaryotes, but two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
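
A periodicity of a dinucleotide along a sequence can be measured with a single Fourier component, as in the sketch below. It is meant only to illustrate the kind of analysis mentioned above, not to reproduce it: the function name, the choice of a binary indicator signal and the normalization by the number of positions are assumptions for this example.

```python
import cmath


def dinucleotide_power(seq, dinucleotide, period):
    """Spectral power of a dinucleotide's occurrence pattern at a given period.

    The sequence is turned into a 0/1 signal (1 where the dinucleotide
    starts) and projected onto a complex exponential with the given
    period in base pairs.
    """
    n_positions = len(seq) - 1
    total = 0j
    for k in range(n_positions):
        if seq[k:k + 2] == dinucleotide:
            total += cmath.exp(-2j * cmath.pi * k / period)
    return abs(total) ** 2 / n_positions
```

Scanning `period` over a range of values (for example 2 to 200 bp) and plotting the result gives a periodogram in which a repeated spacing shows up as a peak; masking a class of repeats and recomputing shows how much of the peak they account for.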

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by reducing the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps, and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells are. The organism-specific pathway maps are then detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda5 and K. Kinoshita: 5National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue to predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices elucidated different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi6, M. Saeki6, H. Ohta6, K. Kinoshita: 6Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura7 and K. Kinoshita: 7Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect on amino acid composition of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach, and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura8, Kengo Kinoshita, Nobutoshi Ito8: 8Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we showed experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that catalyzes ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi6, M. Shibaoka6, M. Saeki6, H. Ohta6, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19 777 and 21 036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. On the other hand, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics, 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res., 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet., 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics, 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res., 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel., 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics, 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod., 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res., 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics, 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res., 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res., 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res., 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol., 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res., 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene, 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci., 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem., 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res., 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res., 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res., 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉, 木下賢吾, 伊藤暢聡. ペプチジルプロリルイソメラーゼの構造機能相関. 蛋白質核酸酵素, 54: 167-172, 2009. (In Japanese. Ikura, T., Kinoshita, K., and Ito, N. Structure-function relationship of peptidyl-prolyl isomerases.)

木下賢吾. 立体構造からのタンパク質機能予測: 現状と展望. 遺伝子医学MOOK, 14号, in press. (In Japanese. Kinoshita, K. Protein function prediction from three-dimensional structures: current status and prospects.)

中井謙太, ポール ホートン. 第3章 3. アミノ酸配列に基づくタンパク質の細胞内局在予測. 実験医学増刊, vol. 26: 1106-1112, 2008. (In Japanese. Nakai, K. and Horton, P. Chapter 3, Section 3: Prediction of protein subcellular localization based on amino acid sequences.)

中井謙太. タンパク質のシステム生物学. 猪飼, 伏見, 卜部, 上野川, 中村, 浜窪 編, タンパク質の事典, 朝倉書店, 575-578, 2008. (In Japanese. Nakai, K. Systems biology of proteins. In: Encyclopedia of Proteins.)


Human Genome Center

Department of Public Policy (公共政策研究分野)

Associate Professor Kaori Muto, Ph.D. (Health Sciences)
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, Master of Laws

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted comparative political studies on stem cell research and on homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints, and completed the first questionnaire survey. We held the SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine the broader social and cultural agendas around the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulation between South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and unethical human egg collection, the bioethics law was revised and the government is seeking stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing, in terms of regulations, within East Asia.

2. Establishment of Office of Research Ethics (ORE)

Following the Dean's courageous decision, the IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we examined the current problems that tend to stall a smooth research review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project, led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain to, encourage and thank participants each time, and participants recall their will to contribute.

To acknowledge participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to those with good eyesight, we are making efforts to secure personalized access to information that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in the IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

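1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics, 146A (13): 696-706, 2008.

2. 洪賢秀. 韓国社会における子どもの「性保護」と性犯罪防止対策. 比較法研究, 70号, 2009, 印刷中. (In Japanese: "Sexual protection" of children and measures to prevent sex crimes in Korean society. In press.)

3. 神里彩子, 成澤光 編著. 生殖補助医療 生命倫理と法―基本資料集3. 信山社, 21-123, 262-308, 2008. (In Japanese: Assisted reproductive medicine. Bioethics and law: basic source materials 3.)

4. 張瓊方. 諸外国における生殖補助医療の規制状況と実施状況(台湾). 生殖補助医療 生命倫理と法―基本資料集3, 神里彩子, 成澤光 編, 信山社, 323-334, 2008. (In Japanese: Regulation and practice of assisted reproductive medicine in other countries (Taiwan).)

5. 大上泰弘, 神里彩子, 城山英明. イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆. 社会技術研究論文集, 5号, 132-142, 2008. (In Japanese: Comparative analysis of animal experiment regulation in the UK and the US: implications for the Japanese regulatory system.)

6. 大上泰弘, 成廣孝, 神里彩子, 城山英明, 打越綾子. 日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析. ヒトと動物の関係学会誌, 20号, 66-73, 2008. (In Japanese: Attitudes of life science researchers and engineers in Japan toward animal experiments: analysis of a questionnaire survey on life science experiments and memorial services for animals.)

7. 大上泰弘, 神里彩子, 城山英明. イギリスにおける動物の実験規制を支えている思考様式. 科学技術社会論研究, 5号, 84-92, 2008. (In Japanese: The mode of thinking that underlies animal experiment regulation in the UK.)

8. 渡部麻衣子, 上田昌文. 人の必要を充足する科学技術 福祉工学における開発現場の分析. 科学技術社会研究, 138-151, 2008. (In Japanese: Science and technology that meets human needs: analysis of development sites in welfare engineering.)

9. 武藤香織. 「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―. 日米の医療―制度と倫理, 杉田米行 編, 大阪大学出版会, 203-224, 2008. (In Japanese: Responses in Japan and the US to "demedicalized" predictive genetic testing: from genetic diseases to nutrigenetic testing.)

10. 武藤香織. DNA親子鑑定は「ふしだらな」女性にとっての救済策か. ジェンダー研究のフロンティア 第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 238-264, 2008. (In Japanese: Is DNA parentage testing a remedy for "loose" women?)

11. 洪賢秀. 研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―. ジェンダー研究のフロンティア 第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 196-214, 2008. (In Japanese: What is the problem with egg donation for research? Focusing on the Hwang Woo-suk paper fabrication case in Korea.)

12. 張瓊方. 生殖技術と台湾社会. ジェンダー研究のフロンティア 第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 215-222, 2008. (In Japanese: Reproductive technology and Taiwanese society.)

13. 三村恭子, 小門穂, 武藤香織, 張瓊方, 洪賢秀, 柘植あづみ. 女性にやさしい機械のつくられ方―内診台を例にして. ジェンダー研究のフロンティア 第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 223-240, 2008. (In Japanese: How women-friendly machines are made: the example of the gynecological examination table.)

14. 神里彩子. 生殖補助医療をめぐる議論―その回顧と展望―. 家永登 編『生殖技術と家族』, 早稲田大学出版部, 42-71, 2008. (In Japanese: The debate over assisted reproductive medicine: retrospect and prospects. In: Reproductive Technology and the Family.)

15. 渡部麻衣子, 上田昌文 編訳. エンハンスメント論争 身体精神の増強と先端科学技術. 社会評論社, 2008. (In Japanese, edited translation: The enhancement debate: augmentation of body and mind and advanced science and technology.)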


S., Miyano, S. Estimation of nonlinear gene regulatory networks via L1 regularized NVAR from time series gene expression data. Genome Informatics, 20: 37-51, 2008.

13. Kojima, K., Nagasaki, M., Miyano, S. Fast grid layout algorithm for biological networks with sweep calculation. Bioinformatics, 24 (12): 1426-1432, 2008.

14. Mito, N., Ikegami, Y., Matsuno, H., Miyano, S., Inouye, S. Simulation analysis for the effect of light-dark cycle on the entrainment in circadian rhythm. Genome Informatics, 21: 212-223, 2008.

15. Nagasaki, M., Saito, A., Chen, L., Jeong, E., Miyano, S. Systematic reconstruction of TRANSPATH data into Cell System Markup Language. BMC Systems Biology, 2: 53, 2008.

16. Niida, A., Smith, A.D., Imoto, S., Tsutsumi, S., Aburatani, H., Zhang, M.Q., Akiyama, T. Integrative bioinformatics analysis of transcriptional regulatory programs in breast cancer cells. BMC Bioinformatics, 9: 404, 2008.

17. Numata, K., Yoshida, R., Nagasaki, M., Saito, S., Imoto, S., Miyano, S. ExonMiner: Web service for analysis of GeneChip exon array data. BMC Bioinformatics, 9: 494, 2008.

18. Numata, K., Imoto, S., Miyano, S. Partial order-based Bayesian network learning algorithm for estimating gene networks. Proc. IEEE 8th International Symposium on Bioinformatics & Bioengineering, IEEE Computer Society, 357-360, 2008. (BIBM 2008, Refereed conference)

19. Perrier, E., Imoto, S., Miyano, S. Finding optimal Bayesian network given a super-structure. J. Machine Learning Research, 9: 2251-2286, 2008.

20. Yamaguchi, R., Imoto, S., Yamauchi, M., Nagasaki, M., Yoshida, R., Shimamura, T., Hatanaka, Y., Ueno, K., Higuchi, T., Gotoh, N., Miyano, S. Predicting differences in gene regulatory systems by state space models. Genome Informatics, 21: 101-113, 2008.

21. Yoshida, R., Nagasaki, M., Yamaguchi, R., Imoto, S., Miyano, S., Higuchi, T. Bayesian learning of biological pathways on genomic data assimilation. Bioinformatics, 24 (22): 2592-2601, 2008.

Human Genome Center

Laboratory of Molecular Medicine (ゲノムシークエンス解析分野)
Laboratory of Genome Technology (シークエンス技術開発分野)

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and those causing or predisposing to various diseases, as well as those related to drug efficacy and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes, and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P = 0.0053) and poorer prognosis of non-small cell lung cancer patients (P = 0.0045). It was also shown to be an independent prognostic factor (P = 0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data imply that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) using a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with non-small cell lung cancer (NSCLC) (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. ECT2, a cancer-testis antigen, is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analyzed gene expression profiles of breast cancers. We here report characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and in all breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells: nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein became highest at G(1)/S phases, whereas its phosphorylation level was enhanced during mitotic phase. Treatment of the breast cancer cell lines T47D and HBC4 with small-interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrated the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in growth of RCC and that inhibition of TMEM22/RAB37 expression or their interaction could be a novel therapeutic approach for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)


We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex, as well as activation of the Rac1-JNK cascade, in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused destruction of the actin cytoskeleton structure, probably through downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation causes activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of overexpressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed overexpression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Overexpression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been suggested to play some roles in apoptosis. These findings imply that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage, some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it could be a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 had been reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were very commonly overexpressed in thyroid carcinoma but not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression significantly attenuated viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac (α2,6) Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated a higher frequency of serum apoC-III with loss of α2,6-linkage Neu5Ac residues in lung cancer patients compared to controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylations, and imply that it may be applicable to studies on other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
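The accuracy measure described above, the share of patients whose predicted dose falls within 20% of the actual stable therapeutic dose, can be sketched as follows. This is a minimal illustration and not the consortium's code; the function name and the dose values are hypothetical.

```python
def fraction_within_tolerance(predicted, actual, tolerance=0.20):
    """Return the fraction of patients whose predicted weekly dose lies
    within +/- tolerance (as a proportion) of the actual stable dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(
        abs(p - a) <= tolerance * a
        for p, a in zip(predicted, actual)
    )
    return hits / len(actual)


# Hypothetical weekly doses in mg (illustrative values only, not study data).
actual_doses = [20.0, 35.0, 50.0, 28.0]
predicted_doses = [23.0, 36.0, 38.0, 35.0]

share = fraction_within_tolerance(predicted_doses, actual_doses)
print(f"{share:.0%} of predictions fall within 20% of the actual dose")
```

Computing this share separately for the clinical and the pharmacogenetic predictions on the same validation cohort gives the kind of side-by-side comparison reported in the text.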

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Mitsunori Sasa², Yoshimi Bando³, Ikuko Sumitomo², Naoya Hosono⁴, Michiaki Kubo⁴, Yusuke Nakamura¹,⁵ and Hitoshi Zembutsu⁵. ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Department of Surgery, Tokushima Breast Care Clinic; ³Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgical operation on breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to be dependent on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Michiaki Kubo², Hitoshi Zembutsu³, Yuichi Sugiyama⁴ and Yusuke Nakamura¹,³. ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ⁴Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant for limiting the use of the drug. To identify a genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia as compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
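How an odds ratio and its confidence interval follow from a 2x2 table can be sketched as below. The four counts are an assumption: they are back-calculated from the group sizes and percentages given in the text (39 cases and 74 controls in total, with 69.2% and 75.7% correctly classified) and are not stated in the report. The Woolf (log-based) interval is likewise a standard choice used here for illustration; the report does not say which method the authors used, and the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: cases with the risk factor      b: cases without it
    c: controls with the risk factor   d: controls without it
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Assumed counts, back-calculated from the text (not given in the report):
# 39 cases (28 + 11), of which 69.2% had a score of 1 or 2 -> 27;
# 74 controls (56 + 18), of which 75.7% had a score of 0   -> 56.
cases_score_1_or_2, cases_score_0 = 27, 12
controls_score_1_or_2, controls_score_0 = 18, 56

or_, lo, hi = odds_ratio_with_ci(
    cases_score_1_or_2, cases_score_0, controls_score_1_or_2, controls_score_0
)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

With these assumed counts the sketch gives an odds ratio of 7.00 and an interval of about 2.95 to 16.59, in line with the reported figures.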

(4) HLA genotype and nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu¹,², Taisei Mushiroda¹, Surakameth Mahasirimongkol⁵, Sasisopin Kiertiburanakul³, Somnuek Sungkanuparph³, Weerawat Manosuthi⁶, Woraphot Tantisiriwat⁷, Angkana Charoenyingwattana⁴, Thanyachai Sura³, Wasun Chantratita² and Yusuke Nakamura¹. ¹Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of ²Pathology and ³Medicine, Faculty of Medicine, and ⁴Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; ⁵Center for International Cooperation, Department of Medical Sciences, and ⁶Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; ⁷Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1 and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele showed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96, 95% confidence interval (CI)=4.87-73.44, Pc=4.6×10^(…)] and 0.7% of the general Thai population (OR=29.87, 95% CI=5.04-175.86, Pc=2.6×10^(…)). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41, P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
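The quoted specificity follows directly from the carrier frequency among NVP-tolerant patients. A minimal sketch using only the percentages reported above; treating allele carriage as a binary "test" for rash is an assumption made for illustration, and the variable names are hypothetical.

```python
# Reported carrier frequencies of HLA-B*3505, written as proportions.
carriers_among_rash_cases = 0.175  # patients with NVP-induced skin rash
carriers_among_tolerant = 0.011    # NVP-tolerant patients

# Treating allele carriage as a "positive test" for rash (illustrative assumption):
sensitivity = carriers_among_rash_cases      # share of rash cases that carry the allele
specificity = 1.0 - carriers_among_tolerant  # share of tolerant patients that do not

print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
```

The high specificity next to a low sensitivity is consistent with the statement that the marker could help avoid only a subset of NVP-induced rash.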

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani¹,², Sukanya Wattanapokayakit³, Hidenori Ochi⁴,⁵, Takahisa Kawaguchi⁴, Atsushi Takahashi⁴, Naoya Hosono⁴, Michiaki Kubo⁴, Tatsuhiko Tsunoda⁴, Naoyuki Kamatani⁴, Hiromitsu Kumada⁶, Aekkachai Puseenam⁷, Thanyachai Sura⁷, Yataro Daigo², Kazuaki Chayama⁴,⁵, Wasun Chantratita⁸, Yusuke Nakamura¹,⁴ and Koichi Matsuda¹. ¹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ²Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; ³Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; ⁴Center for Genomic Medicine, RIKEN; ⁵Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; ⁶Department of Hepatology, Toranomon Hospital; ⁷Department of Medicine, Faculty of Medicine, and ⁸Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda¹, Sukanya Wattanapokayakit², Atsushi Takahashi³, Toshihiro Nukiwa⁴, Shoji Kudoh⁵, Takashi Ogura⁶, Hiroyuki Taniguchi⁷, Michiaki Kubo⁸, Naoyuki Kamatani³, Yusuke Nakamura¹,⁹ and the Pirfenidone Clinical Study Group⁴. ¹Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); ⁴Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; ⁵Fourth Department of Internal Medicine, Nippon Medical School; ⁶Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; ⁷Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; ⁸Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); ⁹Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify a gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a P value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva¹, Taisei Mushiroda², Atsushi Takahashi³, Michiaki Kubo⁴, Sena K. Karachanak⁵, Irina T. Zaharieva⁶, Radoslava V. Vazharova⁵, Ivanka I. Dimova⁵, Vihra K. Milanova⁶, Todor Tolev⁷, George Kirov⁸, Michael J. Owen⁸, Michael C. O'Donovan⁸, Naoyuki Kamatani³, Yusuke Nakamura⁹ and Draga I. Toncheva⁵. ¹Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; ⁶Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; ⁷Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; ⁸Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; ⁹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. A great number of studies have produced controversial and inconclusive results. However, it has been determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein, we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that had previously been implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010, odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono, N., Kubo, M., Tsuchiya, Y., Sato, H., Kitamoto, T., Saito, S., Ohnishi, Y. and Nakamura, Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum. Mutation, 29: 182-189, 2008.

2. Onouchi, Y., Gunji, T., Burns, J.C., Shimizu, C., Newburger, J.W., Yashiro, M., Nakamura, Yo., Yanagawa, H., Wakui, K., Fukushima, Y., Kishi, F., Hamamoto, K., Terai, M., Sato, Y., Ouchi, K., Saji, T., Nariai, A., Kaburagi, Y., Yoshikawa, T., Suzuki, K., Tanaka, T., Nagai, T., Cho, H., Fujino, A., Sekine, A., Nakamichi, R., Tsunoda, T., Kawasaki, T., Nakamura, Yu. and Hata, A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat. Genet., 40: 35-42, 2008.

3. Silva, F.P., Hamamoto, R., Kunizaki, M., Tsuge, M., Nakamura, Y. and Furukawa, Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene, 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu E T-K, Wang LD, J-LS Law, Tsao W and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8 and KCNJ11 with susceptibility of type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y and Nakagawa H. Identification of C2orf18, termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y and Chayama K; Hiroshima Liver Study Group; Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret the heterogeneity and to map the disease-associated loci, and collaborate with research groups for data-mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it, and then attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies on complex genetic traits target relatively weak factors, so their sample sizes must be in the thousands or more, which in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One method to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
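The standard genomic control correction for chi-square statistics can be sketched as follows. This is a minimal illustration of the conventional method only, not the modified version for Fisher's exact test described above, whose details are not given here. The function name and the input values are hypothetical; flooring the inflation factor at 1 is a common convention assumed for this sketch.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for 1-df chi-square statistics."""
    # Inflation factor: observed median relative to the expected null median.
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # Assumed convention: never inflate the statistics when lambda < 1.
    lam = max(lam, 1.0)
    return lam, [x / lam for x in chi2_stats]


# Hypothetical statistics from presumed-null markers (illustrative values only).
stats = [0.2, 0.5, 0.9, 1.4, 3.0]
lam, corrected = genomic_control(stats)
```

After correction, the median of the statistics matches the median expected under the null distribution, which is the defining property of the method.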

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリーマークラスロップ
准教授 医学博士 山田 亮

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests fit for biological models by interpreting tables in the context of geometry.
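As a point of reference for the asymptotic 2×3 test mentioned above, the Pearson chi-square statistic for a case-control genotype table can be computed as in the sketch below. The function name and the genotype counts are hypothetical, and the code assumes every row and column total is non-zero. With 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def chi2_2x3(table):
    """Pearson chi-square test for a 2x3 table.

    Rows are case/control, columns are the three genotypes.
    Returns (statistic, asymptotic p-value with 2 degrees of freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Chi-square survival function for 2 degrees of freedom.
    return stat, math.exp(-stat / 2.0)


# Hypothetical genotype counts (AA, Aa, aa) for cases and controls.
stat, p = chi2_2x3([[30, 50, 20], [20, 50, 30]])
```

The asymptotic p-value relies on large expected counts; when some genotype cells are sparse it can depart from the exact test, which is the discrepancy investigated in b.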

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population and LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pair-wise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, for use in various research based on genetic heterogeneity.
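For context, the conventional pair-wise LD measures referred to above (D, D' and r²) can be computed from haplotype and allele frequencies as in this sketch. It illustrates only the standard two-marker case, not the multi-marker method under development, and the function name and frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of alleles A and B (strictly between 0 and 1).
    D' keeps the sign of D in this sketch.
    """
    d = p_ab - p_a * p_b
    # The maximum attainable |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies (illustrative values only).
d, d_prime, r2 = pairwise_ld(0.4, 0.5, 0.5)
```

These measures summarize dependence between exactly two markers, which is why a different quantity is needed to describe dependence among larger sets of markers.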

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides the development of new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups in and out of the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and The Centre National de Génotypage, Evry, France, for the interpretation of genetic epidemiology data with the conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.

Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing all whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井 謙太
准教授 理学博士 木下 賢吾

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and positioning with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it can distinguish muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2 and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method and massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental result with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
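The two quantities usually used to describe G+C- and CpG-richness of a sequence window are the G+C content and the observed/expected CpG ratio. The sketch below computes both for a single window; it is an illustration of these conventional measures only, since the definition of ascidian CpG islands adopted in the study is not given here. The function name and the toy sequence are hypothetical.

```python
def gc_and_cpg(seq):
    """Return (G+C content, CpG observed/expected ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # number of CpG dinucleotides
    gc_content = (c + g) / n
    # Observed/expected ratio: (CpG count x length) / (C count x G count).
    cpg_oe = cpg * n / (c * g) if c and g else 0.0
    return gc_content, cpg_oe


# Toy window (illustrative only); real analyses slide windows around each TSS.
gc_content, cpg_oe = gc_and_cpg("CGCGATAT")
```

Evaluating these values in windows at increasing distance from a TSS is one way to see how narrow the CpG-rich range around a promoter is.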

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3 and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiment itself prevent the blueprint from giving further information on whether regulation is direct or indirect. In this study, we are computationally detecting direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we utilized position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that lay out the body plan but also the peripheral components that actually make the building blocks of the body.
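The scanning step described above, scoring every window of a promoter against a weight matrix and keeping windows at or above a threshold, can be sketched as follows. The matrix, sequence and threshold are hypothetical, only the forward strand is scanned, and the ORI-based choice of threshold is not implemented because its definition is not given here.

```python
def scan_promoter(seq, pwm, threshold):
    """Return (position, score) for every window scoring at or above threshold.

    pwm: list of dicts, one per motif position, mapping base -> log-odds score.
    """
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical 3-bp matrix favouring "TGA" (illustrative scores only).
toy_pwm = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.5},
    {"A": -1.0, "C": -1.0, "G": 1.5, "T": -1.0},
    {"A": 1.5, "C": -1.0, "G": -1.0, "T": -1.0},
]
hits = scan_promoter("ccTGAcTGT", toy_pwm, threshold=4.0)
```

Lowering the threshold admits weaker matches such as the partial site at the end of the toy sequence, which is why the choice of threshold determines how many putative target genes are reported.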

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated for each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
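As a minimal sketch of how a pseudocount enters a PWM, the function below distributes one pseudocount per position according to the background composition before taking the log ratio. The function name, the base-2 logarithm, and the uniform default background are assumptions chosen for illustration; only the default value of 0.8 comes from the text.

```python
import math


def log_odds_pwm(counts, background=(0.25, 0.25, 0.25, 0.25), pseudocount=0.8):
    """Convert per-position base counts (A, C, G, T) into log2-odds scores.

    The pseudocount is split across the four bases in proportion to the
    background frequencies, so a base never observed at a position still
    receives a finite score.
    """
    pwm = []
    for column in counts:
        n_sites = sum(column)
        row = []
        for count, bg in zip(column, background):
            freq = (count + pseudocount * bg) / (n_sites + pseudocount)
            row.append(math.log2(freq / bg))
        pwm.append(row)
    return pwm


# Four hypothetical binding sites: position 1 is always A, position 2 is unconserved.
for row in log_odds_pwm([[4, 0, 0, 0], [1, 1, 1, 1]]):
    print([round(score, 3) for score in row])
```

With few known sites, the choice of pseudocount mostly affects the scores of unobserved bases, which is why it matters for weakly conserved positions.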

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data. From these data, we obtained 75,918 promoters, which could be classified into 36,251 promoters in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average discrepancy between 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
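The definition of 1st and 2nd promoters can be sketched as a ranking by tag count. The data layout, the function names, and in particular the way `discrepancy` is computed (the fraction of shared multi-promoter genes whose 1st/2nd assignment differs between two samples) are assumptions for illustration, not the exact measure used in the study.

```python
def rank_promoters(tag_counts):
    """Map each gene to its (1st, 2nd) promoter by tag count; 2nd may be None."""
    ranking = {}
    for gene, promoters in tag_counts.items():
        # Sort by descending tag count; break ties by promoter id for determinism.
        ordered = sorted(promoters, key=lambda p: (-promoters[p], p))
        first = ordered[0] if ordered else None
        second = ordered[1] if len(ordered) > 1 else None
        ranking[gene] = (first, second)
    return ranking


def discrepancy(ranking_a, ranking_b):
    """Fraction of shared multi-promoter genes whose 1st/2nd assignment differs."""
    shared = [
        gene for gene in ranking_a
        if gene in ranking_b
        and ranking_a[gene][1] is not None
        and ranking_b[gene][1] is not None
    ]
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if ranking_a[gene] != ranking_b[gene])
    return differing / len(shared)


# Hypothetical tag counts per promoter in two cell types.
cell_a = {"g1": {"p1": 100, "p2": 40}, "g2": {"p1": 5, "p2": 50, "p3": 20}}
cell_b = {"g1": {"p1": 10, "p2": 90}, "g2": {"p1": 1, "p2": 30, "p3": 8}}
print(discrepancy(rank_promoters(cell_a), rank_promoters(cell_b)))
```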

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analyses of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common among 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element, and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
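A minimal sketch of detecting a dinucleotide periodicity by Fourier analysis is shown below. The choice of AA/TT/TA as the tracked dinucleotides, the function names, and the synthetic sequence are assumptions for illustration; the study's actual signal encoding and normalization are not specified in the text.

```python
import cmath
import math


def dinucleotide_signal(seq, dinucleotides=("AA", "TT", "TA")):
    """Indicator signal: 1.0 where one of the chosen dinucleotides starts."""
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] in dinucleotides else 0.0 for i in range(len(seq) - 1)]


def power_at_period(signal, period):
    """Power of the mean-centred signal at the frequency 1/period."""
    n = len(signal)
    mean = sum(signal) / n
    total = sum(
        (x - mean) * cmath.exp(-2j * math.pi * k / period)
        for k, x in enumerate(signal)
    )
    return abs(total) ** 2 / n


def strongest_period(signal, periods):
    """Candidate period with the highest power."""
    return max(periods, key=lambda p: power_at_period(signal, p))


# Synthetic fragment with an AA dinucleotide every 10 bp (illustrative only).
fragment = ("AA" + "CGCGCGCG") * 30
print(strongest_period(dinucleotide_signal(fragment), range(7, 14)))
```

Note that harmonics of a true period (for example 5 bp for a 10-bp repeat) also produce strong peaks, so candidate periods should be interpreted with that in mind.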

9. Estimation and Comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by reducing the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular function instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells do. The organism-specific pathway maps are then detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolization pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue to predicting the structures of protein complexes. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indexes enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indexes could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, these hydrophilic residues also increased in buried fraction as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines, based on the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we showed experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, a ribonuclease that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19 777 and 21 036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the GO networks or tissue networks became too big for a static picture on the web, we used Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics, 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res., 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet., 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics, 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res., 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel., 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics, 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod., 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res., 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics, 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res., 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res., 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res., 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol., 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res., 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene, 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci., 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem., 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res., 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res., 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res., 36 (11): 3707-3715, 2008.

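Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

Ikura, T., Kinoshita, K., and Ito, N. Structure-function relationship of peptidyl-prolyl isomerases. Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid and Enzyme), 54: 167-172, 2009 (in Japanese).

Kinoshita, K. Protein function prediction from three-dimensional structures: current status and prospects. Idenshi Igaku MOOK (Gene and Medicine MOOK), No. 14, in press (in Japanese).

Nakai, K. and Horton, P. Chapter 3-3: Prediction of protein subcellular localization based on amino acid sequences. Jikken Igaku (Experimental Medicine), extra issue, vol. 26: 1106-1112, 2008 (in Japanese).

Nakai, K. Systems biology of proteins. In: Encyclopedia of Proteins, edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura, and Hamakubo (Asakura Shoten), 575-578, 2008 (in Japanese).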

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas concerning the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulation between South Korea and Japan. After the fabrication of Hwang Woo-suk's stem cell cloning results and the unethical collection of human eggs, the bioethics law was revised and the government has sought stricter regulation of life science and healthcare. We have found some correlations among political options on stem cell research and genetic testing, in terms of regulations, among countries in East Asia.

2. Establishment of Office of Research Ethics (ORE)

Following the Dean's courageous decision, the IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we identified current problems that tend to stall a smooth research review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics for all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate the MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To acknowledge the contributions of participants and MCs to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current BioBank newsletters are accessible only to those with good eyesight, we are making efforts to provide personalized information support that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim for the sharing of public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in the IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics, 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝

学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


Human Genome Center

Laboratory of Molecular Medicine
Laboratory of Genome Technology
ゲノムシークエンス解析分野
シークエンス技術開発分野

Professor Yusuke Nakamura, M.D., Ph.D.
Associate Professor Toyomasa Katagiri, Ph.D.
Associate Professor Yataro Daigo, M.D., Ph.D.
Assistant Professor Ryuji Hamamoto, Ph.D.
Assistant Professor Koichi Matsuda, M.D., Ph.D.
Assistant Professor Hitoshi Zembutsu, M.D., Ph.D.

教授 医学博士 中村祐輔
准教授 医学博士 片桐豊雅
准教授 医学博士 醍醐弥太郎
助教 理学博士 浜本隆二
助教 医学博士 松田浩一
助教 医学博士 前佛均

The major goal of our group is to identify genes of medical importance and to develop new diagnostic and therapeutic tools. We have been attempting to isolate genes involved in carcinogenesis and also those causing or predisposing to various diseases, as well as those related to drug efficacies and adverse reactions. By means of technologies developed through the genome project, including a high-resolution SNP map, large-scale DNA sequencing, and the cDNA microarray method, we have isolated a number of biologically and/or medically important genes and are developing novel diagnostic and therapeutic tools.

1. Genes playing significant roles in human cancer

Toyomasa Katagiri, Yataro Daigo, Hidewaki Nakagawa, Hitoshi Zembutsu, Koichi Matsuda, Ryuji Hamamoto, Sachiko Dobashi, Tomomi Ueki, Chikako Fukukawa, Eiji Hirota, Meng-Lay Lin, Jae-Hyun Park, Yosuke Harada, Satoshi Nagayama, Toshihiko Nishidate, Arata Shimo, Masahiko Ajiro, Jung-Won Kim, Tatsuya Kato, Daizaburo Hirata, Koji Ueda, Atsushi Takano, Nobuhisa Ishikawa, Koji Takahashi, Takumi Yamabuki, Nagato Sato, Nguyen Minh-Hue, Ryohei Nishino, Junkichi Koinuma, Daiki Miki, Ken Masuda, Masato Aragaki, Dragomira Nikolaeva Nikolova, Satoko Uno, Yoichiro Kato, Kenji Tamura, Kotoe Kashiwaya, Masayo Hosokawa, Shingo Ashida, Su-Youn Chung, Motohide Uemura, Lianhua Piao, Chizu Tanikawa, Motoko Unoki, Masanori Yoshimatsu, Shinya Hayami and Yusuke Nakamura

(1) Lung cancer

DLX5 (distal-less homeobox 5)

We found that the distal-less homeobox 5 (DLX5) gene, a member of the human distal-less homeobox transcriptional factor family, was overexpressed in the great majority of lung cancers. Northern blot and immunohistochemical analyses detected expression of DLX5 only in placenta among 23 normal tissues examined. Immunohistochemical analysis showed that positive immunostaining of DLX5 was correlated with tumor size (pT classification; P = 0.0053) and poorer prognosis of non-small cell lung cancer patients (P = 0.0045). It was also shown to be an independent prognostic factor (P = 0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data imply that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) using a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with NSCLC (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. ECT2, a cancer-testis antigen, is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. Here we report the characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and in all breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells: nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein was highest at the G1/S phases, whereas its phosphorylation level was enhanced during the mitotic phase. Treatment of the breast cancer cell lines T47D and HBC4 with small interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G2/M cells, resulting in growth inhibition of cancer cells. We further demonstrated the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in the growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we herein focused on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in growth of RCC and that inhibition of TMEM22/RAB37 expression or of their interaction should be a novel therapeutic approach for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)

We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex, as well as activation of the Rac1-JNK cascade, in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of overexpressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed overexpression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Overexpression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been implicated in apoptosis. These findings indicate that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage, some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was expressed neither in normal prostate nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it should be a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts, only 71 had been reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small interfering RNA. The effective down-regulation of its expression levels significantly attenuated viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac(α2,6)Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated a higher frequency of serum apoC-III with loss of α2,6-linked Neu5Ac residues in lung cancer patients compared to controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylation and imply its applicability to studies of other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight patients with PR (partial response) and twelve patients with SD (stable disease) or PD (progressive disease), we identified dozens of genes that were expressed differently between the ‘responder (PR)’ and ‘non-responder (SD or PD)’ groups. We further selected the nine ‘predictive’ genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
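The accuracy measure used in this evaluation, the share of patients whose predicted dose falls within 20% of the actual stable dose, can be illustrated with a short Python sketch. The function name and all dose values below are hypothetical and serve only to show the calculation; they are not data or code from the Consortium.

```python
def fraction_within(predicted, actual, tolerance=0.20):
    """Fraction of patients whose predicted dose lies within
    `tolerance` (as a proportion) of the actual stable dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty dose lists of equal length")
    hits = sum(abs(p - a) <= tolerance * a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Hypothetical weekly doses in mg (illustrative values only).
actual_doses = [14.0, 21.0, 35.0, 49.0, 56.0]
clinical_pred = [28.0, 30.0, 33.0, 36.0, 38.0]
pharmacogenetic_pred = [16.0, 24.0, 34.0, 44.0, 40.0]

print(fraction_within(clinical_pred, actual_doses))
print(fraction_within(pharmacogenetic_pred, actual_doses))
```

In this made-up example the predictions that track the low and high dose requirements score better, which mirrors the kind of comparison described for the two algorithms.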

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5 and Hitoshi Zembutsu5. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Department of Surgery, Tokushima Breast Care Clinic; 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgical operation for breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to be dependent on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4 and Yusuke Nakamura1,3. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P<0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% confidence interval [CI], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
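The odds ratio and confidence interval quoted for the two-SNP score can be checked from a 2×2 table. The Python sketch below assumes the counts implied by the reported figures: 39 patients with severe toxicity (28 + 11) and 74 without (56 + 18), of whom 69.2% and 75.7% were correctly classified, that is 27 and 56 patients. It uses Woolf's logit method for the interval. The report does not state which method was used, so the method and the back-calculated counts are assumptions, and the function name is hypothetical.

```python
import math


def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (logit) confidence interval for a 2x2 table.

    a: cases with score 1 or 2      b: controls with score 1 or 2
    c: cases with score 0           d: controls with score 0
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts back-calculated from the reported percentages (an assumption):
# 27 of 39 severe cases and 18 of 74 non-toxicity patients scored 1 or 2.
cases_high, cases_low = 27, 12
controls_high, controls_low = 18, 56

or_, lo, hi = odds_ratio_woolf(cases_high, controls_high, cases_low, controls_low)
sensitivity = cases_high / (cases_high + cases_low)
specificity = controls_low / (controls_high + controls_low)
print(f"OR={or_:.2f}, 95% CI={lo:.2f}-{hi:.2f}")
print(f"sensitivity={sensitivity:.1%}, specificity={specificity:.1%}")
```

Under these assumed counts the sketch gives an odds ratio of 7.00 with an interval of 2.95-16.59, and classification rates of 69.2% and 75.7%, in agreement with the values reported above.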

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2 and Yusuke Nakamura1. 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of 2Pathology and 3Medicine, Faculty of Medicine, and 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; 5Center for International Cooperation, Department of Medical Sciences, and 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele showed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10^[…]] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10^[…]). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4 and Koichi Matsuda1. 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; 4Center for Genomic Medicine, RIKEN; 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; 6Department of Hepatology, Toranomon Hospital; 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9 and the Pirfenidone Clinical Study Group4. 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; 5Fourth Department of Internal Medicine, Nippon Medical School; 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a P value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9 and Draga I. Toncheva5. 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many of these studies have produced controversial and inconclusive results. However, it has been determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication association study using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 healthy Bulgarian controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that had previously been implicated in disease susceptibility and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010; odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono, N., Kubo, M., Tsuchiya, Y., Sato, H., Kitamoto, T., Saito, S., Ohnishi, Y. and Nakamura, Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi, Y., Gunji, T., Burns, J.C., Shimizu, C., Newburger, J.W., Yashiro, M., Nakamura, Yo., Yanagawa, H., Wakui, K., Fukushima, Y., Kishi, F., Hamamoto, K., Terai, M., Sato, Y., Ouchi, K., Saji, T., Nariai, A., Kaburagi, Y., Yoshikawa, T., Suzuki, K., Tanaka, T., Nagai, T., Cho, H., Fujino, A., Sekine, A., Nakamichi, R., Tsunoda, T., Kawasaki, T., Nakamura, Yu. and Hata, A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva, F.P., Hamamoto, R., Kunizaki, M., Tsuge, M., Nakamura, Y. and Furukawa, Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama, K., Satoh, S., Hamamoto, R., Sakai, Y., Nakamura, Y. and Furukawa, Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato, M., Miya, F., Kanemura, Y., Tanaka, T., Nakamura, Y. and Tsunoda, T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung, A.A.C., Wong, V.C.L., Yang, L.C., Chan, P.L., Daigo, Y., Nakamura, Y., Qi, R.Z., Miller, L., Liu, E.T.-K., Wang, L.D., J.-L.S. Law, Tsao, W. and Lung, M.L. Frequent decreased expression of candidate tumor suppressor gene DEC1 and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo, A., Tanikawa, C., Nishidate, T., Matsuda, K., Lin, M.-L., Park, J.-H., Ohta, T., Hirata, K., Fukuda, M., Nakamura, Y. and Katagiri, T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura, M., Tamura, K., Chung, S., Honma, S., Okuyama, A., Nakamura, Y. and Nakagawa, H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y, and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y, and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q, and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y, and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H, and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H, and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D, and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y, and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y, and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α, and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y, and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y, and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility to type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF, and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y, and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T, and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K, and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y, and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y, and Nakagawa H. Identification of C2orf18, Termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y, and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K, Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on the data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We therefore decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and we also attempt to improve the interpretation of GWA study data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies on complex genetic traits target relatively weak factors, which means that their sample sizes should be in the thousands or more; this in turn makes idealistic random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One of the methods to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
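The two kinds of test compared above can be sketched as follows. This is a minimal illustration, not the laboratory's method: it assumes the asymptotic test is the Pearson chi-square test with 2 degrees of freedom and the exact test is the Freeman-Halton extension of Fisher's exact test, and the function names `chi2_2x3` and `exact_2x3` are hypothetical. Rows are case/control and columns are the three SNP genotypes.

```python
from math import comb, exp

def chi2_2x3(table):
    """Pearson chi-square statistic and asymptotic p-value (2 d.f.) for a 2x3 table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = rows[i] * cols[j] / n
            if expected > 0:  # skip empty genotype columns
                stat += (table[i][j] - expected) ** 2 / expected
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return stat, exp(-stat / 2.0)

def exact_2x3(table):
    """Exact p-value: total probability of all tables with the same margins
    that are no more probable than the observed table (Freeman-Halton)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    denom = comb(sum(rows), rows[0])

    def prob(a, b, c):
        # Multivariate hypergeometric probability of first row (a, b, c).
        return comb(cols[0], a) * comb(cols[1], b) * comb(cols[2], c) / denom

    p_obs = prob(*table[0])
    total = 0.0
    for a in range(min(cols[0], rows[0]) + 1):
        for b in range(min(cols[1], rows[0] - a) + 1):
            c = rows[0] - a - b
            if 0 <= c <= cols[2]:
                p = prob(a, b, c)
                if p <= p_obs * (1 + 1e-9):  # tolerance for floating-point ties
                    total += p
    return total
```

For small or unbalanced tables the two p-values can differ noticeably, which is the kind of discrepancy the text refers to; the exhaustive enumeration used here is only practical for modest sample sizes.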

c. Geometric evaluation of SNP contingency table tests

The 2×3 SNP contingency table tests are described in the context of geometry. We characterize various tests for 2×3 tables and define tests that fit biological models by interpreting the tables geometrically.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population and LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in support of various studies based on genetic heterogeneity.
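For reference, the conventional pairwise measures that the text contrasts with a multi-marker approach can be sketched as below. This shows only the standard D, D' and r² statistics for two diallelic SNPs, not the new method described above; the function name `pairwise_ld` and the haplotype-count inputs are assumptions made for illustration.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) from the counts of the four two-SNP haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n   # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / n   # frequency of allele B at the second SNP
    d = p_AB - p_A * p_B      # deviation from independence
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0  # signed D'
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2
```

Two perfectly coupled SNPs give D' = 1 and r² = 1, while independent SNPs give 0 for all three values; none of these pairwise values captures dependence among three or more markers.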

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorics, and we investigated the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and The Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp/

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.

Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment of trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi, obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and their positioning relative to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe², and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental result with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
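The G+C and CpG richness discussed above is commonly summarized by two window statistics: the G+C fraction and the observed/expected CpG ratio. The sketch below computes both with the widely used formula (CpG count × length) / (C count × G count). The function names, window size and step are illustrative assumptions, not the criteria used in this study.

```python
def cpg_stats(seq):
    """Return (G+C fraction, CpG observed/expected ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")                # number of CpG dinucleotides
    gc_fraction = (c + g) / n
    expected = c * g / n                 # CpG count expected from base composition
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_fraction, obs_exp

def profile(seq, window=100, step=50):
    """Slide a window along seq and report (start, G+C fraction, CpG o/e)."""
    return [
        (start,) + cpg_stats(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Applying `profile` to sequence centered on each TSS would show how far the enrichment extends on either side, which is how a "narrower range" could be visualized.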

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³, and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from giving further information on whether regulation is direct or indirect. In this study, we computationally detect direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we utilized position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The regulatory network predicted for the C. intestinalis transcription factor ZicL is consistent with the data from a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make up the building blocks of the body.
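The promoter-scanning step described above can be sketched as a sliding-window score against a position weight matrix with a fixed cutoff. This is a simplified illustration: the matrix `TOY_PWM`, the function `scan` and the threshold are hypothetical, only the forward strand is scanned, and the ORI-based choice of threshold is not reproduced here.

```python
# Hypothetical 3-column log-odds matrix favoring the motif "ACG".
TOY_PWM = [
    {"A": 1.5, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.5, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.5, "T": -1.0},
]

def scan(pwm, seq, threshold):
    """Return (start, score) for every window whose PWM score >= threshold."""
    width = len(pwm)
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits

hits = scan(TOY_PWM, "TTACGTTACG", threshold=4.0)
```

Raising the threshold trades sensitivity for specificity, which is why the choice of cutoff matters when predicting direct targets.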

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴, and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it, set according to the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and the generated matrix with an added pseudocount value was then compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
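The way a pseudocount enters a PWM can be sketched as follows. This is a minimal illustration under a common convention: the pseudocount is split among the four bases in proportion to the background composition before taking the log ratio. The function names and the uniform default background are assumptions; only the default value 0.8 comes from the text.

```python
from math import log2

def build_pwm(sites, pseudocount=0.8, background=None):
    """Build a log2-odds PWM from equal-length binding sites (strings over ACGT)."""
    bases = "ACGT"
    if background is None:
        background = {b: 0.25 for b in bases}  # assumed uniform background
    n = len(sites)
    pwm = []
    for i in range(len(sites[0])):
        column = {}
        for b in bases:
            count = sum(1 for s in sites if s[i] == b)
            # Each base receives its background share of the pseudocount.
            freq = (count + pseudocount * background[b]) / (n + pseudocount)
            column[b] = log2(freq / background[b])
        pwm.append(column)
    return pwm

def score(pwm, seq):
    """Sum the PWM elements along a sequence of the same width."""
    return sum(column[base] for column, base in zip(pwm, seq))

pwm = build_pwm(["ACGT", "ACGT", "ACGA", "ACGT"])
```

Without a pseudocount, any base never observed at a position would get a score of minus infinity; the pseudocount keeps every element finite while leaving well-supported positions nearly unchanged.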

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we added 296.5 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data. From the data, we obtained 75,918 promoters, which could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a 9.6% difference for promoters expressed in the same cell type under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
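The definition of 1st and 2nd promoters can be sketched as follows. This is a minimal illustration under stated assumptions: the input format (gene, promoter identifier, tag count) and the function names are hypothetical, and the `discrepancy` measure shown here (the fraction of shared genes whose 1st/2nd assignment differs between two samples) is only one plausible reading of the comparison described above, not necessarily the measure used in the study.

```python
from collections import defaultdict

def rank_promoters(tag_counts):
    """Return {gene: (first_promoter, second_promoter_or_None)}.

    tag_counts is an iterable of (gene, promoter_id, n_tags) tuples.
    """
    by_gene = defaultdict(list)
    for gene, promoter, n_tags in tag_counts:
        by_gene[gene].append((n_tags, promoter))
    ranked = {}
    for gene, items in by_gene.items():
        # Largest tag count first; ties are broken by promoter identifier.
        items.sort(key=lambda item: (-item[0], item[1]))
        first = items[0][1]
        second = items[1][1] if len(items) > 1 else None
        ranked[gene] = (first, second)
    return ranked

def discrepancy(ranked_a, ranked_b):
    """Fraction of shared genes whose (1st, 2nd) assignment differs (assumed measure)."""
    shared = ranked_a.keys() & ranked_b.keys()
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if ranked_a[gene] != ranked_b[gene])
    return differing / len(shared)

cell_a = [("g1", "p1", 50), ("g1", "p2", 20), ("g2", "p3", 5)]
cell_b = [("g1", "p2", 40), ("g1", "p1", 10), ("g2", "p3", 7)]
print(discrepancy(rank_promoters(cell_a), rank_promoters(cell_b)))
```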

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita, and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has previously been reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
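A periodicity test of this kind can be sketched in a few lines. The example below is an illustration only, assuming a binary indicator of where a chosen dinucleotide occurs and a direct evaluation of the Fourier sum at a single period; the function names and the synthetic sequence are hypothetical and do not reproduce the analysis pipeline of the study.

```python
import cmath

def dinucleotide_indicator(seq, dinuc="AA"):
    """1 where the dinucleotide starts at position i, else 0."""
    return [1 if seq[i:i + 2] == dinuc else 0 for i in range(len(seq) - 1)]

def power_at_period(signal, period):
    """Normalized squared magnitude of the Fourier sum at the given period (in bp)."""
    n = len(signal)
    mean = sum(signal) / n
    total = sum(
        (x - mean) * cmath.exp(-2j * cmath.pi * k / period)
        for k, x in enumerate(signal)
    )
    return abs(total) ** 2 / n

# Synthetic sequence in which "AA" recurs every 10 bp.
seq = ("AA" + "CGTCGTCG") * 20
signal = dinucleotide_indicator(seq)
print(power_at_period(signal, 10) > power_at_period(signal, 7))
```

Scanning `period` over a range of values (for example 2 to 200 bp) and comparing the profile before and after masking repeat elements would correspond to the kind of comparison described above.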

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized so as to sustain life from only external nutrients, as living cells do. The organism-specific pathway maps are then detected as those that can synthesize, from nutrients, the compounds required for the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda5 and K. Kinoshita: 5National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. A systematic comparison with some existing descriptors showed that our indices could elucidate different aspects of protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi6, M. Saeki6, H. Ohta6, K. Kinoshita: 6Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida, and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura7, and K. Kinoshita: 7Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, which can include non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida, and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately where protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important both for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. We therefore developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this method. The method predicts the disorder tendency of each residue using support vector machines, taking as input the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura8, Kengo Kinoshita, Nobutoshi Ito8: 8Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configuration of the PPIase active site against known protein structures and found that alpha amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that catalyzes ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi6, M. Shibaoka6, M. Saeki6, H. Ohta6, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as the targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the GO networks or tissue networks became too big for a static picture on the web, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida, and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity in the membrane, such as lipid rafts and transportsome regions. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue specific expression prediction. Genome Informatics, Edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita K Kono H and Yura K Predictionof molecular interactions from 3D-structuresfrom small ligands to large protein complexesEdited by Bujnicki J (Wiley and Sons USA)in printing 2009伊倉貞吉木下賢吾伊藤暢聡ペプチジルプロリルイソメラーゼの構造機能相関蛋白質核酸酵素54167―1722009木下賢吾立体構造からのタンパク質機能予測現状と展望遺伝子医学MOOK14号in press中井謙太ポールホートン第3章 3アミノ酸配列に基づくタンパク質の細胞内局在予測実験医学増刊 vol261106―11122008中井謙太タンパク質のシステム生物学猪飼伏見卜部上野川中村浜窪編タンパク質の事典朝倉書店575―5782008


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal, and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas in the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulation between South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and the unethical collection of human eggs, the bioethics law was revised, and the government now seeks stricter regulation of life science and healthcare. We have found some correlations between political options on stem cell research and on genetic testing in terms of regulation across East Asia.

2. Establishment of the Office of Research Ethics (ORE)

Following the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK, and South Korea, we examined our current problems, which tend to obstruct a smooth research review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent and research protocols and for pre-review of the research ethics of all research involving human subjects. We will start communicating with other relevant divisions for research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy 公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准 教 授 保健学博士 武 藤 香 織
特任助教 学術博士 洪 賢 秀
特任助教 法学修士 神 里 彩 子

3. Ethical, legal, and social support for the "BioBank Japan" project

To support the "BioBank Japan" project, led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government's ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about constructing a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project has no plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To show our appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to sighted people with good eyesight, we are making efforts toward personalized information accessibility to accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


cer patients (P=0.0045). It was also shown to be an independent prognostic factor (P=0.0415). Treatment of lung cancer cells with small interfering RNAs for DLX5 effectively knocked down its expression and suppressed cell growth. These data imply that DLX5 is useful as a target for the development of anticancer drugs and cancer vaccines, as well as a prognostic biomarker in the clinic.

ECT2 (epithelial cell transforming sequence 2)

We screened for genes that were frequently overexpressed in tumors through gene expression profile analyses of 101 lung cancers and 19 esophageal squamous cell carcinomas (ESCC) using a cDNA microarray consisting of 27,648 genes or expressed sequence tags. In this process, we identified epithelial cell transforming sequence 2 (ECT2) as a candidate. Northern blot and immunohistochemical analyses detected expression of ECT2 only in testis among 23 normal tissues. Immunohistochemical staining showed that a high level of ECT2 expression was associated with poor prognosis for patients with NSCLC (P=0.0004) as well as ESCC (P=0.0088). Multivariate analysis indicated it to be an independent prognostic factor for NSCLC (P=0.0005). Knockdown of ECT2 expression by small interfering RNAs effectively suppressed lung and esophageal cancer cell growth. In addition, induction of exogenous expression of ECT2 in mammalian cells promoted cellular invasive activity. The ECT2 cancer-testis antigen is likely to be a prognostic biomarker in the clinic and a potential therapeutic target for the development of anticancer drugs and cancer vaccines for lung and esophageal cancers.

(2) Breast Cancer

DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein)

To investigate the detailed molecular mechanism of mammary carcinogenesis and discover novel therapeutic targets, we previously analysed gene expression profiles of breast cancers. Here we report the characterization of a significant role of DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein) in mammary carcinogenesis. Semiquantitative RT-PCR and northern blot analyses confirmed upregulation of DTL/RAMP in the majority of breast cancer cases and in all of the breast cancer cell lines examined. Immunocytochemical and western blot analyses using an anti-DTL/RAMP polyclonal antibody revealed cell-cycle-dependent localization of endogenous DTL/RAMP protein in breast cancer cells: nuclear localization was observed in cells at interphase, and the protein was concentrated at the contractile ring during cytokinesis. The expression level of DTL/RAMP protein was highest at G(1)/S phases, whereas its phosphorylation level was enhanced during mitotic phase. Treatment of the breast cancer cells T47D and HBC4 with small interfering RNAs against DTL/RAMP effectively suppressed its expression and caused accumulation of G(2)/M cells, resulting in growth inhibition of cancer cells. We further demonstrated the in vitro phosphorylation of DTL/RAMP through an interaction with the mitotic kinase Aurora kinase-B (AURKB). Interestingly, depletion of AURKB expression with siRNA in breast cancer cells reduced the phosphorylation of DTL/RAMP and decreased the stability of DTL/RAMP protein. These findings imply important roles of DTL/RAMP in the growth of breast cancer cells and suggest that DTL/RAMP might be a promising molecular target for the treatment of breast cancer.

(3) Renal cancer

TMEM22 (transmembrane protein 22)

In order to clarify the molecular mechanism involved in renal carcinogenesis and to identify molecular targets for the development of novel treatments of renal cell carcinoma (RCC), we previously analyzed genome-wide gene expression profiles of clear-cell types of RCC by cDNA microarray. Among the transactivated genes, we focused here on the functional significance of TMEM22 (transmembrane protein 22), a transmembrane protein, in the cell growth of RCC. Northern blot and semi-quantitative RT-PCR analyses confirmed up-regulation of TMEM22 in a great majority of the RCC clinical samples and cell lines examined. Immunocytochemical analysis validated its localization at the plasma membrane. We found an interaction between TMEM22 and RAB37 (Ras-related protein Rab-37), which was also up-regulated in RCC cells. Interestingly, knockdown of either TMEM22 or RAB37 expression by specific siRNA caused a significant reduction of cancer cell growth. Our results imply that the TMEM22/RAB37 complex is likely to play a crucial role in the growth of RCC and that inhibition of TMEM22/RAB37 expression or of their interaction could be a novel therapeutic approach for RCC.

(4) Synovial sarcoma

FZD10 (Frizzled homologue 10)

We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of the FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl)2/Dvl3 complex as well as activation of the Rac1-JNK cascade in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of the RhoA activity. Our results strongly imply that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 in culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of CST6 proteinase inhibitor and can intracellularly function as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small-interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and that it could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been indicated to play some roles in apoptosis. These findings imply that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage, some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show very poor response to any anticancer therapies. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate, nor in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it should be a potential molecular target for development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanism involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among these 243 transcripts identified, only 71 transcripts were reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were overexpressed very commonly in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small-interfering RNA. The effective down-regulation of its expression levels in thyroid cancer cells significantly attenuated viability of thyroid cancer cells, indicating the significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as upregulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in sera of cancer patients and apply such information for biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in the peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac (α2,6) Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated that there is a higher frequency of serum apoC-III with loss of α2,6-linkage Neu5Ac residues in lung cancer patients compared to controls. Our results demonstrate that lectin-coupled ProteinChip technology allows the high-throughput and specific recognition of cancer-associated aberrant glycosylations and imply its possible applicability to studies on other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
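The evaluation metric described above, the share of patients whose predicted dose falls within 20% of the actual stable dose, can be illustrated with a minimal Python sketch. The function name and the dose values are hypothetical and chosen only for illustration; they are not taken from the consortium's data or software.

```python
def fraction_within_tolerance(predicted, actual, tolerance=0.20):
    """Return the fraction of patients whose predicted dose lies within
    `tolerance` (a proportion, e.g. 0.20 for 20%) of the actual stable dose."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one patient is required")
    hits = sum(
        1
        for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)


# Hypothetical weekly doses in mg (illustrative values only, not study data).
predicted_mg = [20.0, 35.0, 50.0, 28.0]
actual_mg = [21.0, 30.0, 70.0, 28.0]
share = fraction_within_tolerance(predicted_mg, actual_mg)
print(f"{share:.0%} of predictions fall within 20% of the actual dose")
```

Running the same function on the predictions of two competing algorithms over one validation cohort gives the kind of side-by-side percentages reported in the paragraph above.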

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5 and Hitoshi Zembutsu5: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Department of Surgery, Tokushima Breast Care Clinic, 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School, 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgical operation on breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to be dependent on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4 and Yusuke Nakamura1,3: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant for limiting the use of the drug. To identify a genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2 and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3 and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by the scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia as compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
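The two "correctly classified" percentages above correspond to the sensitivity and specificity of a rule that flags patients with a score of 1 or 2 as high risk. A minimal Python sketch of that calculation follows. How the score is derived from the two SNP genotypes is not described here, so scores are taken as given inputs; the function name and the example records are hypothetical.

```python
def classification_rates(records, threshold=1):
    """Compute (sensitivity, specificity) for a score-based risk rule.

    records: iterable of (score, is_case) pairs, where is_case is True for
    patients with severe toxicity. A patient is predicted to be a case when
    score >= threshold.
    """
    tp = fn = tn = fp = 0
    for score, is_case in records:
        predicted_case = score >= threshold
        if is_case and predicted_case:
            tp += 1
        elif is_case:
            fn += 1
        elif predicted_case:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical records (illustrative only, not the study's patients).
example = [
    (2, True), (1, True), (1, True), (0, True),                   # cases
    (0, False), (0, False), (0, False), (1, False), (0, False),   # controls
]
sensitivity, specificity = classification_rates(example)
print(f"sensitivity={sensitivity:.1%}, specificity={specificity:.1%}")
```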

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2 and Yusuke Nakamura1: 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine, Departments of 2Pathology, 3Medicine, Faculty of Medicine, 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand, 5Center for International Cooperation, Department of Medical Sciences, 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health, 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1 and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele showed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
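Odds ratios with 95% confidence intervals of the kind quoted above can be computed from a 2×2 table of allele carriers and non-carriers among cases and controls. The Python sketch below uses the standard log-odds (Woolf) approximation for the interval. This is an assumption: the report does not state which method was used, and the counts in the example are hypothetical rather than the study's data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI (Woolf's log method).

    a: carriers among cases        b: non-carriers among cases
    c: carriers among controls     d: non-carriers among controls
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (illustrative only).
or_value, ci_low, ci_high = odds_ratio_ci(a=20, b=80, c=5, d=95)
print(f"OR={or_value:.2f}, 95% CI={ci_low:.2f}-{ci_high:.2f}")
```

With small carrier counts among controls the interval becomes very wide, which is consistent with the broad confidence intervals reported for a rare allele.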

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4 and Koichi Matsuda1: 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand, 4Center for Genomic Medicine, RIKEN, 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University, 6Department of Hepatology, Toranomon Hospital, 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9 and the Pirfenidone Clinical Study Group4: 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN), 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University, 5Fourth Department of Internal Medicine, Nippon Medical School, 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center, 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi, 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN), 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify a gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a p value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9 and Draga I. Toncheva5: 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria, 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria, 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria, 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK, 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. A great number of studies often yield controversial and inconclusive results. However, it has been determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication association study using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that were previously implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010, odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y, and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu, and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y, and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y, and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y, and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu E T-K, Wang LD, J-LS Law, Tsao W, and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y, and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y, and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y, and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y, and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q, and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y, and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H, and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H, and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D, and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y, and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y, and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α, and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y, and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y, and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility of type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF, and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y, and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T, and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K, and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y, and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y, and Nakagawa H. Identification of C2orf18, Termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y, and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K; Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We first decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design, characterize each problem and propose a solution to it, and then attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, so their sample sizes need to run to thousands or more, which in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false-positive results. One method to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can also correct Fisher's exact test statistics.
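
The conventional chi-square form of genomic control mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the modified method for Fisher's exact test: the function name and the example statistics are hypothetical, and the inflation factor is estimated in the usual way as the median observed statistic divided by the median of the 1-df chi-square distribution.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (inflation factor, corrected statistics) for 1-df chi-square tests.

    Hypothetical helper: lambda is the observed median divided by the
    expected median, and every statistic is divided by lambda.
    """
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # by convention, statistics are never inflated
    return lam, [x / lam for x in chi2_stats]


# Made-up statistics from markers assumed to be mostly null.
observed = [0.1, 0.5, 0.91, 2.0, 6.0]
lam, corrected = genomic_control(observed)
print(round(lam, 3), [round(x, 3) for x in corrected])
```

Dividing by a single factor assumes that the inflation is uniform across markers, which is the simplification that makes the method easy to apply genome-wide.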

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリー・マーク・ラスロップ
准教授 医学博士 山田 亮

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.
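
Two of the standard asymptotic tests for a 2×3 genotype table can be sketched as follows: the general Pearson test with 2 degrees of freedom and the Cochran-Armitage trend test with 1 degree of freedom. This is an illustrative sketch only, not the geometric formulation described above; the function names, the additive weights (0, 1, 2), and the example counts are assumptions.

```python
import math


def pearson_2x3(cases, controls):
    """Pearson chi-square test (2 df) for a 2x3 table; returns (stat, p)."""
    n_case, n_ctrl = sum(cases), sum(controls)
    n = n_case + n_ctrl
    stat = 0.0
    for a, b in zip(cases, controls):
        col = a + b
        if col == 0:
            continue  # an empty genotype column contributes nothing
        for obs, row in ((a, n_case), (b, n_ctrl)):
            exp = row * col / n
            stat += (obs - exp) ** 2 / exp
    # Survival function of the chi-square distribution with 2 df.
    return stat, math.exp(-stat / 2)


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test (1 df); returns (stat, p)."""
    n_case, n_ctrl = sum(cases), sum(controls)
    n = n_case + n_ctrl
    cols = [a + b for a, b in zip(cases, controls)]
    t = sum(w * (a * n_ctrl - b * n_case)
            for w, a, b in zip(weights, cases, controls))
    s1 = sum(w * w * c * (n - c) for w, c in zip(weights, cols))
    s2 = sum(weights[i] * weights[j] * cols[i] * cols[j]
             for i in range(3) for j in range(i + 1, 3))
    var = n_case * n_ctrl / n * (s1 - 2 * s2)
    stat = t * t / var
    # Survival function of the chi-square distribution with 1 df.
    return stat, math.erfc(math.sqrt(stat / 2))


# Made-up genotype counts (AA, Aa, aa) in cases and controls.
cases, controls = (30, 50, 20), (50, 40, 10)
print(pearson_2x3(cases, controls))
print(trend_test(cases, controls))
```

The trend test spends its single degree of freedom on an additive model, while the 2-df test makes no assumption about the genetic model; which one fits depends on the biological model in mind.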

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which exploits the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only the pairwise inter-marker dependence is used for quantitation of genetic heterogeneity and for genetic epidemiology studies. We develop a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in support of various research based on genetic heterogeneity.
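
For reference, the conventional pairwise measures that the paragraph above contrasts with a multi-marker measure can be computed as in the sketch below. The function name and the haplotype counts are hypothetical; the sketch covers only the standard two-marker quantities D, D', and r², not the new method.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) from counts of the four two-SNP haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / n  # frequency of allele B at the second SNP
    d = p_AB - p_A * p_B
    # D' rescales D by its maximum attainable value given allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


print(pairwise_ld(50, 0, 0, 50))    # two haplotypes only: complete LD
print(pairwise_ld(25, 25, 25, 25))  # all four equally frequent: no LD
```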

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.


Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita, and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita, and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008 it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井 謙太
准教授 理学博士 木下 賢吾

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and on their positioning relative to the transcription start site and relative to each other in pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2, and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
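
The two quantities usually used to describe G+C and CpG richness can be computed as in the following sketch. The function name and the example sequences are hypothetical, and the thresholds and window sizes behind the definition above are not given in the text, so none are assumed here; the sketch only shows the G+C fraction and the observed/expected CpG ratio for one window.

```python
def cpg_stats(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA window."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides cannot overlap one another
    gc_fraction = (c + g) / n
    # Expected CpG count if C and G were placed independently: c * g / n.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


# Made-up windows standing in for sequence around a TSS.
print(cpg_stats("CGCGCGCG"))  # CpG-rich toy window
print(cpg_stats("ATATATAT"))  # no C or G at all
```

Sliding such a window across positions relative to the TSS would show over how wide a range the enrichment extends.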

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3, and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent the blueprint from giving further information on whether regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but that have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The regulation network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that lays out the body plan but also the peripheral components that actually make the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith4, and Kenta Nakai: 4CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then compared the generated matrix, with an added pseudocount value, to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
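
How a pseudocount enters a log-odds PWM can be sketched as follows. This is a minimal illustration of the usual construction described above, with the pseudocount split according to the background composition; the function names, the uniform background, and the toy count matrix are assumptions, and the simulation used to choose the value is not reproduced.

```python
import math

BASES = "ACGT"


def log_odds_pwm(counts, pseudocount=0.8, background=None):
    """Build a log2-odds PWM from per-position base counts.

    counts is a list of dicts such as {"A": 10, "C": 0, ...}; the
    pseudocount is distributed over the bases in proportion to the
    background frequencies (uniform by default, an assumption).
    """
    background = background or {b: 0.25 for b in BASES}
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES)
        row = {}
        for b in BASES:
            freq = (column.get(b, 0) + pseudocount * background[b]) / (total + pseudocount)
            row[b] = math.log2(freq / background[b])
        pwm.append(row)
    return pwm


def score(pwm, site):
    """Sum the log-odds of each base of a candidate site."""
    return sum(row[base] for row, base in zip(pwm, site))


# Toy matrix built from ten hypothetical binding sites of length two.
toy_counts = [{"A": 10}, {"C": 5, "G": 5}]
pwm = log_odds_pwm(toy_counts)
print(round(score(pwm, "AC"), 3), round(score(pwm, "TT"), 3))
```

Without the pseudocount, the bases never observed at a position would have zero frequency and an undefined log-odds score, so any site containing them could not be scored at all.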

7. Definition and analysis of alternative promoters using a huge number of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1, and Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of the discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed a difference of 96 for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
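
The ranking of promoters by tag count described above can be sketched as follows. The data layout, the function names, and in particular the definition of "discrepancy" (here, the fraction of shared multi-promoter genes whose 1st/2nd promoter pair differs between two samples) are assumptions made for illustration; the text does not specify how the percentage was computed.

```python
def rank_promoters(tag_counts):
    """Map each gene to its (1st promoter, 2nd promoter) by tag count.

    tag_counts has the hypothetical form {gene: {promoter_id: n_tags}}.
    A gene with a single promoter gets None as its 2nd promoter.
    """
    ranked = {}
    for gene, promoters in tag_counts.items():
        order = sorted(promoters, key=promoters.get, reverse=True)
        first = order[0] if order else None
        second = order[1] if len(order) > 1 else None
        ranked[gene] = (first, second)
    return ranked


def discrepancy(ranked_a, ranked_b):
    """Fraction of shared multi-promoter genes whose ranking differs."""
    shared = [g for g in ranked_a
              if g in ranked_b
              and None not in ranked_a[g] and None not in ranked_b[g]]
    if not shared:
        return 0.0
    differing = sum(1 for g in shared if ranked_a[g] != ranked_b[g])
    return differing / len(shared)


# Made-up tag counts for two cell types.
cell_x = {"geneA": {"p1": 120, "p2": 30}, "geneB": {"p1": 5, "p2": 50}}
cell_y = {"geneA": {"p1": 10, "p2": 90}, "geneB": {"p1": 2, "p2": 40}}
print(discrepancy(rank_promoters(cell_x), rank_promoters(cell_y)))
```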

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita, and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common to the 16 eukaryotes examined, but two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
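
The kind of Fourier analysis used to detect a dinucleotide periodicity can be sketched as follows. This is an illustrative sketch, not the pipeline used in the study: the function name, the choice of AA/TT/TA as the tracked dinucleotides, and the synthetic sequence are assumptions. The sketch marks each position where a chosen dinucleotide starts and evaluates the power of that indicator signal at one candidate period.

```python
import cmath
import math


def periodicity_power(seq, period, dinucs=("AA", "TT", "TA")):
    """Power of the dinucleotide indicator signal at a given period (bp)."""
    signal = [1.0 if seq[i:i + 2] in dinucs else 0.0
              for i in range(len(seq) - 1)]
    mean = sum(signal) / len(signal)
    # Single-frequency discrete Fourier transform of the centered signal.
    acc = sum((x - mean) * cmath.exp(-2j * math.pi * i / period)
              for i, x in enumerate(signal))
    return abs(acc) ** 2 / len(signal)


# Synthetic sequence with an AA dinucleotide every 10 bp.
toy = ("AA" + "C" * 8) * 20
print(periodicity_power(toy, 10), periodicity_power(toy, 7))
```

On real genome fragments, the same function would be evaluated over a range of periods; a peak near 167 bp that vanishes after masking Alu elements would correspond to the observation reported above.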

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized so as to sustain life from external nutrients alone, as living cells do. The organism-specific pathway maps are then detected as those that can synthesize, from nutrients, the compounds required for the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.
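The text above does not say which coexpression measure was introduced. Purely as an assumed illustration, the sketch below computes Pearson's correlation between expression profiles and a rank-based variant (here called mutual rank, the geometric mean of the two directional correlation ranks). The gene names and expression values are hypothetical toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_of(target, query, profiles):
    """Rank of `target` among all other genes by correlation with `query` (1 = best)."""
    others = [g for g in profiles if g != query]
    others.sort(key=lambda g: pearson(profiles[query], profiles[g]), reverse=True)
    return others.index(target) + 1


def mutual_rank(a, b, profiles):
    """Geometric mean of the rank of b for a and the rank of a for b."""
    return math.sqrt(rank_of(b, a, profiles) * rank_of(a, b, profiles))


# Hypothetical expression profiles over five samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.1, 2.9, 4.2, 5.1],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneD": [2.0, 1.0, 4.0, 3.0, 5.0],
}

print(pearson(profiles["geneA"], profiles["geneD"]))
print(mutual_rank("geneA", "geneB", profiles))
print(mutual_rank("geneA", "geneC", profiles))
```

A small mutual rank means that each gene is near the top of the other's coexpression list, which makes the measure less sensitive to genes that correlate moderately with almost everything.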

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida, and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷, and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting crystal structures can contain non-biological interactions owing to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida, and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate, compared with the decrease in occurrence, as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.
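To make the geometric intuition behind SVR concrete, the sketch below computes the ratio for idealized spheres, for which SVR equals 3/r and therefore falls as the object grows. This is only a scaling illustration under a strong simplifying assumption: real proteins are not spheres, and the accessible surface area used in the study is computed from atomic coordinates. The function names are hypothetical.

```python
import math


def surface_to_volume_ratio(surface_area, volume):
    """SVR in inverse length units (e.g. 1/Å when inputs are in Å^2 and Å^3)."""
    return surface_area / volume


def sphere_svr(radius):
    """SVR of an ideal sphere; analytically this is 3 / radius."""
    area = 4.0 * math.pi * radius ** 2
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return surface_to_volume_ratio(area, volume)


# Doubling the radius halves the ratio: larger bodies expose relatively less surface.
for radius in (5.0, 10.0, 20.0):
    print(radius, round(sphere_svr(radius), 3))
```

The same inverse dependence on size is why larger proteins have proportionally less surface available for hydrophilic residues.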

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.
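The meta approach described above can be sketched as a classifier whose input features are the per-residue outputs of the base predictors. The example below is a minimal illustration under stated assumptions, not the published implementation: it uses three hypothetical base predictors instead of seven, invented scores, and a small linear support vector machine trained by sub-gradient descent on the hinge loss in place of a full SVM library.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=500):
    """Fit weights and bias by sub-gradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:
                # Margin violated: step along the hinge-loss sub-gradient.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: only the regulariser acts on the weights.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(w, b, x):
    """Return +1 (disordered) or -1 (ordered) for one residue."""
    return 1 if dot(w, x) + b >= 0 else -1


# Each row holds the disorder scores of one residue from three hypothetical predictors.
X = [
    [0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.7, 0.6, 0.9],  # disordered residues
    [0.1, 0.2, 0.3], [0.2, 0.1, 0.4], [0.3, 0.4, 0.1],  # ordered residues
]
y = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
print(predictions)
```

In practice the classifier would be trained on residues with known order or disorder and evaluated on held-out proteins, so that agreement among the base predictors is weighted by how reliable each one has proved to be.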

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that catalyzes ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida, and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics, 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res., 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet., 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics, 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res., 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel., 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics, 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod., 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res., 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics, 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res., 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res., 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res., 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol., 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res., 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene, 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci., 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem., 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res., 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, Edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res., 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res., 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉，木下賢吾，伊藤暢聡．ペプチジルプロリルイソメラーゼの構造機能相関．蛋白質核酸酵素，54：167―172，2009．

木下賢吾．立体構造からのタンパク質機能予測：現状と展望．遺伝子医学MOOK 14号，in press．

中井謙太，ポール・ホートン．第3章 3．アミノ酸配列に基づくタンパク質の細胞内局在予測．実験医学増刊，vol. 26：1106―1112，2008．

中井謙太．タンパク質のシステム生物学．猪飼，伏見，卜部，上野川，中村，浜窪編．タンパク質の事典．朝倉書店，575―578，2008．


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research and on homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal, and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas concerning the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulatory approach between South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and the unethical collection of human eggs, the bioethics law in South Korea was revised, and the government now seeks stricter regulation of life science and healthcare. We found some correlations among East Asian countries between political options on stem cell research and on genetic testing in terms of regulation.

2. Establishment of the Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK, and South Korea, we examined the current problems that tend to obstruct a smooth research review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review of the research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government's ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotype data to each participant, but a certain number of participants responded that they now want to see their own genotype data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain to, encourage, and thank participants each time, and participants recall their will to contribute.

To show our appreciation of participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are accessible only to sighted people with good eyesight, we are making efforts toward personalized information security to accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim at sharing public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics, 146A (13): 696-706, 2008.

2. 洪賢秀．韓国社会における子どもの「性保護」と性犯罪防止対策．比較法研究，70号，2009，印刷中．

3. 神里彩子，成澤光編著．生殖補助医療 生命倫理と法―基本資料集3．信山社，21―123，262―308，2008．

4. 張瓊方．諸外国における生殖補助医療の規制状況と実施状況（台湾）．生殖補助医療 生命倫理と法―基本資料集3，神里彩子，成澤光編，信山社，323―334，2008．

5. 大上泰弘，神里彩子，城山英明．イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆．社会技術研究論文集，5号，132―142，2008．

6. 大上泰弘，成廣孝，神里彩子，城山英明，打越綾子．日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析．ヒトと動物の関係学会誌，20号，66―73，2008．

7. 大上泰弘，神里彩子，城山英明．イギリスにおける動物の実験規制を支えている思考様式．科学技術社会論研究，5号，84―92，2008．

8. 渡部麻衣子，上田昌文．人の必要を充足する科学技術 福祉工学における開発現場の分析．科学技術社会研究，138―151，2008．

9. 武藤香織．「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―．日米の医療―制度と倫理，杉田米行編，大阪大学出版会，203―224，2008．

10. 武藤香織．DNA親子鑑定は「ふしだらな」女性にとっての救済策か．ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま，舘かおる編，作品社，238―264，2008．

11. 洪賢秀．研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―．ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま，舘かおる編，作品社，196―214，2008．

12. 張瓊方．生殖技術と台湾社会．ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま，舘かおる編，作品社，215―222，2008．

13. 三村恭子，小門穂，武藤香織，張瓊方，洪賢秀，柘植あづみ．女性にやさしい機械のつくられ方―内診台を例にして．ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま，舘かおる編，作品社，223―240，2008．

14. 神里彩子．生殖補助医療をめぐる議論―その回顧と展望―．家永登編『生殖技術と家族』，早稲田大学出版部，42―71，2008．

15. 渡部麻衣子，上田昌文編訳．エンハンスメント論争 身体精神の増強と先端科学技術．社会評論社，2008．


We previously reported that Frizzled homologue 10 (FZD10), a member of the Wnt signal receptor family, was highly and specifically upregulated in synovial sarcoma and played critical roles in its cell survival and growth. We investigated a possible molecular mechanism of FZD10 signaling in synovial sarcoma cells. We found a significant enhancement of phosphorylation of the Dishevelled (Dvl) 2/Dvl3 complex, as well as activation of the Rac1-JNK cascade, in synovial sarcoma cells in which FZD10 was overexpressed. Activation of the FZD10-Dvls-Rac1 pathway induced lamellipodia formation and enhanced anchorage-independent cell growth. FZD10 overexpression also caused the destruction of the actin cytoskeleton structure, probably through the downregulation of RhoA activity. Our results strongly imply that FZD10 transactivation causes the activation of the non-canonical Dvl-Rac1-JNK pathway and plays critical roles in the development/progression of synovial sarcomas.

(5) Pancreatic cancer

CST6 (Cystatin 6)

Pancreatic ductal adenocarcinoma (PDAC) shows the worst mortality among the common malignancies, and the development of novel therapies for PDAC through identification of good molecular targets is an urgent issue. Among dozens of over-expressed genes identified through our gene-expression profile analysis of PDAC cells, we here report CST6 (Cystatin 6 or E/M) as a candidate molecular target for PDAC treatment. Reverse transcriptase-polymerase chain reaction (RT-PCR) and immunohistochemical analysis confirmed over-expression of CST6 in PDAC cells, but no or limited expression of CST6 was observed in normal pancreas and other vital organs. Knockdown of endogenous CST6 expression by small interfering RNA attenuated PDAC cell growth, suggesting its essential role in maintaining the viability of PDAC cells. Concordantly, constitutive expression of CST6 in CST6-null cells promoted their growth in vitro and in vivo. Furthermore, the addition of mature recombinant CST6 to the culture medium also promoted cell proliferation in a dose-dependent manner, whereas recombinant CST6 lacking its proteinase-inhibitor domain and its non-glycosylated form did not. Over-expression of CST6 inhibited the intracellular activity of cathepsin B, which is one of the putative substrates of the CST6 proteinase inhibitor and can function intracellularly as a pro-apoptotic factor. These findings imply that CST6 is likely to be involved in the proliferation and survival of pancreatic cancer, probably through its proteinase inhibitory activity, and that it is a promising molecular target for the development of new therapeutic strategies for PDAC.

C2orf18 (ANT2BP)

Through our genome-wide gene expression profiles of microdissected PDAC cells, we here identified a novel gene, C2orf18, as a molecular target for PDAC treatment. Transcriptional and immunohistochemical analysis validated its overexpression in PDAC cells and limited expression in normal adult organs. Knockdown of C2orf18 by small interfering RNA in PDAC cell lines resulted in induction of apoptosis and suppression of cancer cell growth, suggesting its essential role in maintaining the viability of PDAC cells. We showed that C2orf18 was localized in the mitochondria and that it could interact with adenine nucleotide translocase 2 (ANT2), which is involved in maintenance of the mitochondrial membrane potential and energy homeostasis and has been indicated to have some roles in apoptosis. These findings implied that C2orf18, termed ANT2-binding protein (ANT2BP), might serve as a candidate molecular target for pancreatic cancer therapy.

(6) Prostate cancer

STC2 (stanniocalcin 2)

Prostate cancer is usually androgen-dependent and responds well to androgen ablation therapy based on castration. However, at a certain stage, some prostate cancers eventually acquire a castration-resistant phenotype, in which they progress aggressively and show a very poor response to any anticancer therapy. To characterize the molecular features of these clinical castration-resistant prostate cancers, we previously analyzed gene expression profiles by genome-wide cDNA microarrays combined with microdissection and found dozens of trans-activated genes in clinical castration-resistant prostate cancers. Among them, we report the identification of a new biomarker, stanniocalcin 2 (STC2), as an overexpressed gene in castration-resistant prostate cancer cells. Real-time polymerase chain reaction and immunohistochemical analysis confirmed overexpression of STC2, a 302-amino-acid glycoprotein hormone, specifically in castration-resistant prostate cancer cells and aggressive castration-naïve prostate cancers with high Gleason scores (8-10). The gene was not expressed in normal prostate or in most indolent castration-naïve prostate cancers. Knockdown of STC2 expression by short interfering RNA in a prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it should be a potential molecular target for the development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanisms involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes, in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among the 243 up-regulated transcripts, only 71 had been reported as up-regulated genes in previous microarray studies in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were very commonly overexpressed in thyroid carcinoma but were not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells by small interfering RNA. The effective down-regulation of its expression levels significantly attenuated the viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes, in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) were previously reported as up-regulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as a control. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in the sera of cancer patients, and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins in serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac(α2,6)Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, subsequent validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated a higher frequency of serum apoC-III with loss of α2,6-linkage Neu5Ac residues in lung cancer patients than in controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylation, and they imply that it may be applicable to studies of other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight patients with PR (partial response) and twelve patients with SD (stable disease) or PD (progressive disease), we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
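The evaluation criterion described above, the share of patients whose predicted dose falls within 20% of the actual stable dose, can be sketched in a few lines of Python. This is only an illustration of the metric, not the consortium's code: the function name `fraction_within_tolerance` and the dose values are hypothetical, and the dose algorithms themselves are not reproduced here.

```python
def fraction_within_tolerance(predicted, actual, tolerance=0.20):
    """Share of patients whose predicted dose lies within `tolerance`
    (expressed as a fraction) of the actual stable therapeutic dose."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one patient is required")
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)


# Hypothetical weekly doses in mg (illustrative only, not study data).
predicted_doses = [30.0, 20.0, 50.0, 35.0]
actual_doses = [35.0, 28.0, 49.0, 35.0]
print(fraction_within_tolerance(predicted_doses, actual_doses))  # 0.75
```

In this toy input, three of the four predictions are within 20% of the actual dose; the second (20 mg predicted against 28 mg actual) is not.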

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5, and Hitoshi Zembutsu5. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Department of Surgery, Tokushima Breast Care Clinic; 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgery for breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to depend on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared with patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4, and Yusuke Nakamura1,3. 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant limiting the use of the drug. To identify a genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia than those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% confidence interval [CI], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
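As a worked check of the figures above, the following Python sketch computes an odds ratio and a Woolf (log-scale) 95% confidence interval from a 2x2 table. The cell counts are not given in the text; they are back-calculated here as an assumption from the combined sample (28 + 11 = 39 cases, 56 + 18 = 74 controls) and the reported classification rates (27/39 = 69.2%, 56/74 = 75.7%). The function name `odds_ratio_with_ci` is hypothetical, and the study may have used a different interval method.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval for a 2x2 table.

    a: cases with score 1 or 2      b: cases with score 0
    c: controls with score 1 or 2   d: controls with score 0
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts inferred (assumption) from the reported percentages:
# 27 of 39 cases scored 1 or 2; 56 of 74 controls scored 0.
or_, lo, hi = odds_ratio_with_ci(a=27, b=12, c=18, d=56)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
# prints: OR = 7.00, 95% CI = 2.95-16.59
```

Under these inferred counts the sketch reproduces the reported odds ratio and interval to two decimals, which supports the decimal placement of the percentages.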

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2, and Yusuke Nakamura1. 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of 2Pathology, 3Medicine, Faculty of Medicine, 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; 5Center for International Cooperation, Department of Medical Sciences; 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele showed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10^(…)] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10^(…)). Logistic regression analysis also indicated that HLA-B*3505 was significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to use HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
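The relation between the carrier frequencies and the quoted specificity can be made explicit with a short Python sketch. It assumes the reported frequencies are 17.5% among rash cases and 1.1% among NVP-tolerant patients; the function name `marker_performance` is hypothetical. Because it starts from rounded percentages rather than the raw genotype counts, the odds ratio it returns only approximates the reported 18.96.

```python
def marker_performance(freq_cases, freq_controls):
    """Sensitivity, specificity and odds ratio from carrier frequencies.

    freq_cases: fraction of rash cases carrying the allele
    freq_controls: fraction of tolerant controls carrying the allele
    """
    if not (0 < freq_cases < 1 and 0 < freq_controls < 1):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    sensitivity = freq_cases
    specificity = 1.0 - freq_controls
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return sensitivity, specificity, odds_cases / odds_controls


sens, spec, approx_or = marker_performance(0.175, 0.011)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} OR~{approx_or:.1f}")
# prints: sensitivity=0.175 specificity=0.989 OR~19.1
```

The specificity of 98.9% is simply one minus the 1.1% carrier frequency among tolerant patients, while the low sensitivity explains why the allele can flag only a subset of rash cases.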

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4, and Koichi Matsuda1. 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; 4Center for Genomic Medicine, RIKEN; 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; 6Department of Hepatology, Toranomon Hospital; 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9, and the Pirfenidone Clinical Study Group4. 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; 5Fourth Department of Internal Medicine, Nippon Medical School; 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify a gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a P value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9, and Draga I. Toncheva5. 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many of these studies have produced controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 healthy Bulgarian controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes previously implicated in disease susceptibility and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010; odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y, and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu, and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y, and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y, and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y, and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu ET-K, Wang LD, J-LS Law, Tsao W, and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y, and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y, and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y, and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y, and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q, and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y, and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H, and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H, and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D, and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y, and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y, and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y, and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y, and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: a replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility to type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF, and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y, and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T, and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K, and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y, and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y, and Nakagawa H. Identification of C2orf18, termed ANT2BP (ANT2-binding protein), as one of the key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y, and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K, Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data-mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and we also attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data from a structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, so their sample sizes must exceed several thousand, which makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One method to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
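For orientation, the conventional genomic control correction for chi-square statistics can be sketched as follows. This is the standard textbook form, not the modified version for Fisher's exact test described above. The function name, the constant name, and the choice never to deflate statistics (inflation factor floored at 1) are assumptions made for this illustration.

```python
from statistics import median

# Approximate median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549


def genomic_control(chi2_stats):
    """Return (inflation factor, corrected statistics).

    chi2_stats: 1-d.f. chi-square statistics from many (mostly null) markers.
    The inflation factor is the observed median divided by the expected
    median; it is floored at 1.0 so that statistics are never inflated
    (an assumption of this sketch).
    """
    inflation = max(median(chi2_stats) / CHI2_1DF_MEDIAN, 1.0)
    corrected = [stat / inflation for stat in chi2_stats]
    return inflation, corrected
```

Each observed statistic is divided by the estimated inflation factor before it is compared with the chi-square distribution.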

Human Genome Center

Laboratory of Functional Genomics

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
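The kind of discrepancy in question can be seen on a single table. The sketch below is an illustration under stated assumptions, not the procedure used in this work: it takes the Pearson chi-square test with 2 degrees of freedom as the asymptotic test and a Fisher-Freeman-Halton style enumeration as the exact test. The function names are hypothetical, and all column totals are assumed to be non-zero.

```python
from math import comb, exp


def chi2_pvalue_2x3(table):
    """Asymptotic Pearson chi-square p-value for a 2x3 table (2 d.f.)."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(3)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return exp(-stat / 2.0)


def exact_pvalue_2x3(table):
    """Exact p-value: total probability of all tables with the same margins
    that are no more probable than the observed one."""
    r1 = sum(table[0])
    cols = [table[0][j] + table[1][j] for j in range(3)]
    n = sum(cols)

    def prob(x, y):
        # Multivariate hypergeometric probability of first row (x, y, r1-x-y).
        z = r1 - x - y
        return comb(cols[0], x) * comb(cols[1], y) * comb(cols[2], z) / comb(n, r1)

    p_obs = prob(table[0][0], table[0][1])
    total = 0.0
    for x in range(min(cols[0], r1) + 1):
        for y in range(min(cols[1], r1 - x) + 1):
            if r1 - x - y > cols[2]:
                continue
            p = prob(x, y)
            if p <= p_obs * (1 + 1e-9):  # small tolerance for float ties
                total += p
    return total
```

For the sparse table [[2, 0, 0], [0, 1, 1]], the asymptotic p-value is exp(-2), about 0.135, whereas the exact p-value is 1/3.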

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population and LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pair-wise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in support of various research based on genetic heterogeneity.
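As a point of reference for the pair-wise measures mentioned above, the conventional LD statistics D, D' and r² for two diallelic markers can be computed from phased haplotypes as sketched below. This shows only the standard pair-wise quantities, not the multi-marker method under development. The function name and the 0/1 allele coding are assumptions made for this example.

```python
def pairwise_ld(haplotypes):
    """Return (D, D', r^2) for two diallelic markers.

    haplotypes: list of (a, b) pairs, each allele coded 0 or 1, one pair
    per phased chromosome (illustrative input format).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n        # frequency of allele 1, marker A
    p_b = sum(b for _, b in haplotypes) / n        # frequency of allele 1, marker B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                           # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0      # signed D'
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2
```

Two markers whose alleles always travel together give D' = 1 and r² = 1, while four equally frequent haplotypes give D = 0.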

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from a combinatorial standpoint and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and The Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.


Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1273-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role, represented by networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing all whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and their positioning, both relative to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are developing the model further, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe², and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation patterns between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
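The two quantities behind "G+C- and CpG-rich" are commonly measured as the G+C fraction and the observed/expected CpG ratio of a sequence window. The sketch below shows these conventional definitions only; it is not the island definition derived in this study, and the function names are assumptions made for the example.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    return seq.count("CG") * len(seq) / (n_c * n_g)
```

Sliding such a window along the region around each TSS would give a profile of both measures as a function of distance from the TSS.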

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³, and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from distinguishing direct from indirect regulation. In this study, we are computationally detecting the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we utilized position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but that have orthologues with known binding sites, we are predicting the sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make up the building blocks of the body.
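The scanning step described above can be sketched as follows. This is a minimal illustration that assumes a log-odds matrix and a fixed threshold are already given. The ORI-based threshold optimization is not reproduced, and the function name, the matrix layout (one dictionary per position), and the restriction to A/C/G/T sequences are assumptions of this example.

```python
def scan_promoter(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at least `threshold`.

    sequence: promoter sequence containing only A, C, G, T.
    pwm: list of dicts, one per motif position, mapping base -> log-odds score.
    """
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits
```

Raising the threshold trades sensitivity for specificity, which is why the threshold has to be tuned for each matrix.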

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴, and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
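How a pseudocount enters each matrix element can be written out as a short sketch. It follows the description above: the pseudocount is split according to the background composition and added to the observed counts before the log ratio is taken. The function name, the base-2 logarithm, and the uniform default background are assumptions made for this example.

```python
from math import log2


def pwm_from_counts(counts, pseudocount=0.8, background=None):
    """Convert per-position base counts into log-odds scores.

    counts: list of dicts, one per position, mapping base -> observed count.
    Each element is log2(((n_b + pseudocount * q_b) / (N + pseudocount)) / q_b),
    where q_b is the background frequency and N the number of sites.
    """
    if background is None:
        background = {base: 0.25 for base in "ACGT"}  # assumed uniform background
    pwm = []
    for column in counts:
        total = sum(column[base] for base in "ACGT")
        pwm.append({
            base: log2(
                (column[base] + pseudocount * background[base])
                / (total + pseudocount)
                / background[base]
            )
            for base in "ACGT"
        })
    return pwm
```

With four sites that all carry A at a position and a pseudocount of 0.8, the unseen bases receive a finite score of log2(1/6) instead of minus infinity.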

7. Definition and analysis of alternative promoters using a large amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, and Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of the discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed 96 of difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
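The ranking of promoters by tag count can be illustrated with a small sketch. The definition of '1st' and '2nd' promoter follows the text. The measure of discrepancy used here (the fraction of shared genes whose ordered pair of top promoters differs between two samples) is only an assumed stand-in for the measure used in the analysis, and the function names and input layout are hypothetical.

```python
def top_two_promoters(tag_counts):
    """Return (1st promoter, 2nd promoter) for one gene in one sample.

    tag_counts: dict mapping promoter id -> number of TSS tags.
    Ties are broken by promoter id so that the result is deterministic.
    """
    ranked = sorted(tag_counts, key=lambda p: (-tag_counts[p], p))
    first = ranked[0] if ranked else None
    second = ranked[1] if len(ranked) > 1 else None
    return first, second


def discrepancy_rate(sample_a, sample_b):
    """Fraction of shared genes whose (1st, 2nd) promoter pair differs.

    sample_a, sample_b: dicts mapping gene -> {promoter id: tag count}.
    """
    genes = set(sample_a) & set(sample_b)
    if not genes:
        return 0.0
    differing = sum(
        1 for gene in genes
        if top_two_promoters(sample_a[gene]) != top_two_promoters(sample_b[gene])
    )
    return differing / len(genes)
```

Comparing pairs of samples from different cell types and pairs from the same cell type under different conditions would then give two rates that can be set side by side.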

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita, and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for analyses of transcription and replication. It has been reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in their vicinity. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
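The idea of a Fourier analysis of dinucleotide periodicity can be sketched as follows. The sequence is turned into a 0/1 signal marking where chosen dinucleotides start, and the spectral power of that signal is evaluated at a given period. This is a simplified illustration, not the analysis pipeline used in this study; the function name, the default dinucleotide set, and the normalization are assumptions.

```python
import cmath


def periodicity_power(seq, period, dinucleotides=("AA", "TT", "TA")):
    """Spectral power of dinucleotide occurrences at the given period (in bp)."""
    seq = seq.upper()
    # 1.0 where one of the chosen dinucleotides starts, 0.0 elsewhere.
    signal = [1.0 if seq[i:i + 2] in dinucleotides else 0.0
              for i in range(len(seq) - 1)]
    mean = sum(signal) / len(signal)
    # Single-frequency discrete Fourier transform of the mean-centred signal.
    total = sum((x - mean) * cmath.exp(-2j * cmath.pi * i / period)
                for i, x in enumerate(signal))
    return abs(total) ** 2 / len(signal)
```

Evaluating the function over a range of periods, before and after masking repeat elements, would show which peaks depend on the masked elements.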

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal set of cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of conserved pathway maps and organism-specific pathway maps. The conserved pathway maps are those that contain more orthologous genes among all pathway maps, and they are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells are. Therefore, the organism-specific pathway maps are detected as those that can synthesize the compounds required by the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, and K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida, and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify its interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, taking multiple complexes into consideration at the same time by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.
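The mapping of binding sites from several complexes onto one protein can be pictured with a small sketch. It assumes that the interface residues of each complex have already been determined and only shows the merging step. The function name, the input layout, and the identifiers in the usage example are made up for illustration and do not reflect the PiSite data format.

```python
from collections import defaultdict


def merge_interaction_sites(observations):
    """Map each residue to the sorted list of partners that contact it.

    observations: iterable of (complex_id, partner, residue_numbers) tuples,
    all describing the same protein in different complexes.
    """
    partners_by_residue = defaultdict(set)
    for _complex_id, partner, residues in observations:
        for residue in residues:
            partners_by_residue[residue].add(partner)
    return {residue: sorted(partners)
            for residue, partners in sorted(partners_by_residue.items())}


# Hypothetical example: one protein observed with two different partners.
sites = merge_interaction_sites([
    ("complex_1", "B", {10, 11}),
    ("complex_2", "C", {11, 12}),
])
```

Residues that appear with more than one partner, such as residue 11 in the example, are the ones where a single complex would give an incomplete picture.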

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷, and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential, and shape on the protein surfaces and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida, and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, and Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, a ribonuclease that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression in higher mammals are rather limited. We have therefore constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the GO or tissue networks became too large for a static picture on the web, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate their conservation between the two species.
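As a rough illustration of how a list of highly coexpressed genes can be derived for a query gene, the sketch below ranks genes by the Pearson correlation of their expression profiles. The choice of Pearson correlation, the gene names, and the expression values are assumptions made for this example; the text does not state which coexpression measure COXPRESdb uses.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def top_coexpressed(expression, query, k=2):
    """Return the k genes whose profiles correlate best with the query gene."""
    scored = [
        (pearson(expression[query], profile), gene)
        for gene, profile in expression.items()
        if gene != query
    ]
    scored.sort(reverse=True)
    return [gene for _, gene in scored[:k]]


# Hypothetical expression values across four samples (toy data).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],  # rises together with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
    "geneD": [1.0, 3.0, 2.0, 4.0],  # partly follows geneA
}

neighbours = top_coexpressed(expression, "geneA", k=2)
```

Here `neighbours` lists geneB first and geneD second, while the anti-correlated geneC is ranked last. Connecting each gene to its top-ranked neighbours in this way yields a coexpression network of the first type described above.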

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental to both protein functions and membrane properties. Through these interactions, suitable configurations of membrane molecules can generate heterogeneity in the membrane, such as lipid rafts and transportsome regions. To reveal the bidirectional influences between proteins and the surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol and compared the resulting trajectories. Alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of its hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also under way.

Publications

Chiba, H., Yamashita, R., Kinoshita, K. and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R. and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K. and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T. and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K. and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y. and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T. and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M. and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H. and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H. and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M. and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R. and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T. and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H. and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H. and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T. and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S. and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) genes and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H. and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

Ikura, T., Kinoshita, K. and Ito, N. Structure-function relationship of peptidyl-prolyl isomerases. Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid and Enzyme) 54: 167-172, 2009 (in Japanese).

Kinoshita, K. Protein function prediction from three-dimensional structures: current status and prospects. Idenshi Igaku MOOK No. 14, in press (in Japanese).

Nakai, K. and Horton, P. Chapter 3, Section 3: Prediction of protein subcellular localization based on amino acid sequences. Jikken Igaku (Experimental Medicine), supplementary issue, vol. 26: 1106-1112, 2008 (in Japanese).

Nakai, K. Systems biology of proteins. In: Tanpakushitsu no Jiten (Encyclopedia of Proteins), edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura and Hamakubo (Asakura Shoten), 575-578, 2008 (in Japanese).

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

The Department of Public Policy has three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research and regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal, and social standpoints and completed the first questionnaire survey. We held the SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine the broader social and cultural agendas around the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulation between South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and the unethical collection of human eggs, the bioethics law in South Korea was revised, and the government is seeking stricter regulation of life science and healthcare. We found some correlations among political options on stem cell research and genetic testing in terms of regulations in East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After surveying past ethical reviews and conducting a comparative study of research ethics review systems in the US, the UK, and South Korea, we examined the current problems that tend to stall a smooth review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review of the research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government's ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about constructing a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives to participate and for preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate the MCs' attitudes towards them highly. Most MCs also responded that they had made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project has no plans to disclose personal genotype data to each participant, but a certain number of participants responded that they now want to see their own genotype data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget that they gave consent for research, MCs explain the project to them, encourage them, and thank them each time, and participants recall their willingness to contribute.

To show our appreciation of participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current BioBank newsletters are accessible only to people with good eyesight, we are working to secure personalized access to information that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. The intent of "SciArt Café" is to promote communication between scientists and those who have no regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing the 3rd session for early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T. and Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

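2. Hong, H. "Sexual protection" of children and measures to prevent sex crimes in Korean society. Hikakuho Kenkyu (Comparative Law Journal) No. 70, 2009, in press (in Japanese).

3. Kamisato, A. and Narisawa, A. (eds. and authors). Assisted Reproductive Technology. Bioethics and Law: Basic Materials, vol. 3. Shinzansha, 21-123, 262-308, 2008 (in Japanese).

4. Chang, C.-F. Regulation and practice of assisted reproductive technology in other countries (Taiwan). In: Assisted Reproductive Technology. Bioethics and Law: Basic Materials, vol. 3, edited by Kamisato, A. and Narisawa, A. Shinzansha, 323-334, 2008 (in Japanese).

5. Oue, Y., Kamisato, A. and Shiroyama, H. Comparative analysis of animal experimentation regulation in the UK and the US: implications for the regulatory system in Japan. Shakai Gijutsu Kenkyu Ronbunshu No. 5: 132-142, 2008 (in Japanese).

6. Oue, Y., Narihiro, T., Kamisato, A., Shiroyama, H. and Uchikoshi, A. Attitudes of life science researchers and engineers in Japan toward animal experiments: analysis of a questionnaire survey on life science experiments and memorial services for animals. Hito to Dobutsu no Kankei Gakkaishi No. 20: 66-73, 2008 (in Japanese).

7. Oue, Y., Kamisato, A. and Shiroyama, H. The mode of thinking that underpins the regulation of animal experiments in the UK. Kagaku Gijutsu Shakairon Kenkyu No. 5: 84-92, 2008 (in Japanese).

8. Watanabe, M. and Ueda, A. Science and technology that meet human needs: an analysis of development sites in welfare engineering. Kagaku Gijutsu Shakai Kenkyu, 138-151, 2008 (in Japanese).

9. Muto, K. Japanese and US responses to "demedicalized" predictive genetic testing: from genetic diseases to nutrigenetic testing. In: Medical Care in Japan and the US: Systems and Ethics, edited by Sugita, Y. Osaka University Press, 203-224, 2008 (in Japanese).

10. Muto, K. Is DNA parentage testing a remedy for "loose" women? In: Frontiers of Gender Studies, vol. 4, Techno/Bio-Politics: Science, Medicine and Technology Today, edited by Tachi, K. Sakuhinsha, 238-264, 2008 (in Japanese).

11. Hong, H. What is the problem with egg donation for research? Focusing on the Hwang Woo-suk paper fabrication case in South Korea. In: Frontiers of Gender Studies, vol. 4, Techno/Bio-Politics: Science, Medicine and Technology Today, edited by Tachi, K. Sakuhinsha, 196-214, 2008 (in Japanese).

12. Chang, C.-F. Reproductive technology and Taiwanese society. In: Frontiers of Gender Studies, vol. 4, Techno/Bio-Politics: Science, Medicine and Technology Today, edited by Tachi, K. Sakuhinsha, 215-222, 2008 (in Japanese).

13. Mimura, K., Kokado, M., Muto, K., Chang, C.-F., Hong, H. and Tsuge, A. How woman-friendly machines are made: the case of the gynecological examination table. In: Frontiers of Gender Studies, vol. 4, Techno/Bio-Politics: Science, Medicine and Technology Today, edited by Tachi, K. Sakuhinsha, 223-240, 2008 (in Japanese).

14. Kamisato, A. The debate over assisted reproductive technology: retrospect and prospects. In: Reproductive Technology and the Family, edited by Ienaga, N. Waseda University Press, 42-71, 2008 (in Japanese).

15. Watanabe, M. and Ueda, A. (eds. and trans.). The Enhancement Debate: Augmentation of Body and Mind and Advanced Science and Technology. Shakai Hyoronsha, 2008 (in Japanese).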


prostate cancer cell line resulted in drastic attenuation of prostate cancer cell growth. Concordantly, STC2 overexpression in a prostate cancer cell line promoted prostate cancer cell growth, indicating its oncogenic property. These findings suggest that STC2 could be involved in the aggressive phenotype of prostate cancers, including castration-resistant prostate cancers, and that it is a potential molecular target for the development of new therapeutics and a diagnostic biomarker for aggressive prostate cancers.

(7) Thyroid cancer

In order to clarify the molecular mechanisms involved in thyroid carcinogenesis and to identify candidate molecular targets for diagnosis and treatment, we analyzed genome-wide gene expression profiles of 18 papillary thyroid carcinomas with a microarray representing 38,500 genes in combination with laser microbeam microdissection. We identified 243 transcripts that were commonly up-regulated and 138 transcripts that were down-regulated in thyroid carcinoma. Among the 243 up-regulated transcripts, only 71 had been reported as up-regulated genes in previous microarray studies, in which bulk cancer tissues and normal thyroid tissues were used for the analysis. We further selected genes that were very commonly overexpressed in thyroid carcinoma but not expressed in the normal human tissues examined. Among them, we focused on the regulator of G-protein signaling 4 (RGS4) and knocked down its expression in thyroid cancer cells with small interfering RNA. Effective down-regulation of RGS4 significantly attenuated the viability of thyroid cancer cells, indicating a significant role of RGS4 in thyroid carcinogenesis. Our data should be helpful for a better understanding of the tumorigenesis of thyroid cancer and could contribute to the development of diagnostic tumor markers and molecular-targeting therapy for patients with thyroid cancer.

(8) Ovarian cancer

We aimed to clarify the molecular mechanisms involved in ovarian carcinogenesis and to identify candidate molecular targets for its diagnosis and treatment. The genome-wide gene expression profiles of 22 epithelial ovarian carcinomas were analyzed with a microarray representing 38,500 genes in combination with laser microbeam microdissection. A total of 273 commonly up-regulated transcripts and 387 down-regulated transcripts were identified in the ovarian carcinoma samples. Of the 273 up-regulated transcripts, only 87 (31.9%) had previously been reported as up-regulated in microarray studies using bulk cancer tissues and normal ovarian tissues for analysis. CHMP4C (chromatin-modifying protein 4C) was frequently overexpressed in ovarian carcinoma tissue but not expressed in the normal human tissues used as controls. Our data should contribute to an improved understanding of tumorigenesis in ovarian cancer and aid in the development of diagnostic tumor markers and molecular-targeting therapy for patients with the disease.

(9) Proteomics

To screen for glycoproteins showing aberrant sialylation patterns in the sera of cancer patients and to apply such information to biomarker identification, we performed SELDI-TOF MS analysis coupled with lectin-coupled ProteinChip arrays (Jacalin or SNA) using sera obtained from lung cancer patients and control individuals. Our approach consisted of three processes: (1) removal of 14 abundant proteins from serum, (2) enrichment of glycoproteins with lectin-coupled ProteinChip arrays, and (3) SELDI-TOF MS analysis with an acidic glycoprotein-compatible matrix. We identified 41 protein peaks showing significant differences (P<0.05) in peak levels between the cancer and control groups using the Jacalin- and SNA-ProteinChips. Among them, we identified loss of the Neu5Ac(α2,6)Gal/GalNAc structure in apolipoprotein C-III (apoC-III) in cancer patients through subsequent MALDI-QIT-TOF MS/MS. Furthermore, validation experiments using an additional set of 60 lung adenocarcinoma patients and 30 normal controls demonstrated a higher frequency of serum apoC-III with loss of α2,6-linked Neu5Ac residues in lung cancer patients than in controls. Our results demonstrate that lectin-coupled ProteinChip technology allows high-throughput and specific recognition of cancer-associated aberrant glycosylation and suggest that it may be applicable to studies of other diseases.

(10) Chemosensitivity

Breast Cancer

Neoadjuvant chemotherapy with docetaxel for advanced breast cancer can improve the radicality for a subset of patients, but some patients suffer from severe adverse drug reactions without any benefit. To establish a method for predicting responses to docetaxel, we analyzed gene expression profiles of biopsy materials from 29 advanced breast cancers using a cDNA microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight patients with PR (partial response) and twelve patients with SD (stable disease) or PD (progressive disease), we identified dozens of genes that were expressed differently between the 'responder (PR)' and 'non-responder (SD or PD)' groups. We further selected the nine 'predictive' genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that had been reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted from the expression patterns of this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a large and diverse population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week, and of those who required 49 mg or more per week, to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
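The evaluation criterion described above, the share of patients whose predicted dose falls within 20% of the actual stable dose, can be sketched as follows. The dose values and function names are hypothetical and serve only to illustrate the calculation; the dose groups use the 21 mg and 49 mg per week cut-offs given in the text.

```python
def fraction_within(predicted, actual, tolerance=0.20):
    """Share of patients whose predicted weekly dose lies within
    `tolerance` (as a fraction) of the actual stable dose."""
    hits = sum(
        1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)


def dose_group(weekly_mg):
    """Assign a patient to the low, intermediate, or high dose group."""
    if weekly_mg <= 21.0:
        return "low"
    if weekly_mg >= 49.0:
        return "high"
    return "intermediate"


# Hypothetical weekly doses in mg for four patients (toy data).
actual = [21.0, 35.0, 49.0, 28.0]
predicted = [24.0, 30.0, 60.0, 28.5]

share = fraction_within(predicted, actual)   # 3 of 4 predictions qualify
groups = [dose_group(a) for a in actual]
```

Computing `fraction_within` separately for the patients in the "low" and "high" groups corresponds to the subgroup comparison between the clinical and pharmacogenetic algorithms reported above.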

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Mitsunori Sasa², Yoshimi Bando³, Ikuko Sumitomo², Naoya Hosono⁴, Michiaki Kubo⁴, Yusuke Nakamura¹,⁵ and Hitoshi Zembutsu⁵: ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Department of Surgery, Tokushima Breast Care Clinic; ³Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for the clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgery for breast cancer. Among the 67 patients examined, those homozygous for the CYP2D6*10 allele showed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio 16.63; 95% confidence interval 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence appeared to depend on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazards analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio 10.04; 95% confidence interval 1.17-86.27) than patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Michiaki Kubo², Hitoshi Zembutsu³, Yuichi Sugiyama⁴ and Yusuke Nakamura¹,³: ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ⁴Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important factor limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who had received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with grade 3 leukopenia/neutropenia (P<0.05), we further examined these 10 SNPs in 29 additionally obtained patients, 11 with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia than those with a score of 0 (P=0.0000057; odds ratio [OR] 7.00; 95% confidence interval [CI] 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia cases and 75.7% of non-leukopenia/neutropenia cases into their respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
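The odds ratio and its confidence interval for the score-based classification can be checked with a short calculation. The 2×2 counts below are not given in the text; they are back-calculated, as an assumption, from the reported classification rates and the combined sample sizes (39 patients with severe toxicity and 74 without). The interval uses the standard log-odds (Woolf) approximation, which is likewise an assumption about the method used.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf-type 95% CI for a 2x2 table.

    a: cases with score >= 1      b: cases with score 0
    c: controls with score >= 1   d: controls with score 0
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Counts inferred from the reported rates (an assumption, see above):
# 27 of 39 cases scored 1 or 2; 56 of 74 controls scored 0.
cases_high, cases_zero = 27, 12
controls_high, controls_zero = 18, 56

odds_ratio, lower, upper = odds_ratio_ci(
    cases_high, cases_zero, controls_high, controls_zero
)
sensitivity = cases_high / (cases_high + cases_zero)
specificity = controls_zero / (controls_high + controls_zero)
```

With these inferred counts the calculation gives an odds ratio of 7.00 with an interval of about 2.95 to 16.59, and classification rates of about 69.2% and 75.7%, in agreement with the figures reported above.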

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu¹,², Taisei Mushiroda¹, Surakameth Mahasirimongkol⁵, Sasisopin Kiertiburanakul³, Somnuek Sungkanuparph³, Weerawat Manosuthi⁶, Woraphot Tantisiriwat⁷, Angkana Charoenyingwattana⁴, Thanyachai Sura³, Wasun Chantratita² and Yusuke Nakamura¹: ¹Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of ²Pathology and ³Medicine, Faculty of Medicine, and ⁴Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; ⁵Center for International Cooperation, Department of Medical Sciences, and ⁶Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; ⁷Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated the possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified the HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele showed a significant association with NVP-induced skin rash in both the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10^[…]] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10^[…]). Logistic regression analysis also indicated that HLA-B*3505 is significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, HLA-B*3505 could be used as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani¹,², Sukanya Wattanapokayakit³, Hidenori Ochi⁴,⁵, Takahisa Kawaguchi⁴, Atsushi Takahashi⁴, Naoya Hosono⁴, Michiaki Kubo⁴, Tatsuhiko Tsunoda⁴, Naoyuki Kamatani⁴, Hiromitsu Kumada⁶, Aekkachai Puseenam⁷, Thanyachai Sura⁷, Yataro Daigo², Kazuaki Chayama⁴,⁵, Wasun Chantratita⁸, Yusuke Nakamura¹,⁴ and Koichi Matsuda¹: ¹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ²Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; ³Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; ⁴Center for Genomic Medicine, RIKEN; ⁵Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; ⁶Department of Hepatology, Toranomon Hospital; ⁷Department of Medicine, Faculty of Medicine, and ⁸Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P = 6.34 × 10⁻³⁹ and 2.31 × 10⁻³⁸; OR = 0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR = 1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR = 0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.
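As an illustration of how an odds ratio such as those quoted above is obtained from case-control counts, the following minimal sketch computes an OR and a Woolf (log-based) 95% confidence interval from a 2×2 table. The function name and the counts in the usage note are hypothetical; they are not taken from the study, and the report does not state which interval method the authors used.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Return (OR, lower, upper) for a 2x2 case-control table.

    Uses the Woolf approximation: the standard error of log(OR) is the
    square root of the sum of reciprocal cell counts. All four counts
    are assumed to be positive.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, with hypothetical counts `odds_ratio_ci(20, 80, 10, 90)` the point estimate is (20 × 90) / (80 × 10) = 2.25; an OR below 1, as for the protective haplotypes, arises when carriers are rarer among cases than among controls.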

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda¹, Sukanya Wattanapokayakit², Atsushi Takahashi³, Toshihiro Nukiwa⁴, Shoji Kudoh⁵, Takashi Ogura⁶, Hiroyuki Taniguchi⁷, Michiaki Kubo⁸, Naoyuki Kamatani³, Yusuke Nakamura¹,⁹ and the Pirfenidone Clinical Study Group⁴: ¹Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); ⁴Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; ⁵Fourth Department of Internal Medicine, Nippon Medical School; ⁶Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; ⁷Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; ⁸Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); ⁹Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify a gene or genes conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a p value of 2.9 × 10⁻⁸ (GWA, 2.8 × 10⁻⁶; replication, 3.6 × 10⁻³). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva¹, Taisei Mushiroda², Atsushi Takahashi³, Michiaki Kubo⁴, Sena K. Karachanak⁵, Irina T. Zaharieva⁶, Radoslava V. Vazharova⁵, Ivanka I. Dimova⁵, Vihra K. Milanova⁶, Todor Tolev⁷, George Kirov⁸, Michael J. Owen⁸, Michael C. O'Donovan⁸, Naoyuki Kamatani³, Yusuke Nakamura⁹ and Draga I. Toncheva⁵: ¹Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; ⁶Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; ⁷Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; ⁸Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; ⁹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many studies have produced controversial and inconclusive results. However, it has been determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication association study using 255 Bulgarian patients with schizophrenia or schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes previously implicated in disease susceptibility and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P = 0.0010, odds ratio = 1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining in their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models and study design. We first characterize each problem and propose a solution to it individually, and we also attempt to improve the interpretation of GWA study data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, which means that their sample sizes should run to thousands or more; this in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false-positive results. One method to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
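For orientation, the conventional genomic control correction for 1-df chi-square statistics can be sketched as follows. This shows only the standard chi-square version that the paragraph names as the starting point, not the laboratory's modification for Fisher's exact test, which the report does not describe. The function name is hypothetical, and clamping the inflation factor at 1 is a common convention assumed here.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (inflation factor lambda, corrected statistics).

    lambda is the observed median statistic divided by the median
    expected under the null; each statistic is divided by lambda.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # assumption: never inflate the statistics
    return lam, [x / lam for x in chi2_stats]
```

In a real study the statistics would come from a large set of markers presumed to be unassociated with the trait, so that the median reflects inflation from population structure rather than true signal.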

Human Genome Center

Laboratory of Functional Genomics (ゲノム機能解析分野)

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
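The discrepancy between the two kinds of test can be seen on a small example. The sketch below is an illustration under stated assumptions, not the laboratory's method: it takes the asymptotic test to be Pearson's chi-square with 2 degrees of freedom and the exact test to be the Freeman-Halton extension of Fisher's exact test (summing the probabilities of all tables with the same margins that are no more probable than the observed one). Function names are hypothetical, and all margins are assumed to be positive.

```python
from math import comb, exp


def pearson_chi2_2x3(table):
    """Pearson chi-square statistic and asymptotic p-value (2 df)."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(3)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return chi2, exp(-chi2 / 2.0)


def exact_p_2x3(table):
    """Exact p-value for a 2x3 table with fixed margins."""
    (a, b, c), (d, e, f) = table
    r1 = a + b + c
    c1, c2, c3 = a + d, b + e, c + f

    def weight(x, y, z):
        # Unnormalised multivariate hypergeometric probability (an integer).
        return comb(c1, x) * comb(c2, y) * comb(c3, z)

    observed = weight(a, b, c)
    extreme = 0
    for x in range(min(c1, r1) + 1):
        for y in range(min(c2, r1 - x) + 1):
            z = r1 - x - y
            if z <= c3 and weight(x, y, z) <= observed:
                extreme += weight(x, y, z)
    return extreme / comb(c1 + c2 + c3, r1)
```

For the tiny table `[[2, 0, 0], [0, 1, 1]]` the chi-square statistic is 4, giving an asymptotic p-value of exp(-2), about 0.135, whereas the exact p-value is 2/6, about 0.333; with small expected counts the asymptotic approximation can be far from the exact value.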

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests fit for biological models by interpreting the tables geometrically.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize its heterogeneity. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which exploits the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, to support various kinds of research based on genetic heterogeneity.
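For reference, the conventional pairwise measures that the paragraph contrasts with the multi-marker approach can be computed as in the sketch below. It covers only the standard two-marker statistics D, D' and r², not the new multi-marker method, which the report does not specify. The function and parameter names are hypothetical, and both allele frequencies are assumed to lie strictly between 0 and 1.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first marker
    p_b:  frequency of allele B at the second marker
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2
```

With `pairwise_ld(0.5, 0.5, 0.5)` the two alleles always occur together, so D = 0.25 and both D' and r² equal 1; with `pairwise_ld(0.25, 0.5, 0.5)` the markers are independent and all three values are 0.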

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorial theory and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides the development of new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups in and out of the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, for the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation (DBTGR) was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井謙太
准教授 理学博士 木下賢吾

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and on their positioning with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2 and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
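As an illustration of how G+C and CpG richness around a TSS can be quantified, the following is a minimal sketch that computes the G+C fraction and the widely used observed/expected CpG ratio, (number of CpG × N) / (number of C × number of G), in sliding windows. The report does not state which criteria were actually applied to the ascidian promoters, so the function names, window size and step below are assumptions chosen only for the example.

```python
def gc_content(seq):
    """Fraction of G or C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact number of CpGs.
    return seq.count("CG") * len(seq) / (c * g)


def profile(seq, window=100, step=50):
    """Return (start, G+C fraction, CpG o/e) for sliding windows.

    window and step are hypothetical defaults, not values from the report.
    """
    rows = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        rows.append((start, gc_content(chunk), cpg_obs_exp(chunk)))
    return rows
```

Applied to sequence centered on a TSS, a profile of this kind would show how far from the TSS the enrichment extends, which is the property the paragraph above describes as narrower in ascidians.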

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3 and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from indicating whether a given regulation is direct or indirect. In this study, we are computationally detecting the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we are predicting the binding sites by examining those orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establishes the body plan but also the peripheral components that actually make up the building blocks of the body.
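The scanning step described above can be sketched as follows. This is only an illustrative outline, assuming that a position weight matrix is stored as a list of per-position dictionaries of scores and that promoters contain only A, C, G and T; the function names are hypothetical, and the ORI-based choice of threshold is not reproduced because the report does not give its formula.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string made of A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def score(pwm, site):
    """Sum of per-position scores of `site` under the matrix."""
    return sum(column[base] for column, base in zip(pwm, site))


def scan(pwm, promoter, threshold):
    """Return (position, strand, score) for every window scoring >= threshold.

    Positions are reported in forward-strand coordinates for both strands.
    """
    width = len(pwm)
    hits = []
    for strand, seq in (("+", promoter), ("-", reverse_complement(promoter))):
        for i in range(len(seq) - width + 1):
            s = score(pwm, seq[i:i + width])
            if s >= threshold:
                pos = i if strand == "+" else len(seq) - width - i
                hits.append((pos, strand, s))
    return hits


def predicted_targets(pwm, promoters, threshold):
    """Genes whose promoter has at least one hit; promoters maps gene -> sequence."""
    return sorted(gene for gene, seq in promoters.items()
                  if scan(pwm, seq, threshold))
```

Lowering the threshold increases the number of predicted targets at the cost of more false positives, which is why a principled way of choosing it, such as the ORI maximization mentioned above, is needed.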

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith4 and Kenta Nakai: 4CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as the log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on the observed nucleotide frequencies in a public PWM database, and then compared the generated matrix, with an added pseudocount value, to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
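The role of the pseudocount can be made concrete with a small sketch. It assumes the common formulation in which the pseudocount is split among the four bases in proportion to the background composition before taking the log likelihood ratio; the function name and the data layout (one dictionary of base counts per position) are assumptions for illustration, while the default of 0.8 is the value suggested above.

```python
import math


def pwm_from_counts(counts, background, pseudocount=0.8):
    """Convert per-position base counts into log2 likelihood ratios.

    counts:     list of dicts, one per position, mapping base -> observed count
    background: dict mapping base -> background frequency (summing to 1)
    """
    pwm = []
    for column in counts:
        n = sum(column.values())
        row = {}
        for base, bg in background.items():
            # The pseudocount is shared out according to the background composition.
            freq = (column.get(base, 0) + pseudocount * bg) / (n + pseudocount)
            row[base] = math.log2(freq / bg)
        pwm.append(row)
    return pwm
```

With a pseudocount of zero, any base that was never observed at a position would lead to the logarithm of zero, so a positive value is what keeps every element finite when only a few binding sites are known.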

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1 and Kenta Nakai

To support transcriptional studies, we have constructed the DataBase of Transcriptional Start Sites (DBTSS; http://dbtss.hgc.jp/), which includes a large number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the ‘1st promoter’ and the promoter with the second largest number as the ‘2nd promoter’. Between different cell types, the average percentage of discrepancy for the 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of the 1st promoters.
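The definition of the 1st and 2nd promoters lends itself to a short sketch. The report does not spell out how the discrepancy percentage was computed, so the version below makes an explicit assumption: a gene counts as discrepant between two samples when its ordered (1st, 2nd) promoter pair differs. The function names and the input layout are hypothetical.

```python
def rank_promoters(tag_counts):
    """Map gene -> (1st promoter, 2nd promoter) by descending tag count.

    tag_counts: dict gene -> dict promoter_id -> number of tags.
    Genes with fewer than two promoters are skipped.
    """
    ranked = {}
    for gene, promoters in tag_counts.items():
        if len(promoters) < 2:
            continue
        # Ties are broken by promoter id so that the result is deterministic.
        order = sorted(promoters, key=lambda p: (-promoters[p], p))
        ranked[gene] = (order[0], order[1])
    return ranked


def discrepancy_percent(sample_a, sample_b):
    """Percentage of shared multi-promoter genes whose (1st, 2nd) pair differs."""
    a, b = rank_promoters(sample_a), rank_promoters(sample_b)
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if a[gene] != b[gene])
    return 100.0 * differing / len(shared)
```

Averaging such a percentage over pairs of cell types, and separately over pairs of conditions within one cell type, would give two figures comparable in spirit to those quoted above.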

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for analyses of transcription and replication. It has previously been reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common to 16 eukaryotes, but that two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
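A periodicity analysis of this kind can be illustrated with a minimal sketch. It assumes that the sequence is first converted into a 0/1 signal marking occurrences of chosen dinucleotides and that the power at a given period is then read from a discrete Fourier sum; the choice of AA/TT, the function names and the normalization are assumptions, since the report does not describe the exact procedure.

```python
import cmath


def indicator(seq, dinucs=("AA", "TT")):
    """1.0 where a chosen dinucleotide starts, else 0.0 (length len(seq) - 1)."""
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] in dinucs else 0.0 for i in range(len(seq) - 1)]


def power_at_period(signal, period):
    """Power of the mean-centered signal at the given period (in bp)."""
    n = len(signal)
    if n == 0:
        return 0.0
    mean = sum(signal) / n
    total = sum((x - mean) * cmath.exp(-2j * cmath.pi * i / period)
                for i, x in enumerate(signal))
    return abs(total) ** 2 / n


def spectrum(signal, periods):
    """Power for each candidate period, e.g. 10, 84 and 167 bp."""
    return {p: power_at_period(signal, p) for p in periods}
```

Comparing such spectra before and after replacing annotated repeats with a masking character is one simple way to check whether a peak is driven by repeat elements, in the spirit of the Alu masking step described above.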

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has mostly been estimated by reducing the genome of a living cell. However, the “minimal gene set” obtained by this approach may be inaccurate due to the effects of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells are. Therefore, the organism-specific pathway maps are detected as those that can synthesize the compounds required by the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda5 and K. Kinoshita: 5National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us clues for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices elucidated different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi6, M. Saeki6, H. Ohta6 and K. Kinoshita: 6Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.
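The idea of mapping binding sites from several complexes onto one protein can be sketched in a few lines. This is not the PiSite implementation; it only assumes that, for each complex, the interface residues of the protein of interest have already been determined, and it merges them into a per-residue list of partners. The function names and the tuple layout are hypothetical.

```python
from collections import defaultdict


def merge_binding_sites(complexes):
    """Combine interface residues observed in several complexes of one protein.

    complexes: iterable of (complex_id, partner_name, residue_numbers).
    Returns a dict residue_number -> sorted list of partners contacting it.
    """
    partners_by_residue = defaultdict(set)
    for _complex_id, partner, residues in complexes:
        for residue in residues:
            partners_by_residue[residue].add(partner)
    return {res: sorted(partners)
            for res, partners in sorted(partners_by_residue.items())}


def multi_partner_residues(site_map, min_partners=2):
    """Residues that contact at least `min_partners` distinct partners."""
    return [res for res, partners in site_map.items()
            if len(partners) >= min_partners]
```

Residues returned by the second function are the ones a single complex could not reveal as shared between binding states, which is the motivation given above for hub proteins.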

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura7 and K. Kinoshita: 7Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting crystal structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities of hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect on amino acid composition of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate, compared with the decrease in occurrence, as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structure without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. We developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura8, Kengo Kinoshita and Nobutoshi Ito8: 8Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, a ribonuclease that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi6, M. Shibaoka6, M. Saeki6, H. Ohta6 and K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as the targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression in higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.
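As a rough illustration of how a list of highly coexpressed genes can be derived from expression profiles, the sketch below ranks genes by their correlation with a query gene. The report does not state which coexpression measure COXPRESdb uses, so Pearson correlation is assumed here purely for the example, and the function names and data layout are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def coexpressed(expression, query, top=3):
    """Return the `top` genes most correlated with `query` as (gene, r) pairs.

    expression: dict gene -> list of expression values over the same samples.
    """
    scored = [(pearson(expression[query], profile), gene)
              for gene, profile in expression.items() if gene != query]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(gene, r) for r, gene in scored[:top]]
```

Running the same ranking on orthologous genes in two species and comparing the resulting lists is one simple way to picture the human and mouse comparison view mentioned above.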

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental to both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin’s hydrophobic region. On the other hand, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K. and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R. and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K. and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T. and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K. and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y. and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T. and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M. and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009; published online on December 23, 2008.

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H. and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H. and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M. and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R. and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009; available online on October 21, 2008.

Shirota, M., Ishida, T. and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H. and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H. and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T. and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S. and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) genes and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H. and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in press, 2009.

伊倉貞吉, 木下賢吾, 伊藤暢聡. ペプチジルプロリルイソメラーゼの構造機能相関. 蛋白質 核酸 酵素 54: 167-172, 2009.

木下賢吾. 立体構造からのタンパク質機能予測: 現状と展望. 遺伝子医学MOOK 14号, in press.

中井謙太, ポール・ホートン. 第3章 3. アミノ酸配列に基づくタンパク質の細胞内局在予測. 実験医学増刊 vol. 26: 1106-1112, 2008.

中井謙太. タンパク質のシステム生物学. 猪飼, 伏見, 卜部, 上野川, 中村, 浜窪 編, タンパク質の事典. 朝倉書店, 575-578, 2008.

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare and its impact on social security; practical advice and surveys for research projects to build public trust; and “minority-centered” scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the “BioBank Japan” project from ethical, legal and social standpoints, and completed the first questionnaire survey. We held the SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine the broader social and cultural agendas surrounding the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulation between South Korea and Japan. After the fabrication in Hwang Woo-suk’s stem cell cloning research and the unethical collection of human eggs, the bioethics law was revised, and the government is seeking stricter regulation of life science and healthcare. We found some correlations in political options on stem cell research and genetic testing in terms of regulations among countries in East Asia.

2. Establishment of the Office of Research Ethics (ORE)

Following the Dean’s courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we identified current problems that tend to hold up a smooth research review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for “bench consulting” regarding consent, research protocols and pre-review of the research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

3. Ethical, legal and social support for the “BioBank Japan” project

To support the “BioBank Japan” project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government’s ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants’ understanding of the overview of the research, and may be unethical rather than ethical. If we long for “personalized medicine”, we should think further about the construction of a “personalized consent process”, and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives to participate and for preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate the MCs’ attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain, encourage and thank participants each time, and participants recall their will to contribute.

To acknowledge participants’ and MCs’ contributions to the project, we issued “BioBank newsletters” three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are accessible only to people with good eyesight, we are making efforts toward personalized information security to accommodate participants’ disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called “SciArt Café”, twice in 2008. Our intent for “SciArt Café” is to promote communication between scientists and those who do not regularly engage with science but love art. The 1st session, called “Rhythm generated by network”, was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called “Doing science, doing art”, was held on October 8th at the Medical Science Museum in the IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing the 3rd session for early summer 2009.

Publications

1. Ishiyama I, Nagai A, Muto K, Tamakoshi A, Kokado M, Mimura K, Tanzawa T, and Yamagata Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

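2. 洪賢秀. 韓国社会における子どもの「性保護」と性犯罪防止対策 [Children's "sexual protection" and measures to prevent sex crimes in Korean society]. 比較法研究, 70号, 2009 (in press).

3. 神里彩子, 成澤光 編著. 生殖補助医療 生命倫理と法―基本資料集3 [Assisted reproductive technology: bioethics and law, basic source materials 3]. 信山社, 21-123, 262-308, 2008.

4. 張瓊方. 諸外国における生殖補助医療の規制状況と実施状況(台湾) [Regulation and practice of assisted reproductive technology in other countries (Taiwan)]. 生殖補助医療 生命倫理と法―基本資料集3, 神里彩子, 成澤光 編, 信山社, 323-334, 2008.

5. 大上泰弘, 神里彩子, 城山英明. イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆 [A comparative analysis of animal experimentation regulation in the United Kingdom and the United States: implications for the Japanese regulatory system]. 社会技術研究論文集, 5号: 132-142, 2008.

6. 大上泰弘, 成廣孝, 神里彩子, 城山英明, 打越綾子. 日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析 [Attitudes of life science researchers and engineers in Japan toward animal experimentation: analysis of a questionnaire survey on life science experiments and memorial services for animals]. ヒトと動物の関係学会誌, 20号: 66-73, 2008.

7. 大上泰弘, 神里彩子, 城山英明. イギリスにおける動物の実験規制を支えている思考様式 [The mode of thinking underlying the regulation of animal experimentation in the United Kingdom]. 科学技術社会論研究, 5号: 84-92, 2008.

8. 渡部麻衣子, 上田昌文. 人の必要を充足する科学技術: 福祉工学における開発現場の分析 [Science and technology that meet human needs: an analysis of development sites in welfare engineering]. 科学技術社会研究, 138-151, 2008.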

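9. 武藤香織. 「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで― [Japanese and US responses to "demedicalized" predictive genetic testing: from hereditary diseases to nutrigenetic testing]. 日米の医療―制度と倫理, 杉田米行 編, 大阪大学出版会, 203-224, 2008.

10. 武藤香織. DNA親子鑑定は「ふしだらな」女性にとっての救済策か [Is DNA parentage testing a remedy for "promiscuous" women?]. ジェンダー研究のフロンティア第4巻 テクノ/バイオ・ポリティクス―科学・医療・技術のいま, 舘かおる 編, 作品社, 238-264, 2008.

11. 洪賢秀. 研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に― [What is the problem with egg donation for research? Focusing on the Hwang Woo-suk paper fabrication scandal in Korea]. ジェンダー研究のフロンティア第4巻 テクノ/バイオ・ポリティクス―科学・医療・技術のいま, 舘かおる 編, 作品社, 196-214, 2008.

12. 張瓊方. 生殖技術と台湾社会 [Reproductive technology and Taiwanese society]. ジェンダー研究のフロンティア第4巻 テクノ/バイオ・ポリティクス―科学・医療・技術のいま, 舘かおる 編, 作品社, 215-222, 2008.

13. 三村恭子, 小門穂, 武藤香織, 張瓊方, 洪賢秀, 柘植あづみ. 女性にやさしい機械のつくられ方―内診台を例にして [How woman-friendly machines are made: the case of the gynecological examination table]. ジェンダー研究のフロンティア第4巻 テクノ/バイオ・ポリティクス―科学・医療・技術のいま, 舘かおる 編, 作品社, 223-240, 2008.

14. 神里彩子. 生殖補助医療をめぐる議論―その回顧と展望― [Debates over assisted reproductive technology: retrospect and prospects]. 家永登 編『生殖技術と家族』, 早稲田大学出版部, 42-71, 2008.

15. 渡部麻衣子, 上田昌文 編訳. エンハンスメント論争 身体・精神の増強と先端科学技術 [The enhancement debate: augmentation of body and mind and advanced science and technology]. 社会評論社, 2008.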


microarray consisting of 36,864 genes or ESTs, after enrichment of the cancer cell population by laser microbeam microdissection. Analyzing eight PR (partial response) patients and twelve patients with SD (stable disease) or PD (progressive disease) response, we identified dozens of genes that were expressed differently between the ‘responder (PR)’ and ‘non-responder (SD or PD)’ groups. We further selected the nine ‘predictive’ genes showing the most significant differences and established a numerical prediction scoring system that clearly separated the responder group from the non-responder group. This system accurately predicted the drug responses of all nine additional test cases that were reserved from the original 29 cases. Moreover, we developed a quantitative PCR-based prediction system that could be feasible for routine clinical use. Our results suggest that the sensitivity of an advanced breast cancer to neoadjuvant chemotherapy with docetaxel could be predicted by expression patterns in this set of genes.

2. Pharmacogenomics

(1) Warfarin maintenance-dose requirements

The International Warfarin Pharmacogenetics Consortium

Genetic variability among patients plays an important role in determining the dose of warfarin that should be used when oral anticoagulation is initiated, but practical methods of using genetic information have not been evaluated in a diverse and large population. We developed and used an algorithm for estimating the appropriate warfarin dose that is based on both clinical and genetic data from a broad population base. Clinical and genetic data from 4,043 patients were used to create a dose algorithm that was based on clinical variables only and an algorithm in which genetic information was added to the clinical variables. In a validation cohort of 1,009 subjects, we evaluated the potential clinical value of each algorithm by calculating the percentage of patients whose predicted dose of warfarin was within 20% of the actual stable therapeutic dose; we also evaluated other clinically relevant indicators. In the validation cohort, the pharmacogenetic algorithm accurately identified larger proportions of patients who required 21 mg of warfarin or less per week and of those who required 49 mg or more per week to achieve the target international normalized ratio than did the clinical algorithm (49.4% vs. 33.3%, P<0.001, among patients requiring ≤21 mg per week; and 24.8% vs. 7.2%, P<0.001, among those requiring ≥49 mg per week). The use of a pharmacogenetic algorithm for estimating the appropriate initial dose of warfarin produces recommendations that are significantly closer to the required stable therapeutic dose than those derived from a clinical algorithm or a fixed-dose approach. The greatest benefits were observed in the 46.2% of the population that required 21 mg or less of warfarin per week or 49 mg or more per week for therapeutic anticoagulation.
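The validation metric described above, the percentage of patients whose predicted dose falls within 20% of the actual stable therapeutic dose, can be sketched in a few lines. This is only an illustration of the metric, not the consortium's code; the function name `percent_within` and the sample doses are hypothetical.

```python
def percent_within(predicted, actual, tolerance=0.20):
    """Percentage of patients whose predicted weekly dose lies within
    `tolerance` (a fraction) of the actual stable therapeutic dose."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if abs(p - a) <= tolerance * a
    )
    return 100.0 * hits / len(actual)


# Hypothetical weekly doses in mg (illustrative only, not study data).
predicted = [20.0, 35.0, 50.0, 30.0]
actual = [21.0, 28.0, 49.0, 40.0]
print(percent_within(predicted, actual))
```

Computing the same percentage separately for a clinical-only and a pharmacogenetic set of predictions, and within the low-dose and high-dose subgroups, gives the kind of comparison reported in the paragraph.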

(2) Genotype of CYP2D6 and selection of adjuvant hormonal therapy with tamoxifen for breast cancer patients

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Mitsunori Sasa2, Yoshimi Bando3, Ikuko Sumitomo2, Naoya Hosono4, Michiaki Kubo4, Yusuke Nakamura1,5, and Hitoshi Zembutsu5: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Department of Surgery, Tokushima Breast Care Clinic; 3Department of Molecular and Environmental Pathology, Institute of Health Biosciences, The University of Tokushima Graduate School; 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The clinical outcomes of breast cancer patients treated with tamoxifen may be influenced by the activity of the cytochrome P450 2D6 (CYP2D6) enzyme, because tamoxifen is metabolized by CYP2D6 to its active antiestrogenic metabolites, 4-hydroxytamoxifen and endoxifen. We investigated the predictive value of the CYP2D6*10 allele, which decreases CYP2D6 activity, for clinical outcomes of patients who received adjuvant tamoxifen monotherapy after surgical operation on breast cancer. Among 67 patients examined, those homozygous for the CYP2D6*10 allele revealed a significantly higher incidence of recurrence within 10 years after the operation (P=0.0057; odds ratio, 16.63; 95% confidence interval, 1.75-158.12) compared with those homozygous for the wild-type CYP2D6*1 allele. The elevated risk of recurrence seemed to be dependent on the number of CYP2D6*10 alleles (P=0.0031 for trend). Cox proportional hazard analysis demonstrated that the CYP2D6 genotype and tumor size were independent factors affecting recurrence-free survival. Patients with the CYP2D6*10/*10 genotype showed a significantly shorter recurrence-free survival period (P=0.036; adjusted hazard ratio, 10.04; 95% confidence interval, 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognosis factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani1, Taisei Mushiroda1, Michiaki Kubo2, Hitoshi Zembutsu3, Yuichi Sugiyama4, and Yusuke Nakamura1,3: 1Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 4Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important determinant for limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with the grade 3 leukopenia/neutropenia (P-value of <0.05), we further examined these 10 SNPs using 29 additionally obtained patients, 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P=0.00022) and rs11045585 in SLCO1B3 (P=0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by the scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia as compared to those with a score of 0 (P=0.0000057; odds ratio [OR], 7.00; 95% CI [confidence interval], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
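As a worked check of how the classification rates and the odds ratio relate, the sketch below builds a 2x2 table (score of 1 or 2 versus score of 0) and computes both. The cell counts are an assumption: they are back-calculated from the reported percentages (69.2% and 75.7%) and the combined cohort sizes (28 + 11 = 39 severe cases, 56 + 18 = 74 patients with no toxicity), not quoted from the study. The Woolf (log-scale) interval is likewise an assumed method, since the text does not say how the confidence interval was derived.

```python
import math


def classification_rates(tp, fn, fp, tn):
    """Percent of severe cases and of no-toxicity patients classified correctly."""
    return 100.0 * tp / (tp + fn), 100.0 * tn / (tn + fp)


def odds_ratio_woolf(tp, fn, fp, tn, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval."""
    if min(tp, fn, fp, tn) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (tp * tn) / (fn * fp)
    se = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Assumed, back-calculated cells; a score of 1 or 2 counts as "predicted toxicity".
tp, fn = 27, 12   # severe leukopenia/neutropenia: score 1 or 2, score 0
fp, tn = 18, 56   # no toxicity:                   score 1 or 2, score 0

case_rate, control_rate = classification_rates(tp, fn, fp, tn)
or_, lo, hi = odds_ratio_woolf(tp, fn, fp, tn)
print(f"correctly classified: {case_rate:.1f}% of cases, {control_rate:.1f}% of controls")
print(f"OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Under these assumed counts the sketch gives an odds ratio of 7.00 with an interval of about 2.95 to 16.59, consistent with the figures in the paragraph; that agreement supports the back-calculation but does not confirm the study's actual method.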

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu1,2, Taisei Mushiroda1, Surakameth Mahasirimongkol5, Sasisopin Kiertiburanakul3, Somnuek Sungkanuparph3, Weerawat Manosuthi6, Woraphot Tantisiriwat7, Angkana Charoenyingwattana4, Thanyachai Sura3, Wasun Chantratita2, and Yusuke Nakamura1: 1Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of 2Pathology and 3Medicine, Faculty of Medicine, and 4Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; 5Center for International Cooperation, Department of Medical Sciences, and 6Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; 7Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped, by a sequence-based HLA typing method, HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele revealed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% of NVP-tolerant patients [odds ratio (OR)=18.96; 95% confidence interval (CI)=4.87-73.44; Pc=4.6×10^-6] and 0.7% of the general Thai population (OR=29.87; 95% CI=5.04-175.86; Pc=2.6×10^-5). The logistic regression analysis also indicated HLA-B*3505 to be significantly associated with skin rash, with an OR of 49.15 (95% CI=6.45-374.41; P=0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
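The quoted specificity follows directly from the carrier frequencies, if carrying the allele is treated as a positive test for NVP-induced rash: sensitivity is the carrier frequency among rash cases (17.5%), and specificity is one minus the carrier frequency among NVP-tolerant patients (1.1%). A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def marker_performance(freq_in_cases, freq_in_tolerant):
    """Sensitivity and specificity when allele carriage is used as the test.

    Both arguments are carrier frequencies expressed as fractions in [0, 1].
    """
    for f in (freq_in_cases, freq_in_tolerant):
        if not 0.0 <= f <= 1.0:
            raise ValueError("frequencies must be fractions between 0 and 1")
    sensitivity = freq_in_cases           # carriers among patients with rash
    specificity = 1.0 - freq_in_tolerant  # non-carriers among tolerant patients
    return sensitivity, specificity


# Carrier frequencies taken from the combined data set described in the text.
sens, spec = marker_performance(0.175, 0.011)
print(f"sensitivity {sens:.1%}, specificity {spec:.1%}")
```

The high specificity alongside the modest sensitivity is why the marker is described as able to avoid only a subset of NVP-induced rash: few tolerant patients carry the allele, but most patients who develop rash do not carry it either.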

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani1,2, Sukanya Wattanapokayakit3, Hidenori Ochi4,5, Takahisa Kawaguchi4, Atsushi Takahashi4, Naoya Hosono4, Michiaki Kubo4, Tatsuhiko Tsunoda4, Naoyuki Kamatani4, Hiromitsu Kumada6, Aekkachai Puseenam7, Thanyachai Sura7, Yataro Daigo2, Kazuaki Chayama4,5, Wasun Chantratita8, Yusuke Nakamura1,4, and Koichi Matsuda1: 1Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; 2Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; 3Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; 4Center for Genomic Medicine, RIKEN; 5Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; 6Department of Hepatology, Toranomon Hospital; 7Department of Medicine, Faculty of Medicine, and 8Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese and one Thai cohorts consisting of 1,300 cases and 2,100 controls (combined P=6.34×10^-39 and 2.31×10^-38; OR=0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR=1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR=0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda1, Sukanya Wattanapokayakit2, Atsushi Takahashi3, Toshihiro Nukiwa4, Shoji Kudoh5, Takashi Ogura6, Hiroyuki Taniguchi7, Michiaki Kubo8, Naoyuki Kamatani3, Yusuke Nakamura1,9, and the Pirfenidone Clinical Study Group4: 1Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); 4Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; 5Fourth Department of Internal Medicine, Nippon Medical School; 6Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; 7Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; 8Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); 9Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a p value of 2.9×10^-8 (GWA, 2.8×10^-6; replication, 3.6×10^-3). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva1, Taisei Mushiroda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O’Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9, and Draga I. Toncheva5: 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. A great number of studies have yielded controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that were previously implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010, odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y, and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu, and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y, and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y, and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y, and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu E T-K, Wang LD, J-LS Law, Tsao W, and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y, and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y, and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y, and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y, and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q, and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y, and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H, and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H, and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D, and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y, and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y, and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α, and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y, and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y, and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility of type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn’s disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49 Harada M Hirota T Jodo AI Doi SKameda M Fujita K Miyatake A Eno-moto T Noguchi E Yoshihara SEbisawa M Saito H Matsumoto KNakamura Y Ziegler SF and Tamari MFunctional analysis of the Thymic StromalLymphopoietin Variants in Human Bron-chial Epithelial Cells Am J Respir Cell

139

Mol Biol 40 368-374 200950 Sakashita M Yoshimoto T Hirota T Ha-

rada M Okubo K Osawa Y Fujieda SNakamura Y Yasuda K Nakanishi Kand Tamari M Association of serum IL-33level and the IL-33 genetic variant withJapanese cedar pollinosis Clin Exp Allergy38 1875-1881 2008

51. Hirata, D., Yamabuki, T., Miki, D., Ito, T., Tsuchiya, E., Fujita, M., Hosokawa, M., Chayama, K., Nakamura, Y. and Daigo, Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res. 15: 256-266, 2009.

52. Dobashi, S., Katagiri, T., Hirota, E., Ashida, S., Daigo, Y., Shuin, T., Fujioka, T., Miki, T. and Nakamura, Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep. 21: 305-312, 2009.

53. Zembutsu, H., Suzuki, Y., Sasaki, A., Tsunoda, T., Okazaki, M., Yoshimoto, M., Hasegawa, T., Hirata, K. and Nakamura, Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol. 34: 361-370, 2009.

54. Nakamura, Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet. 54: 1-8, 2009.

55. Ozaki, K., Sato, H., Inoue, K., Tsunoda, T., Sakata, Y., Mizuno, H., Lin, T.-H., Miyamoto, Y., Aoki, A., Onouchi, Y., Sheu, S.-H., Ikegawa, S., Odashiro, K., Nobuyoshi, M., Juo, S.-H.H., Hori, M., Nakamura, Y. and Tanaka, T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet. in press, 2009.

56. Kashiwaya, K., Hosokawa, M., Eguchi, H., Ohigashi, H., Ishikawa, O., Shinomura, Y., Nakamura, Y. and Nakagawa, H. Identification of C2orf18, Termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci. 100: 457-464, 2009.

57. Nagayama, S., Yamada, E., Kohno, Y., Aoyama, T., Fukukawa, C., Kubo, H., Watanabe, G., Katagiri, T., Nakamura, Y., Sakai, Y. and Toguchida, J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci. in press, 2009.

58. Ueda, K., Fukase, Y., Katagiri, T., Ishikawa, N., Irie, S., Sato, T., Ito, H., Nakayama, H., Miyagi, Y., Tsuchiya, E., Kohno, N., Shiwa, M., Nakamura, Y. and Daigo, Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics. in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med. 360: 753-764, 2009.

60. Betcheva, E.T., Mushiroda, T., Takahashi, A., Kubo, M., Karachanak, S.K., Zaharieva, I.T., Vazharova, R.V., Dimova, I.I., Milanova, V.K., Tolev, T., Kirov, G., Owen, M.J., O'Donovan, M.C., Kamatani, N., Nakamura, Y. and Toncheva, D.I. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet. 54: 98-107, 2009.

61. Fukukawa, C., Nagayama, S., Tsunoda, T., Toguchida, J., Nakamura, Y. and Katagiri, T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene. in press, 2009.

62. Yosifova, A., Mushiroda, T., Stoianov, D., Vazharova, R., Dimova, I., Karachanak, S., Zaharieva, I., Milanova, V., Madjirova, N., Gerdjikov, I., Tolev, T., Velkova, S., Kirov, G., Owen, M.J., O'Donovan, M.C., Toncheva, D. and Nakamura, Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders. in press, 2009.

63. Kamatani, Y., Wattanapokayakit, S., Ochi, H., Kawaguchi, T., Takahashi, A., Hosono, N., Kubo, M., Tsunoda, T., Kamatani, N., Kumada, H., Puseenam, A., Sura, T., Daigo, Y., Chayama, K., Chantratita, W., Nakamura, Y. and Matsuda, K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet. in press, 2009.

64. Tamura, K., Furihata, M., Chung, S., Uemura, M., Yoshioka, H., Iiyama, T., Ashida, S., Nasu, Y., Fujioka, T., Shuin, T., Nakamura, Y. and Nakagawa, H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci. in press, 2009.

65. Tsukada, H., Ochi, H., Maekawa, T., Abe, H., Fujimoto, Y., Tsuge, M., Takahashi, H., Kumada, H., Kamatani, N., Nakamura, Y. and Chayama, K. Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology. in press, 2009.

66. Dunleavy, E.M., Roche, D., Tagami, H., Lacoste, N., Ray-Gallet, D., Nakamura, Y., Daigo, Y., Nakatani, Y. and Almouzni-Pettinotti, G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell. in press, 2009.

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret the heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We therefore first decompose the difficulty into parts, i.e., linkage disequilibrium (LD), population structure, multiple genetic models, and study design; characterize each problem and propose a solution to it; and then attempt to improve the interpretation of GWA study data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies on complex genetic traits target relatively weak factors, which means that their sample sizes should be in the thousands or more; this in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One of the methods to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can also correct Fisher's exact test statistics.
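As an illustration of the conventional genomic control step mentioned above, the following minimal sketch rescales 1-degree-of-freedom chi-square statistics by an inflation factor estimated from their median. The function names and the clipping of the factor at 1 are assumptions made for this example; the modification for Fisher's exact test described in the text is not reproduced here.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Estimate lambda as median(observed statistics) / median under the null.

    The estimate is clipped at 1.0 so that statistics are never inflated
    (an assumption of this sketch).
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    return max(lam, 1.0)


def genomic_control(chi2_stats):
    """Return the statistics divided by the estimated inflation factor."""
    lam = inflation_factor(chi2_stats)
    return [x / lam for x in chi2_stats]
```

In practice the median would be taken over many markers that are assumed to be mostly unassociated with the trait; a handful of statistics would give an unreliable estimate.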

b. Characterization of the exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

c. Geometric evaluation of SNP contingency table tests

The 2×3 SNP contingency table tests are described in the context of geometry. We characterize various tests for 2×3 tables and define tests that fit biological models by interpreting the tables geometrically.
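For orientation, the asymptotic test on a 2×3 genotype table discussed above can be sketched as a Pearson chi-square test with 2 degrees of freedom. This is a generic textbook version, not the laboratory's own implementation; the function name and the example table are hypothetical, and all row and column totals are assumed to be nonzero.

```python
import math


def chi_square_2x3(table):
    """Pearson chi-square statistic and asymptotic p-value for a 2x3 table.

    table: two rows (e.g. cases, controls) of three genotype counts each.
    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    # (2-1)*(3-1) = 2 degrees of freedom, for which the chi-square
    # survival function has the closed form exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical genotype counts: rows are cases and controls.
stat, p = chi_square_2x3([[10, 20, 30], [30, 20, 10]])
```

With small expected counts this asymptotic p-value can differ from the exact one, which is the kind of discrepancy the text refers to.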

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and by selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which exploits the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is used to quantify genetic heterogeneity and in genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in support of various lines of research based on genetic heterogeneity.
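The conventional pairwise measures referred to above can be sketched as follows for two diallelic markers. These are the standard D, D' and r² definitions, shown only as background; the multi-marker method under development is not described in enough detail to reproduce. The function name and argument names are assumptions of this example.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B, each strictly
              between 0 and 1
    """
    d = p_ab - p_a * p_b
    # D' normalizes D by its maximum attainable magnitude given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared
```

Because these measures are defined for marker pairs only, summarizing dependence among three or more markers requires a different formulation, which is the motivation stated in the text.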

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp

Publications

Gotoh, N., Yamada, R., Matsuda, F., Yoshimura, N. and Iida, T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol. 146: 146, 2008.

Nakayama-Hamada, M., Suzuki, A., Furukawa, H., Yamada, R. and Yamamoto, K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem. 144: 393-8, 2008.

Okada, Y., Mori, M., Yamada, R., Suzuki, A., Kobayashi, K., Kubo, M., Nakamura, Y. and Yamamoto, K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol. 35: 1273-8, 2008.

Shimane, K., Kochi, Y., Yamada, R., Okada, Y., Suzuki, A., Miyatake, A., Kubo, M., Nakamura, Y. and Yamamoto, K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis. (in press).

Suzuki, A., Yamada, R., Kochi, Y., Sawada, T., Okada, Y., Matsuda, K., Kamatani, Y., Mori, M., Shimane, K., Hirabayashi, Y., Takahashi, A., Tsunoda, T., Miyatake, A., Kubo, M., Kamatani, N., Nakamura, Y. and Yamamoto, K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet. 40: 1224-9, 2008.

Yamada, R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol. 4: 210-7, 2008.

Yamada, R. and Okada, Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol. 33: 114-27, 2009.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing all whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and positioning, both with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues on the basis of the structure of their regulatory regions. We are developing our model further, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe² and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
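The two quantities used above to describe promoters, G+C content and CpG richness, are commonly computed per sequence window as sketched below. The observed/expected CpG ratio shown is the widely used textbook formula and is an assumption of this example; the thresholds and window sizes actually applied to the ascidian data are not given in the text.

```python
def gc_and_cpg_oe(seq):
    """Return (G+C fraction, CpG observed/expected ratio) for a DNA string.

    Expected CpG count is (number of C) * (number of G) / length, so the
    ratio is 1.0 when CpG occurs as often as base composition predicts.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    expected = c * g / n
    cpg_oe = cpg / expected if expected > 0 else 0.0
    return gc_fraction, cpg_oe
```

Sliding such a window along the region around each TSS would show how far the G+C- and CpG-rich stretch extends.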

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiment itself prevent the blueprint from giving further information on whether regulation is direct or indirect. In this study, we are computationally detecting the direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we utilized position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The predicted regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data from a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make up the building blocks of the body.
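The scanning step described above can be illustrated with a minimal sketch that slides a position weight matrix along a promoter sequence and reports windows scoring at or above a threshold. The matrix layout (a list of per-position dictionaries of log-odds scores), the function name and the example motif are hypothetical; the ORI-based choice of threshold is not reproduced because its definition is not given in the text.

```python
def scan_promoter(seq, pwm, threshold):
    """Return (start, score) for every window scoring >= threshold.

    pwm: list with one dict per motif position, mapping base -> log-odds
         score. Bases missing from a column (e.g. "N") score -infinity,
         so windows containing them are never reported.
    """
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(col.get(base, float("-inf"))
                    for col, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical two-position matrix that favors the motif "TA".
toy_pwm = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 1.0},
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
]
hits = scan_promoter("GGTACCTA", toy_pwm, threshold=2.0)
```

Lowering the threshold admits more candidate sites at the cost of more false positives, which is why the threshold has to be tuned per matrix.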

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it, according to the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database; the generated matrix with an added pseudocount value was then compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
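The pseudocount scheme described above can be written out for a single motif position as in the sketch below: the pseudocount is split among the bases in proportion to the background composition and added to the observed counts before the log likelihood ratio is taken. The function name, the use of base-2 logarithms and the dictionary layout are assumptions of this example; the default of 0.8 follows the value suggested in the text.

```python
import math


def pwm_column(counts, background, pseudocount=0.8):
    """Log2 likelihood ratios for one motif position.

    counts: dict base -> observed count among the known binding sites
    background: dict base -> background frequency (values sum to 1)
    pseudocount: total pseudocount for this position, distributed
                 according to the background composition
    """
    total = sum(counts.values()) + pseudocount
    column = {}
    for base, bg in background.items():
        freq = (counts.get(base, 0) + pseudocount * bg) / total
        column[base] = math.log2(freq / bg)
    return column
```

Without the pseudocount, a base never observed at a position would receive a score of minus infinity; with it, the score is finite and moves toward zero as the pseudocount grows.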

7. Definition and analysis of alternative promoters using a large amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹ and Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of the discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed a difference of 96 for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
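The definition of 1st and 2nd promoters given above amounts to ranking each gene's promoters by tag count, which can be sketched as follows. The input layout (a nested dictionary of tag counts) and all identifiers are hypothetical, and ties are broken simply by input order, which is an assumption of this example.

```python
def rank_promoters(tag_counts):
    """Pick the 1st and 2nd promoter of each gene by tag count.

    tag_counts: dict gene -> dict promoter_id -> number of tags
    Returns dict gene -> (first_promoter, second_promoter), where
    second_promoter is None for genes with a single promoter.
    """
    ranked = {}
    for gene, promoters in tag_counts.items():
        # sorted() is stable, so promoters with equal counts keep
        # their original relative order.
        order = sorted(promoters, key=lambda p: promoters[p], reverse=True)
        first = order[0] if order else None
        second = order[1] if len(order) > 1 else None
        ranked[gene] = (first, second)
    return ranked


# Hypothetical tag counts for two genes.
example = {
    "geneA": {"p1": 5, "p2": 50, "p3": 20},
    "geneB": {"q1": 3},
}
ranking = rank_promoters(example)
```

Running the same ranking on tag counts from two cell types and comparing the results gene by gene gives the kind of discrepancy percentage reported in the text.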

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common to the 16 eukaryotes examined, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
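To make the notion of a dinucleotide periodicity concrete, the sketch below computes the power of a single Fourier component of the indicator sequence that marks where a given dinucleotide starts. It is a direct, unoptimized evaluation intended for short fragments; the function name, the mean subtraction and the normalization by length are assumptions of this example rather than the procedure used in the study.

```python
import cmath


def periodicity_power(seq, dinucleotide, period):
    """Power of the Fourier component with the given period (in bp).

    The signal is 1 at positions where `dinucleotide` starts in `seq`
    and 0 elsewhere; its mean is subtracted before the transform.
    """
    n = len(seq) - 1
    indicator = [1.0 if seq[i:i + 2] == dinucleotide else 0.0
                 for i in range(n)]
    mean = sum(indicator) / n
    component = sum((x - mean) * cmath.exp(-2j * cmath.pi * i / period)
                    for i, x in enumerate(indicator))
    return abs(component) ** 2 / n


# Synthetic fragment in which "AA" recurs exactly every 10 bp.
fragment = ("AA" + "C" * 8) * 20
power_10 = periodicity_power(fragment, "AA", 10)
power_7 = periodicity_power(fragment, "AA", 7)
```

On the synthetic fragment the 10-bp component dominates the 7-bp one; on real genome fragments the same calculation would be repeated over a range of periods to obtain a spectrum.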

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell containing only necessary and sufficient components has been estimated mostly by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effects of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from external nutrients alone, as living cells do. The organism-specific pathway maps are then detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.
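As background for what a coexpressed gene list is, the sketch below ranks genes by the Pearson correlation of their expression profiles with a query gene. Pearson correlation is used here only as a familiar baseline and is an assumption of this example; the text does not specify the new coexpression measure introduced in ATTED-II, and the gene names and profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpressed_genes(query, profiles, top=5):
    """Return the `top` genes most correlated with `query`.

    profiles: dict gene -> list of expression values across samples
    """
    scores = [(gene, pearson(profiles[query], values))
              for gene, values in profiles.items() if gene != query]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top]


# Hypothetical expression profiles over four samples.
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1, 3, 2, 4],
}
neighbors = coexpressed_genes("g1", profiles, top=2)
```

A gene network can then be drawn by linking each gene to its top-ranked neighbors.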

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify its interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level while considering multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, as their occurrence decreased, the buried fraction of these hydrophilic residues increased. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. We therefore developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for it. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we searched known protein structures for atomic configurations similar to those of the PPIase active site, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we showed experimentally that these proteins do have PPIase activity, which had not previously been considered. In addition, we created a similar hole in barnase, a ribonuclease that has no PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the protein surface.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.
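The idea of a list of highly coexpressed genes for a query gene can be sketched as below. This is a minimal illustration that assumes the Pearson correlation coefficient as the coexpression measure; the text above does not state which measure the database uses, and the gene names and expression values are invented for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return cov / math.sqrt(var_x * var_y)


def top_coexpressed(
    profiles: Dict[str, Sequence[float]], query: str, k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k genes most positively correlated with the query gene."""
    scored = [
        (pearson(profiles[query], profile), gene)
        for gene, profile in profiles.items()
        if gene != query
    ]
    scored.sort(reverse=True)  # highest correlation first
    return [(gene, r) for r, gene in scored[:k]]


if __name__ == "__main__":
    # Invented expression values across four samples (hypothetical genes).
    demo = {
        "geneA": [1.0, 2.0, 3.0, 4.0],
        "geneB": [2.0, 4.0, 6.0, 8.0],
        "geneC": [4.0, 3.0, 2.0, 1.0],
        "geneD": [1.0, 3.0, 2.0, 4.0],
    }
    for gene, r in top_coexpressed(demo, "geneA", k=2):
        print(gene, round(r, 3))
```

Repeating the ranking for every gene gives the per-gene lists from which a coexpression network can be drawn, with an edge between each gene and its top-ranked partners.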

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. Through these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing local membrane thickness according to the size of alamethicin’s hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba H, Yamashita R, Kinoshita K, and Nakai K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada I, Morita S, Kimura M, Horii T, Yamashita R, and Nakai K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka Y, Nagasaki M, Yamaguchi R, Obayashi T, Numata K, Imoto S, Shimamura T, Kinoshita K, Nakai K, and Miyano S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi M, Ishida T, and Kinoshita K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura T, Kinoshita K, and Ito N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida T and Kinoshita K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda M and Kinoshita K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura K, Toh H, Hirakawa H, Sugii M, Murata M, Nakai K, Tashiro K, Kuhara S, Azuma Y, and Shirai M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami K, Imanishi T, Gojobori T, and Nakai K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida K, Frith M, and Nakai K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi T, Hayashi S, Shibaoka M, Saeki M, Ohta H, and Kinoshita K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi T, Hayashi S, Saeki M, Ohta H, and Kinoshita K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura K and Nakai K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro N, Makita Y, de Hoon M, and Nakai K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro N, Li S, Suzuki Y, Yamashita R, and Nakai K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota M, Ishida T, and Kinoshita K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara K, Suzuki Y, Wakaguri H, Irie T, Tanimoto K, Hashimoto S, Matsushima K, Mizushima-Sugano J, Yamashita R, Nakai K, Bentley D, Esumi H, and Sugano S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya Y, Nakamura H, and Kinoshita K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon A, Miyamoto Y, Takimoto N, Kusakabe T, and Nakai K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon A and Nakai K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur J and Ng S-K (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri H, Yamashita R, Suzuki Y, Sugano S, and Nakai K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita R, Suzuki Y, Takeuchi N, Wakaguri H, Ueda T, Sugano S, and Nakai K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita K, Kono H, and Yura K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki J (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉,木下賢吾,伊藤暢聡.ペプチジルプロリルイソメラーゼの構造機能相関.蛋白質核酸酵素.54:167―172,2009.

木下賢吾.立体構造からのタンパク質機能予測:現状と展望.遺伝子医学MOOK.14号,in press.

中井謙太,ポール・ホートン.第3章 3.アミノ酸配列に基づくタンパク質の細胞内局在予測.実験医学増刊.vol. 26:1106―1112,2008.

中井謙太.タンパク質のシステム生物学.猪飼,伏見,卜部,上野川,中村,浜窪編.タンパク質の事典.朝倉書店.575―578,2008.


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and “minority-centered” scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the “BioBank Japan” project from ethical, legal, and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After the fabrication in Hwang Woo-suk’s stem cell cloning research and the unethical collection of human eggs, the bioethics law has been revised and the government seeks stricter regulation of life science and healthcare. We found some correlations between political options on stem cell research and on genetic testing in terms of regulations across East Asia.

2. Establishment of Office of Research Ethics (ORE)

Following the Dean’s courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After surveying past ethical reviews and conducting a comparative study of research ethics review systems in the US, the UK, and South Korea, we identified the current problems that tend to stall the research review process, so as to assure the quality of ethical discussions. Since February 3, 2009, Ayako Kamisato has assumed the main responsibility for “bench consulting” regarding consent, research protocols, and pre-review on research ethics for all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

3. Ethical, legal and social support for “BioBank Japan” project

To support the “BioBank Japan” project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants’ understanding of the research overview and may be unethical rather than ethical. If we long for “personalized medicine,” we should think further about the construction of a “personalized consent process,” and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives to participate and for preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate MCs’ attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget that they gave consent for research, MCs explain, encourage, and thank participants at each visit, and participants recall their willingness to contribute.

To acknowledge participants’ and MCs’ contributions to the project, we issued “BioBank newsletters” three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are accessible only to sighted people with good eyesight, we are working to secure personalized access to information that accommodates participants’ disabilities.

4. SciArt Café

Under the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called “SciArt Café,” twice in 2008. The intent of “SciArt Café” is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called “Rhythm generated by network,” was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called “Doing science, doing art,” was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing the 3rd session for early summer 2009.

Publications

1. Ishiyama I, Nagai A, Muto K, Tamakoshi A, Kokado M, Mimura K, Tanzawa T, Yamagata Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2. 洪賢秀.韓国社会における子どもの「性保護」と性犯罪防止対策.比較法研究.70号,2009(印刷中).

3. 神里彩子,成澤光編著.生殖補助医療 生命倫理と法―基本資料集3.信山社.21―123,262―308,2008.

4. 張瓊方.諸外国における生殖補助医療の規制状況と実施状況(台湾).生殖補助医療 生命倫理と法―基本資料集3.神里彩子,成澤光編.信山社.323―334,2008.

5. 大上泰弘,神里彩子,城山英明.イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆.社会技術研究論文集.5号:132―142,2008.

6. 大上泰弘,成廣孝,神里彩子,城山英明,打越綾子.日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析.ヒトと動物の関係学会誌.20号:66―73,2008.

7. 大上泰弘,神里彩子,城山英明.イギリスにおける動物の実験規制を支えている思考様式.科学技術社会論研究.5号:84―92,2008.

8. 渡部麻衣子,上田昌文.人の必要を充足する科学技術福祉工学における開発現場の分析.科学技術社会研究.138―151,2008.

9. 武藤香織.「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―.日米の医療―制度と倫理.杉田米行編.大阪大学出版会.203―224,2008.

10. 武藤香織.DNA親子鑑定は「ふしだらな」女性にとっての救済策か.ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま.舘かおる編.作品社.238―264,2008.

11. 洪賢秀.研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―.ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま.舘かおる編.作品社.196―214,2008.

12. 張瓊方.生殖技術と台湾社会.ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま.舘かおる編.作品社.215―222,2008.

13. 三村恭子,小門穂,武藤香織,張瓊方,洪賢秀,柘植あづみ.女性にやさしい機械のつくられ方―内診台を例にして.ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま.舘かおる編.作品社.223―240,2008.

14. 神里彩子.生殖補助医療をめぐる議論―その回顧と展望―.家永登編『生殖技術と家族』.早稲田大学出版部.42―71,2008.

15. 渡部麻衣子,上田昌文編訳.エンハンスメント論争 身体精神の増強と先端科学技術.社会評論社.2008.


ratio 10.04; 95% confidence interval 1.17-86.27) compared to patients with CYP2D6*1/*1 after adjustment for other prognostic factors. The present study suggests that the CYP2D6 genotype should be considered when selecting adjuvant hormonal therapy for breast cancer patients.

(3) Genotype of drug metabolism/transporter genes and Docetaxel-induced leukopenia/neutropenia

Authors: Kazuma Kiyotani¹, Taisei Mushiroda¹, Michiaki Kubo², Hitoshi Zembutsu³, Yuichi Sugiyama⁴, and Yusuke Nakamura¹,³. ¹Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ⁴Department of Molecular Pharmacokinetics, Graduate School of Pharmaceutical Sciences, The University of Tokyo

Despite long-term clinical experience with docetaxel, unpredictable severe adverse reactions remain an important factor limiting the use of the drug. To identify genetic factor(s) determining the risk of docetaxel-induced leukopenia/neutropenia, we selected subjects who received docetaxel chemotherapy from samples recruited at BioBank Japan and conducted a case-control association study. We genotyped 84 patients, 28 patients with grade 3 or 4 leukopenia/neutropenia and 56 with no toxicity (patients with grade 1 or 2 were excluded), for a total of 79 single nucleotide polymorphisms (SNPs) in seven genes possibly involved in the metabolism or transport of this drug: CYP3A4, CYP3A5, ABCB1, ABCC2, SLCO1B3, NR1I2, and NR1I3. Since one SNP in ABCB1, four SNPs in ABCC2, four SNPs in SLCO1B3, and one SNP in NR1I2 showed a possible association with grade 3/4 leukopenia/neutropenia (P < 0.05), we further examined these 10 SNPs using 29 additionally obtained patients: 11 patients with grade 3/4 leukopenia/neutropenia and 18 with no toxicity. The combined analysis indicated a significant association of rs12762549 in ABCC2 (P = 0.00022) and rs11045585 in SLCO1B3 (P = 0.00017) with docetaxel-induced leukopenia/neutropenia. When patients were classified into three groups by a scoring system based on the genotypes of these two SNPs, patients with a score of 1 or 2 were shown to have a significantly higher risk of docetaxel-induced leukopenia/neutropenia than those with a score of 0 (P = 0.0000057; odds ratio [OR], 7.00; 95% confidence interval [CI], 2.95-16.59). This prediction system correctly classified 69.2% of severe leukopenia/neutropenia and 75.7% of non-leukopenia/neutropenia into the respective categories, indicating that SNPs in ABCC2 and SLCO1B3 may predict the risk of leukopenia/neutropenia induced by docetaxel chemotherapy.
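The reported odds ratio and classification rates can be checked with a short calculation. The sketch below is illustrative and rests on two assumptions: the 2×2 counts are inferred from the stated totals (39 cases and 74 controls in the combined set) and percentages, and the confidence interval uses the Woolf log-scale method, which the text does not name.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a Woolf (log-scale) confidence interval.

    a = cases with score 1 or 2, b = cases with score 0,
    c = controls with score 1 or 2, d = controls with score 0.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts inferred (an assumption) from the reported totals and percentages:
# 39 cases (28 + 11), of which 69.2% scored 1 or 2 -> 27 cases;
# 74 controls (56 + 18), of which 75.7% scored 0 -> 56 controls.
CASES_HIGH, CASES_LOW = 27, 12
CONTROLS_HIGH, CONTROLS_LOW = 18, 56

or_, lo, hi = odds_ratio_ci(CASES_HIGH, CASES_LOW, CONTROLS_HIGH, CONTROLS_LOW)
sensitivity = CASES_HIGH / (CASES_HIGH + CASES_LOW)
specificity = CONTROLS_LOW / (CONTROLS_HIGH + CONTROLS_LOW)

if __name__ == "__main__":
    print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
    print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
```

With these inferred counts the calculation agrees with the figures quoted above (OR 7.00, 95% CI 2.95-16.59, 69.2% and 75.7%), which suggests, but does not prove, that the study used this table and interval method.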

(4) HLA genotype and Nevirapine (NVP)-induced skin rash

Authors: Soranun Chantarangsu¹,², Taisei Mushiroda¹, Surakameth Mahasirimongkol⁵, Sasisopin Kiertiburanakul³, Somnuek Sungkanuparph³, Weerawat Manosuthi⁶, Woraphot Tantisiriwat⁷, Angkana Charoenyingwattana⁴, Thanyachai Sura³, Wasun Chantratita², and Yusuke Nakamura¹. ¹Research Group for Pharmacogenomics, RIKEN Center for Genomic Medicine; Departments of ²Pathology, ³Medicine, Faculty of Medicine, ⁴Department of Pharmacy, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; ⁵Center for International Cooperation, Department of Medical Sciences, ⁶Bamrasnaradura Infectious Diseases Institute, Ministry of Public Health; ⁷Department of Preventive Medicine, Faculty of Medicine, Srinakharinwirot University, Nakornnayok, Thailand

We investigated a possible involvement of differences in human leukocyte antigens (HLA) in the risk of nevirapine (NVP)-induced skin rash among HIV-infected patients by a step-wise case-control association study. We first genotyped HLA-A, HLA-B, HLA-C, HLA-DRB1, HLA-DQB1, and HLA-DPB1 by a sequence-based HLA typing method in a first set of samples consisting of 80 samples from patients with NVP-induced skin rash and 80 samples from NVP-tolerant patients. Subsequently, we verified HLA alleles that showed a possible association in the first screening using an additional set of samples consisting of 67 cases with NVP-induced skin rash and 105 controls. The HLA-B*3505 allele showed a significant association with NVP-induced skin rash in the first and second screenings. In the combined data set, the HLA-B*3505 allele was observed in 17.5% of the patients with NVP-induced skin rash, compared with only 1.1% in NVP-tolerant patients [odds ratio (OR) = 18.96; 95% confidence interval (CI) = 4.87-73.44; Pc = 4.6×10^[…]] and 0.7% in the general Thai population (OR = 29.87; 95% CI = 5.04-175.86; Pc = 2.6×10^[…]). The logistic regression analysis also indicated that HLA-B*3505 is significantly associated with skin rash, with an OR of 49.15 (95% CI = 6.45-374.41; P = 0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani¹,², Sukanya Wattanapokayakit³, Hidenori Ochi⁴,⁵, Takahisa Kawaguchi⁴, Atsushi Takahashi⁴, Naoya Hosono⁴, Michiaki Kubo⁴, Tatsuhiko Tsunoda⁴, Naoyuki Kamatani⁴, Hiromitsu Kumada⁶, Aekkachai Puseenam⁷, Thanyachai Sura⁷, Yataro Daigo², Kazuaki Chayama⁴,⁵, Wasun Chantratita⁸, Yusuke Nakamura¹,⁴, and Koichi Matsuda¹. ¹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ²Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; ³Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; ⁴Center for Genomic Medicine, RIKEN; ⁵Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; ⁶Department of Hepatology, Toranomon Hospital; ⁷Department of Medicine, Faculty of Medicine, and ⁸Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P = 6.34×10⁻³⁹ and 2.31×10⁻³⁸; OR = 0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301; OR = 1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401; OR = 0.52 and 0.57, respectively). Our findings demonstrate that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda¹, Sukanya Wattanapokayakit², Atsushi Takahashi³, Toshihiro Nukiwa⁴, Shoji Kudoh⁵, Takashi Ogura⁶, Hiroyuki Taniguchi⁷, Michiaki Kubo⁸, Naoyuki Kamatani³, Yusuke Nakamura¹,⁹, and the Pirfenidone Clinical Study Group⁴. ¹Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); ⁴Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; ⁵Fourth Department of Internal Medicine, Nippon Medical School; ⁶Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; ⁷Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; ⁸Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); ⁹Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a P value of 2.9×10⁻⁸ (GWA, 2.8×10⁻⁶; replication, 3.6×10⁻³). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva¹, Taisei Mushiroda², Atsushi Takahashi³, Michiaki Kubo⁴, Sena K. Karachanak⁵, Irina T. Zaharieva⁶, Radoslava V. Vazharova⁵, Ivanka I. Dimova⁵, Vihra K. Milanova⁶, Todor Tolev⁷, George Kirov⁸, Michael J. Owen⁸, Michael C. O’Donovan⁸, Naoyuki Kamatani³, Yusuke Nakamura⁹, and Draga I. Toncheva⁵. ¹Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; ⁶Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; ⁷Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; ⁸Department of Psychological Medicine, Cardiff University, School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; ⁹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many studies have produced controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein, we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that were previously implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P = 0.0010, odds ratio = 1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y, and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum. Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu, and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat. Genet. 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y, and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y, and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin. Cancer Res. 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y, and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum. Mol. Genet. 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu E T-K, Wang LD, J-L S Law, Tsao W, and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int. J. Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y, and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci. 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y, and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci. 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J. Hum. Genet. 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci. 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell. Microbiol. 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum. Mol. Genet. 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J. Hum. Genet. 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci. 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y, and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y, and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q, and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y, and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H, and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H, and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D, and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y, and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y, and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, and Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α, and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y, and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y, and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility to type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF, and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y, and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T, and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K, and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y, and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y, and Nakagawa H. Identification of C2orf18, termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y, and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K, Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.

Human Genome Center

Laboratory of Functional Genomics

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret the heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining in their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependence, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and we also attempt to improve the interpretation of GWA study data as a whole.

a. Test statistics correction for data from a structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, so their sample sizes must run to thousands or more, which in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous (in other words, structured) population tend to give false-positive results. One method to correct for this increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can also correct Fisher's exact test statistics.

b. Characterization of the exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
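The discrepancy between the asymptotic and exact tests on a 2×3 case-control genotype table can be illustrated with a minimal sketch. It assumes the asymptotic test is the Pearson chi-square test with 2 degrees of freedom and the exact test is the Freeman-Halton extension of Fisher's exact test; the report does not state which variants were studied, and the function names are hypothetical. All row and column totals are assumed to be non-zero.

```python
from math import comb, exp


def chi2_pvalue_2x3(table):
    """Pearson chi-square statistic and asymptotic p-value for a 2x3 table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = rows[i] * cols[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return stat, exp(-stat / 2.0)


def exact_pvalue_2x3(table):
    """Freeman-Halton exact p-value: total probability of all tables with the
    observed margins that are no more probable than the observed table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    denom = comb(sum(rows), rows[0])

    def prob(a, b, c):
        # Multivariate hypergeometric probability of first row (a, b, c).
        return comb(cols[0], a) * comb(cols[1], b) * comb(cols[2], c) / denom

    p_obs = prob(*table[0])
    total = 0.0
    for a in range(min(cols[0], rows[0]) + 1):
        for b in range(min(cols[1], rows[0] - a) + 1):
            c = rows[0] - a - b
            if 0 <= c <= cols[2]:
                p = prob(a, b, c)
                if p <= p_obs * (1 + 1e-9):  # tolerance for float ties
                    total += p
    return total


# Rows: cases, controls. Columns: genotype counts (AA, Aa, aa).
observed = [[10, 20, 30], [30, 20, 10]]
stat, p_asymptotic = chi2_pvalue_2x3(observed)
p_exact = exact_pvalue_2x3(observed)
```

For small or unbalanced tables the two p-values can differ noticeably, which is the kind of discrepancy the paragraph above refers to.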

c. Geometric evaluation of SNP contingency table tests

The 2×3 SNP contingency table tests are described in the context of geometry. By interpreting the tables geometrically, we characterize various tests for 2×3 tables and define tests that fit biological models.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms (SNPs) are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which makes use of the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only the pair-wise dependence is used to quantify genetic heterogeneity and in genetic epidemiology studies. We develop a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in support of various studies based on genetic heterogeneity.
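For reference, the conventional pair-wise LD measures that the paragraph above contrasts with multi-marker quantification can be sketched as follows. This is the standard textbook computation of D, D' and r² from haplotype counts for two diallelic markers, not the new multi-marker method described in the report; the function name and argument layout are hypothetical.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) from counts of the four two-marker haplotypes."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at marker 1
    p_B = (n_AB + n_aB) / n  # frequency of allele B at marker 2
    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        raise ValueError("both markers must be polymorphic")
    # D: deviation of the AB haplotype frequency from independence.
    d = n_AB / n - p_A * p_B
    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    # r^2: squared correlation between the alleles at the two markers.
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, d_prime, r2


complete_ld = pairwise_ld(50, 0, 0, 50)    # only AB and ab haplotypes occur
no_ld = pairwise_ld(25, 25, 25, 25)        # all four haplotypes equally frequent
```

Both measures consider two markers at a time, so dependence among three or more markers is not captured, which is the limitation the paragraph points out.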

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; the National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp/

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.

Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1273-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by the networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment of trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation (DBTGR) was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing all whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and on their positioning, both relative to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it can distinguish muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe², and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
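The two quantities used above to describe promoters, G+C content and CpG richness, can be computed as in the following minimal sketch. It uses the conventional observed/expected CpG ratio; the window size, step, and function names are illustrative assumptions, and the thresholds behind the report's ascidian CpG island definition are not given here. For comparison, the widely used mammalian criteria of Gardiner-Garden and Frommer require G+C of at least 50% and an observed/expected ratio of at least 0.6 over at least 200 bp.

```python
def gc_and_cpg_oe(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Expected CpG count if neighbouring bases were independent: c * g / n.
    oe = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, oe


def profile(seq, window, step):
    """Slide a window along seq and report (start, G+C fraction, CpG o/e)."""
    return [
        (start,) + gc_and_cpg_oe(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a CpG-rich half followed by an AT-rich half.
toy_profile = profile("CGCGATAT", window=4, step=4)
```

Applied to sequence windows centred on experimentally determined TSSs, such a profile shows how far the G+C- and CpG-rich region extends around each promoter.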

5. Computational verification of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³, and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from providing further information on whether regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data from a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establishes the body plan but also the peripheral components that actually make the building blocks of the body.
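The scanning step described above, scoring every promoter window against a position weight matrix and keeping windows at or above a threshold, can be sketched as follows. The matrix values, the threshold, and the function name are hypothetical; the ORI-based choice of threshold is not reproduced because its definition is not given in this report.

```python
def scan_pwm(seq, pwm, threshold):
    """Return (start, score) for every window of seq scoring >= threshold.

    pwm is a list with one dict per motif position, mapping each base to
    its log-odds score at that position.
    """
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Bases missing from the matrix (e.g. N) make the window unscorable.
        score = sum(col.get(base, float("-inf")) for col, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def toy_column(preferred):
    """Hypothetical column: +1 for the preferred base, -1 for the others."""
    return {base: (1.0 if base == preferred else -1.0) for base in "ACGT"}


toy_pwm = [toy_column(b) for b in "TATA"]
hits = scan_pwm("GGTATAGG", toy_pwm, threshold=3.0)
```

Lowering the threshold admits more candidate sites at the cost of more false positives, which is why the choice of threshold has to be optimized per matrix.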

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴, and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it, in proportion to the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and the generated matrix with an added pseudocount value was then compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
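The role of the pseudocount can be made concrete with a short sketch. It assumes the common formulation in which a pseudocount s is split across bases according to the background frequencies q, giving the estimate (n_b + s·q_b)/(N + s) for each base, and the matrix element is the base-2 log ratio of that estimate to q_b. The function name and the example counts are hypothetical; only the default value 0.8 comes from the text above.

```python
from math import log2


def pwm_from_counts(counts, background, pseudocount=0.8):
    """Convert per-position base counts into a log-odds PWM.

    counts: list of dicts (one per position) mapping base -> observed count.
    background: dict mapping base -> background frequency (sums to 1).
    """
    pwm = []
    for column in counts:
        total = sum(column.values())
        scores = {}
        for base, bg in background.items():
            # Each base receives its background share of the pseudocount.
            freq = (column.get(base, 0) + pseudocount * bg) / (total + pseudocount)
            scores[base] = log2(freq / bg)
        pwm.append(scores)
    return pwm


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
# Ten known sites that all carry A at this position.
example = pwm_from_counts([{"A": 10, "C": 0, "G": 0, "T": 0}], uniform)
```

Without the pseudocount, the unobserved bases in this column would have frequency zero and a score of minus infinity, so a single mismatch would rule out a site regardless of how small the sample was.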

7. Definition and analysis of alternative promoters using a huge number of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we performed an analysis of alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in downstream regions of 1st promoters.
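The ranking of promoters by tag count can be sketched as below. The text does not spell out how the discrepancy between two samples was computed, so the `discrepancy_rate` function is only one plausible reading (the fraction of shared genes whose 1st/2nd assignment differs); all names and tag counts are hypothetical.

```python
def rank_promoters(tag_counts):
    """Return (1st, 2nd) promoter ids by descending tag count.

    tag_counts: dict mapping promoter id -> number of TSS tags.
    The 2nd promoter is None when the gene has a single promoter.
    """
    ranked = sorted(tag_counts, key=lambda p: (-tag_counts[p], p))
    first = ranked[0]
    second = ranked[1] if len(ranked) > 1 else None
    return first, second


def discrepancy_rate(genes_a, genes_b):
    """Fraction of shared genes whose (1st, 2nd) assignment differs.

    This definition is an assumption made for illustration.
    """
    shared = [g for g in genes_a if g in genes_b]
    differing = sum(
        rank_promoters(genes_a[g]) != rank_promoters(genes_b[g]) for g in shared
    )
    return differing / len(shared)


# Hypothetical tag counts for two genes in two cell types.
cell_a = {"g1": {"p1": 50, "p2": 10}, "g2": {"p1": 5, "p2": 30}}
cell_b = {"g1": {"p1": 40, "p2": 20}, "g2": {"p1": 25, "p2": 8}}
rate = discrepancy_rate(cell_a, cell_b)
```

In this toy input the 1st and 2nd promoters of `g2` swap between the two cell types while those of `g1` do not, so half of the shared genes are counted as discrepant.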

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analyses of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but that two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
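A periodicity of this kind can be measured by evaluating a single Fourier component of the occurrence positions of a dinucleotide. The sketch below is a generic illustration, not the procedure used in the study; the normalization (power divided by the number of occurrences) and the synthetic sequence are assumptions.

```python
import cmath


def periodicity_power(sequence, motif, period):
    """Normalized Fourier power of motif occurrences at a given period (bp).

    A perfectly periodic motif with n occurrences scores n; occurrences
    with no phase relation to the period score close to 1 on average.
    """
    k = len(motif)
    positions = [
        i for i in range(len(sequence) - k + 1) if sequence[i:i + k] == motif
    ]
    if not positions:
        return 0.0
    total = sum(cmath.exp(-2j * cmath.pi * pos / period) for pos in positions)
    return abs(total) ** 2 / len(positions)


# Synthetic sequence with "AA" placed exactly every 10 bp (hypothetical).
sequence = ("AA" + "CGTCGTCG") * 20
power_10 = periodicity_power(sequence, "AA", 10)
power_7 = periodicity_power(sequence, "AA", 7)
```

Scanning `period` over a range of values and comparing the resulting powers before and after masking a class of repeats would show whether that class is responsible for a given peak.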

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by the reduction of the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, like living cells. Therefore, the organism-specific pathway maps are detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue to predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, that is, assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indexes enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indexes could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.
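The idea of mapping binding sites from several complexes onto one protein can be sketched with a simple residue-to-partners table. This is only an illustration of the data structure; the function name, the input format, and the residue numbers are hypothetical and do not describe the PiSite implementation.

```python
from collections import defaultdict


def merge_interaction_sites(complexes):
    """Collect, for each residue, every partner seen contacting it.

    complexes: iterable of (partner_name, residue_numbers) pairs, one pair
    per complex that contains the protein of interest.
    """
    sites = defaultdict(set)
    for partner, residues in complexes:
        for residue in residues:
            sites[residue].add(partner)
    return dict(sites)


# Hypothetical hub protein observed in three complexes.
complexes = [
    ("partner_A", [10, 11, 45]),
    ("partner_B", [45, 46, 80]),
    ("partner_A", [11, 12]),
]
sites = merge_interaction_sites(complexes)
# Residues used by more than one distinct partner.
shared = sorted(r for r, partners in sites.items() if len(partners) > 1)
```

A single complex would report only its own interface; merging all complexes shows both the full set of interacting residues and the residues shared between different partners.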

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines, from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, a ribonuclease that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.
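A list of "highly coexpressed genes for every gene" can be sketched by ranking genes by the correlation of their expression profiles with a query gene. The sketch below uses the Pearson correlation coefficient as an assumed similarity measure; the text does not state which measure COXPRESdb uses, and the gene names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def top_coexpressed(expression, query, k=3):
    """Return the k genes most correlated with `query`, best first."""
    scores = [
        (gene, pearson(expression[query], profile))
        for gene, profile in expression.items()
        if gene != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical expression profiles over four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
neighbors = top_coexpressed(expression, "geneA", k=2)
```

Connecting each gene to its top-ranked neighbors in this way yields a coexpression network; restricting the gene list to one GO term or one tissue gives the other network types listed above.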

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009; published online on December 23, 2008.

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009; available online on October 21, 2008.

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

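Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

Ikura, T., Kinoshita, K., and Ito, N. Structure-function relationship of peptidyl-prolyl isomerases. Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid and Enzyme) 54: 167-172, 2009 (in Japanese).

Kinoshita, K. Protein function prediction from three-dimensional structures: current status and prospects. Idenshi Igaku MOOK (Gene and Medicine MOOK) No. 14, in press (in Japanese).

Nakai, K. and Horton, P. Chapter 3, 3. Prediction of protein subcellular localization based on amino acid sequences. Jikken Igaku (Experimental Medicine), extra issue, vol. 26: 1106-1112, 2008 (in Japanese).

Nakai, K. Systems biology of proteins. In: Tanpakushitsu no Jiten (Encyclopedia of Proteins), edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura, and Hamakubo. Asakura Shoten, pp. 575-578, 2008 (in Japanese).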


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held the SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We have interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulation between South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and unethical human egg collection, the bioethics law has been revised, and the government seeks stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing in terms of regulations in East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey on past ethical reviews and a comparative study on research ethics review systems in the US, the UK and South Korea, we examined our current problems, which tend to hold up a smooth research review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy 公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

3. Ethical, legal and social support for "BioBank Japan" project

To support the "BioBank Japan" project, led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted our questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly value the MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We have found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain the project to, encourage, and thank participants each time, and participants recall their will to contribute.

To show appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are accessible only to sighted people with good eyesight, we are making efforts toward personalized information provision that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim for the sharing of public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝

学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


significantly associated with skin rash with an OR of 49.15 (95% CI = 6.45-374.41, P = 0.00017). We suggest that the strong association between HLA-B*3505 and NVP-induced skin rash provides a novel insight into the pathogenesis of drug-induced rash in the HIV-infected population. On account of its high specificity (98.9%) in identifying NVP-induced rash, it is possible to utilize HLA-B*3505 as a marker to avoid a subset of NVP-induced rash, at least in the Thai population.
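For readers unfamiliar with these statistics, the sketch below shows how an odds ratio, its confidence interval and the specificity of a marker can be obtained from a 2x2 table of carriers and non-carriers. It uses the standard Wald (Woolf) interval on the log scale as an assumption; the report does not state how its estimates were computed, and the counts below are hypothetical, not study data.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald (Woolf) confidence interval for a 2x2 table.

    a: cases with the marker,    b: cases without the marker,
    c: controls with the marker, d: controls without the marker.
    z = 1.96 corresponds to a 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


def specificity(c, d):
    """Fraction of controls (no adverse reaction) that lack the marker."""
    return d / (c + d)


# Hypothetical counts chosen only to illustrate the calculation.
or_value, ci_low, ci_high = odds_ratio_ci(a=20, b=80, c=10, d=90)
spec = specificity(c=10, d=90)
```

Because the interval is built on the log scale, the odds ratio is the geometric mean of the two limits, which is why intervals for rare markers look strongly asymmetric on the original scale.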

3. Common diseases

(1) Chronic hepatitis B

Authors: Yoichiro Kamatani¹,², Sukanya Wattanapokayakit³, Hidenori Ochi⁴,⁵, Takahisa Kawaguchi⁴, Atsushi Takahashi⁴, Naoya Hosono⁴, Michiaki Kubo⁴, Tatsuhiko Tsunoda⁴, Naoyuki Kamatani⁴, Hiromitsu Kumada⁶, Aekkachai Puseenam⁷, Thanyachai Sura⁷, Yataro Daigo², Kazuaki Chayama⁴,⁵, Wasun Chantratita⁸, Yusuke Nakamura¹,⁴ and Koichi Matsuda¹: ¹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo; ²Department of Medical Genome Sciences, Graduate School of Frontier Sciences, The University of Tokyo; ³Center for International Cooperation, Department of Medical Sciences, Ministry of Public Health, Thailand; ⁴Center for Genomic Medicine, RIKEN; ⁵Department of Medicine and Molecular Science, Division of Frontier Medical Science, Programs for Biomedical Research, Graduate School of Biomedical Sciences, Hiroshima University; ⁶Department of Hepatology, Toranomon Hospital; ⁷Department of Medicine, Faculty of Medicine, and ⁸Virology and Molecular Microbiology Unit, Department of Pathology, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Thailand

Chronic hepatitis B is a serious infectious liver disease that often progresses to liver cirrhosis and hepatocellular carcinoma; however, clinical outcomes after viral exposure vary enormously among individuals. Through a two-step genome-wide association study using 786 Japanese chronic hepatitis B patients and 2,201 controls, we identified a significant association of chronic hepatitis B with 11 SNPs in a region including the HLA-DPA1 and HLA-DPB1 genes. These associations were validated in two Japanese cohorts and one Thai cohort consisting of 1,300 cases and 2,100 controls (combined P = 6.34 × 10⁻³⁹ and 2.31 × 10⁻³⁸, OR = 0.57 and 0.56, respectively). Subsequent analyses revealed disease-susceptible haplotypes (HLA-DPA1*0202-DPB1*0501 and HLA-DPA1*0202-DPB1*0301, OR = 1.45 and 2.31, respectively) and protective haplotypes (HLA-DPA1*0103-DPB1*0402 and HLA-DPA1*0103-DPB1*0401, OR = 0.52 and 0.57, respectively). Our findings demonstrated that genetic variations in the HLA-DP locus are strongly associated with the risk of persistent infection with hepatitis B virus.

(2) Idiopathic pulmonary fibrosis (IPF)

Authors: Taisei Mushiroda¹, Sukanya Wattanapokayakit², Atsushi Takahashi³, Toshihiro Nukiwa⁴, Shoji Kudoh⁵, Takashi Ogura⁶, Hiroyuki Taniguchi⁷, Michiaki Kubo⁸, Naoyuki Kamatani³, Yusuke Nakamura¹,⁹ and the Pirfenidone Clinical Study Group⁴: ¹Laboratory for Pharmacogenetics, Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Cardiovascular Diseases, Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, Institute of Physical and Chemical Research (RIKEN); ⁴Department of Respiratory Oncology and Molecular Medicine, Institute of Development, Aging and Cancer, Tohoku University; ⁵Fourth Department of Internal Medicine, Nippon Medical School; ⁶Department of Respiratory Medicine, Kanagawa Cardiovascular and Respiratory Center; ⁷Department of Respiratory Medicine and Allergy, Tosei General Hospital, Aichi; ⁸Laboratory for Genotyping, Institute of Physical and Chemical Research (RIKEN); ⁹Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo

In order to identify a gene(s) conferring susceptibility to idiopathic pulmonary fibrosis (IPF), we conducted a genome-wide association (GWA) study by genotyping 159 patients with IPF and 934 controls for 214,508 tag single-nucleotide polymorphisms (SNPs). We further evaluated selected SNPs in a replication sample set (83 cases and 535 controls) and found a significant association with IPF of an SNP in intron 2 of the TERT gene (rs2736100), which encodes a reverse transcriptase that is a component of telomerase; a combination of the two data sets revealed a p value of 2.9 × 10⁻⁸ (GWA, 2.8 × 10⁻⁶; replication, 3.6 × 10⁻³). Considering previous reports indicating that rare mutations of TERT are found in patients with familial IPF, we suggest that the common genetic variation within TERT may contribute to the risk of sporadic IPF in the Japanese population.

(3) Schizophrenia

Authors: Elitza T. Betcheva¹, Taisei Mushiroda², Atsushi Takahashi³, Michiaki Kubo⁴, Sena K. Karachanak⁵, Irina T. Zaharieva⁶, Radoslava V. Vazharova⁵, Ivanka I. Dimova⁵, Vihra K. Milanova⁶, Todor Tolev⁷, George Kirov⁸, Michael J. Owen⁸, Michael C. O'Donovan⁸, Naoyuki Kamatani³, Yusuke Nakamura⁹ and Draga I. Toncheva⁵: ¹Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ²Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ³Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁴Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN); ⁵Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria; ⁶Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria; ⁷Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria; ⁸Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK; ⁹Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. A great number of studies have often produced controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication association study using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that had previously been implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P = 0.0010, odds ratio = 1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses of schizophrenia etiology, the dopaminergic hypothesis.

Publications

1. Hosono N, Kubo M, Tsuchiya Y, Sato H, Kitamoto T, Saito S, Ohnishi Y, and Nakamura Y. Multiplex PCR-based real-time Invader assay (mPCR-RETINA): a novel SNP-based method for detecting allelic asymmetries within copy number variation regions. Hum Mutation 29: 182-189, 2008.

2. Onouchi Y, Gunji T, Burns JC, Shimizu C, Newburger JW, Yashiro M, Nakamura Yo, Yanagawa H, Wakui K, Fukushima Y, Kishi F, Hamamoto K, Terai M, Sato Y, Ouchi K, Saji T, Nariai A, Kaburagi Y, Yoshikawa T, Suzuki K, Tanaka T, Nagai T, Cho H, Fujino A, Sekine A, Nakamichi R, Tsunoda T, Kawasaki T, Nakamura Yu, and Hata A. A functional polymorphism in ITPKC is associated with Kawasaki disease susceptibility and formation of coronary artery aneurysms. Nat Genet 40: 35-42, 2008.

3. Silva FP, Hamamoto R, Kunizaki M, Tsuge M, Nakamura Y, and Furukawa Y. Enhanced methyltransferase activity of SMYD3 by the cleavage of its N-terminal region in human cancer cells. Oncogene 27: 2686-2692, 2008.

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y, and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5. Kato M, Miya F, Kanemura Y, Tanaka T, Nakamura Y, and Tsunoda T. Recombination rates of genes expressed in human tissues. Hum Mol Genet 17: 577-586, 2008.

6. Leung AAC, Wong VCL, Yang LC, Chan PL, Daigo Y, Nakamura Y, Qi RZ, Miller L, Liu E T-K, Wang LD, J-LS Law, Tsao W, and Lung ML. Frequent decreased expression of candidate tumor suppressor gene, DEC1, and its anchorage-independent growth properties and impact on global gene expression in esophageal carcinoma. Int J Cancer 122: 587-594, 2008.

7. Shimo A, Tanikawa C, Nishidate T, Matsuda K, Lin M-L, Park J-H, Ohta T, Hirata K, Fukuda M, Nakamura Y, and Katagiri T. Involvement of KIF2C/MCAK overexpression in mammary carcinogenesis. Cancer Sci 99: 62-70, 2008.

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y, and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9. Kamatani Y, Matsuda K, Ohishi T, Ohtsubo S, Yamazaki K, Iida A, Hosono N, Kubo M, Yumura W, Nitta K, Katagiri T, Kawaguchi Y, Kamatani N, and Nakamura Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J Hum Genet 53: 64-73, 2008.

10. Fukukawa C, Hanaoka H, Nagayama S, Tsunoda T, Toguchida J, Endo K, Nakamura Y, and Katagiri T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci 99: 432-440, 2008.

11. Brunet J, Pfaff AW, Abidi A, Unoki M, Nakamura Y, Guinard M, Klein J-P, Candolfi E, and Mousli M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell Microbiol 10: 908-920, 2008.

12. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, Kamide K, Nakura J, Kohara K, Takeuchi F, Mano H, Yasunami M, Kimura A, Kita Y, Ueshima H, Nakayama T, Soma M, Hata A, Fujioka A, Kawano Y, Nakao K, Sekine A, Yoshida T, Nakamura Y, Saruta T, Ogihara T, Sugano S, Miki T, and Tomoike H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum Mol Genet 17: 617-627, 2008.

13. Oishi T, Iida A, Otsubo S, Kamatani Y, Usami M, Takei T, Uchida K, Tsuchiya K, Saito S, Ohnishi Y, Tokunaga K, Nitta K, Kawaguchi Y, Kamatani N, Kochi Y, Shimane K, Yamamoto K, Nakamura Y, Yumura W, and Matsuda K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J Hum Genet 53: 151-162, 2008.

14. Daigo Y and Nakamura Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani K, Mushiroda T, Kubo M, Zembutsu H, Sugiyama Y, and Nakamura Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci 99: 967-972, 2008.

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y, and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17. Kato T, Sato N, Takano A, Miyamoto M, Nishimura H, Tsuchiya E, Kondo S, Nakamura Y, and Daigo Y. Activation of Placenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcome in Primary Lung Cancer Patients. Clin Cancer Res 14: 2363-2370, 2008.

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H, and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19. Mototani H, Iida A, Nakajima M, Furuichi T, Miyamoto Y, Tsunoda T, Sudo A, Kotani A, Uchida K, Ozaki K, Tanaka Y, Nakamura Y, Tanaka T, Notoya K, and Ikegawa S. A functional SNP in EDG2 increases susceptibility to knee osteoarthritis in Japanese. Hum Mol Genet 17: 1790-1797, 2008.

20. Mizukami Y, Kono K, Daigo Y, Takano A, Tsunoda T, Kawaguchi Y, Nakamura Y, and Fujii H. Detection of novel Cancer-Testis antigen-specific T-cell responses in TIL, regional lymph nodes, and PBL in patients with esophageal squamous cell carcinoma. Cancer Sci 99: 1448-1454, 2008.

21. Mushiroda T, Wattanapokayakit S, Takahashi A, Nukiwa T, Kudoh S, Ogura T, Taniguchi H, Pirfenidone Clinical Study Group, Kubo M, Kamatani N, and Nakamura Y. A genome-wide association study identifies an association of a common variant in TERT with susceptibility to idiopathic pulmonary fibrosis. J Med Genet 45: 654-656, 2008.

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y, and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto H, Yoshimura K, Saeki N, Katai H, Shimoda T, Matsuno Y, Saito D, Sugimura H, Tanioka F, Kato S, Matsukura N, Matsuda N, Nakamura T, Hyodo I, Nishina T, Yasui W, Hirose H, Hayashi M, Toshiro E, Ohnami S, Sekine A, Sato Y, Totsuka H, Ando M, Takemura R, Takahashi Y, Ohdaira M, Aoki K, Honmyo I, Chiku S, Aoyagi K, Sasaki H, Ohnami S, Yanagihara K, Yoon KA, Kook MC, Lee YS, Park SR, Kim CG, Choi IJ, Yoshida T, Nakamura Y, and Hirohashi S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet 40: 730-740, 2008.

24. Ueki T, Nishidate T, Park JH, Lin ML, Shimo A, Hirata K, Nakamura Y, and Katagiri T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto Y, Shi D, Nakajima M, Ozaki K, Sudo A, Kotani A, Uchida A, Tanaka T, Fukui N, Tsunoda T, Takahashi A, Nakamura Y, Jiang Q, and Ikegawa S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat Genet 40: 994-998, 2008.

26. Unoki H, Takahashi A, Kawaguchi T, Hara K, Horikoshi M, Andersen G, Ng DP, Holmkvist J, Borch-Johnsen K, Jorgensen T, Sandbaek A, Lauritzen T, Hansen T, Nurbaya S, Tsunoda T, Kubo M, Babazono T, Hirose H, Hayashi M, Iwamoto Y, Kashiwagi A, Kaku K, Kawamori R, Tai ES, Pedersen O, Kamatani N, Kadowaki T, Kikkawa R, Nakamura Y, and Maeda S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat Genet 40: 1098-1102, 2008.

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H, and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28. Imai K, Hirata S, Irie A, Senju S, Ikuta Y, Yokomine K, Harao M, Inoue M, Tsunoda T, Nakatsuru S, Nakagawa H, Nakamura Y, Baba H, and Nishimura Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin Cancer Res 14: 6487-6495, 2008.

29. Nikolova DN, Zembutsu H, Sechanov T, Vidinov K, Kee LS, Ivanova R, Becheva E, Kocova M, Toncheva D, and Nakamura Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol Rep 20: 105-121, 2008.

30. Nakamura Y. Pharmacogenomics and drug toxicity (Editorial). New Eng J Med 359: 856-858, 2008.

31. Arita K, Ariyoshi M, Tochio H, Nakamura Y, and Shirakawa M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue H, Iga M, Nabeta H, Yokoo T, Suehiro Y, Okano S, Inoue M, Kinoh H, Katagiri T, Takayama K, Yonemitsu Y, Hasegawa M, Nakamura Y, Nakanishi Y, and Tani K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci 99: 2315-2326, 2008.

33. Konda R, Sugimura J, Sohma F, Katagiri T, Nakamura Y, Fujioka T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α, and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J Urol 180: 481-485, 2008.

34. Hotta K, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J Hum Genet 53: 546-553, 2008.

35. Kato M, Nakamura Y, and Tsunoda T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am J Hum Genet 83: 157-169, 2008.

36. Kato M, Nakamura Y, and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility to type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF, and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y, and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T, and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K, and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y, and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y, and Nakagawa H. Identification of C2orf18, termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y, and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K; Hiroshima Liver Study Group; Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.

Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining for their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We first decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design, characterize each problem and propose a solution to it individually, and then attempt to improve the interpretation of GWA study data as a whole.

a. Test statistic correction for data from structured populations

Genetic epidemiology studies on complex genetic traits target relatively weak factors, which means that their sample sizes should run to thousands or more; this in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies of a heterogeneous, in other words structured, population tend to give false-positive results. One method to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
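As a point of reference for the paragraph above, the following is a minimal sketch of the conventional genomic control correction for 1-degree-of-freedom chi-square statistics, not of the modification for Fisher's exact test that the laboratory is developing. The function name and the input values are hypothetical and chosen only for illustration.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom
# (a known constant, approximately 0.4549).
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (inflation factor, corrected statistics) for 1-df chi-square tests."""
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    # By convention the factor is not allowed to fall below 1, so the
    # correction never makes a statistic larger than it was.
    lam = max(lam, 1.0)
    return lam, [x / lam for x in chi2_stats]


# Hypothetical statistics from five markers (illustrative values only).
observed = [0.2, 0.6, 0.91, 1.5, 12.0]
lam, corrected = genomic_control(observed)
```

Dividing every statistic by the same factor assumes that population structure inflates all markers uniformly, which is the simplification the standard method makes.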

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.
客員教授 理学博士 グレゴリー・マーク・ラスロップ
准教授 医学博士 山田 亮

b. Characterization of the exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
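The discrepancy described above can be illustrated with a small sketch that computes both p-values for one 2×3 genotype table. It assumes Pearson's chi-square statistic as the ordering criterion for the exact test (other orderings, such as ordering by table probability, are also used), and the example counts are hypothetical.

```python
from math import comb, exp


def pearson_chi2(table):
    """Pearson chi-square statistic for a 2x3 table given as two rows."""
    rows = [sum(r) for r in table]
    cols = [table[0][j] + table[1][j] for j in range(3)]
    n = sum(rows)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = rows[i] * cols[j] / n
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def asymptotic_p(table):
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2).
    return exp(-pearson_chi2(table) / 2.0)


def exact_p(table):
    """Exact conditional p-value: enumerate every table with the same margins."""
    rows = [sum(r) for r in table]
    cols = [table[0][j] + table[1][j] for j in range(3)]
    observed = pearson_chi2(table)
    denom = comb(sum(rows), rows[0])
    p = 0.0
    for a in range(min(cols[0], rows[0]) + 1):
        for b in range(min(cols[1], rows[0] - a) + 1):
            c = rows[0] - a - b
            if c > cols[2]:
                continue
            cand = [[a, b, c], [cols[0] - a, cols[1] - b, cols[2] - c]]
            # Multivariate hypergeometric probability of this table.
            prob = comb(cols[0], a) * comb(cols[1], b) * comb(cols[2], c) / denom
            if pearson_chi2(cand) >= observed - 1e-9:
                p += prob
    return p


# Hypothetical case/control genotype counts (illustrative only).
example = [[10, 5, 1], [4, 6, 6]]
p_asym = asymptotic_p(example)
p_exact = exact_p(example)
```

With small or unbalanced counts the two values can differ noticeably, which is the kind of gap the study characterizes; full enumeration is practical only for modest sample sizes.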

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which makes use of the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pair-wise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, to support various studies based on genetic heterogeneity.
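For comparison with the multi-marker measure under development, the conventional pair-wise quantities mentioned above can be sketched as follows. This shows only the standard D, D' and r² for two diallelic markers; the function name and arguments are illustrative choices, not part of the laboratory's method.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1
          and allele B at marker 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D is normalized by the largest magnitude it could reach
    # given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Complete dependence: allele A always travels with allele B.
complete = pairwise_ld(0.5, 0.5, 0.5)
# Independence: the haplotype frequency equals the product of allele frequencies.
independent = pairwise_ld(0.25, 0.5, 0.5)
```

These measures describe one pair of markers at a time, which is the limitation the paragraph above points to.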

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorial theory and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We have opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp/

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.

Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.
教授 理学博士 中井 謙太
准教授 理学博士 木下 賢吾

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by the networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita, and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 2.53 in "gonad" to 19.53 in "endostyle", and during development they increased from 1.68 at the "egg" stage to 7.55 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita, and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here we focus not only on the presence of TFBSs but also on their orientation and their positioning relative to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it can distinguish muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2, and Kenta Nakai

While CpG islands are often linked to promoters in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3 and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiment itself prevent the blueprint from giving further information regarding direct or indirect regulation. In this study, we are computationally detecting direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we are predicting the sites by examining the orthologues. The regulation network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that make the body plan but also the peripheral components that actually make the building blocks of the body.
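The scanning step described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the pipeline used in the study: the three-column matrix, the gene names and the sequences are hypothetical, and the threshold is supplied by hand rather than chosen by maximizing the ORI, whose exact definition is not given in this report.

```python
BASES = "ACGT"

# Hypothetical 3-column log-odds matrix that favours the motif "ACG".
TOY_PWM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]


def score_window(pwm, window):
    """Sum the per-position scores of one window of the same width as the matrix."""
    return sum(pwm[i][base] for i, base in enumerate(window))


def scan_promoter(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Skip windows containing ambiguous bases such as N.
        if any(base not in BASES for base in window):
            continue
        score = score_window(pwm, window)
        if score >= threshold:
            hits.append((start, score))
    return hits


def predicted_targets(pwm, promoters, threshold):
    """Genes whose promoter sequence contains at least one hit."""
    return sorted(
        gene for gene, seq in promoters.items()
        if scan_promoter(pwm, seq, threshold)
    )


if __name__ == "__main__":
    promoters = {"geneA": "TTACGTT", "geneB": "TTTTTTT"}  # hypothetical data
    print(predicted_targets(TOY_PWM, promoters, threshold=3.0))
```

Only the forward strand is scanned here; a practical scan would normally also score the reverse complement of each window.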

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith4 and Kenta Nakai: 4CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated for each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
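As a rough illustration of how a pseudocount enters a PWM, the sketch below distributes the pseudocount over the four bases in proportion to the background composition before taking the log ratio. The function name, the uniform background, the base-2 logarithm and the example counts are assumptions made for this example; only the default value of 0.8 comes from the text above.

```python
import math

BASES = "ACGT"


def pwm_from_counts(counts, background, pseudocount=0.8):
    """Turn per-position base counts into log2 likelihood ratios.

    counts:      list of dicts, one per motif position, mapping base -> count
    background:  dict mapping base -> background frequency (sums to 1)
    pseudocount: total pseudocount per position, split by background frequency
    """
    pwm = []
    for column in counts:
        n_sites = sum(column[base] for base in BASES)
        row = {}
        for base in BASES:
            # Each base receives its background share of the pseudocount.
            freq = (column[base] + pseudocount * background[base]) / (n_sites + pseudocount)
            row[base] = math.log2(freq / background[base])
        pwm.append(row)
    return pwm


if __name__ == "__main__":
    uniform = {base: 0.25 for base in BASES}       # assumed background
    counts = [{"A": 8, "C": 0, "G": 0, "T": 0}]    # hypothetical 8 binding sites
    for row in pwm_from_counts(counts, uniform):
        print({base: round(score, 3) for base, score in row.items()})
```

Without the pseudocount, the unobserved bases in this column would have frequency zero and their log ratios would be undefined, which is the small-sample bias the pseudocount is meant to soften.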

7. Definition and analysis of alternative promoters using a huge number of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we performed an analysis of alternative promoters with these data. From these data, we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the 2nd highest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in downstream regions of 1st promoters.
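The definition of the 1st and 2nd promoters can be written down in a few lines. The sketch below is illustrative only: the data layout, the gene and promoter identifiers and the tag counts are hypothetical, and the `discrepancy` function is just one plausible reading of the comparison between samples, since the exact measure is not spelled out here.

```python
def rank_promoters(tag_counts):
    """Map each gene to its (1st promoter, 2nd promoter) by descending tag count.

    tag_counts: {gene: {promoter_id: number_of_tags}}
    The 2nd promoter is None for genes with a single promoter. Ties keep the
    insertion order of the inner dict because Python's sort is stable.
    """
    ranked = {}
    for gene, promoters in tag_counts.items():
        order = sorted(promoters, key=lambda p: promoters[p], reverse=True)
        first = order[0] if order else None
        second = order[1] if len(order) > 1 else None
        ranked[gene] = (first, second)
    return ranked


def discrepancy(rank_a, rank_b):
    """Fraction of shared genes whose (1st, 2nd) assignment differs (assumed measure)."""
    shared = [gene for gene in rank_a if gene in rank_b]
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if rank_a[gene] != rank_b[gene])
    return differing / len(shared)


if __name__ == "__main__":
    # Hypothetical tag counts for two samples.
    sample_1 = {"g1": {"p1": 50, "p2": 10, "p3": 30}, "g2": {"p1": 5}}
    sample_2 = {"g1": {"p1": 10, "p2": 5, "p3": 40}, "g2": {"p1": 9}}
    r1, r2 = rank_promoters(sample_1), rank_promoters(sample_2)
    print(r1, r2, discrepancy(r1, r2))
```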

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analyses of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element, and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
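To make the idea of a Fourier analysis of dinucleotide periodicity concrete, the sketch below marks the positions of chosen dinucleotides in a sequence and evaluates the spectral power of that signal at a set of candidate periods. It is a toy example under assumptions: the choice of AA/TT as the tracked dinucleotides, the mean-centring, the normalization and the synthetic test sequence are all illustrative and are not taken from the study.

```python
import cmath


def dinucleotide_signal(seq, dinucs=("AA", "TT")):
    """Binary signal: 1.0 where one of the chosen dinucleotides starts, else 0.0."""
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] in dinucs else 0.0 for i in range(len(seq) - 1)]


def power_at_period(signal, period):
    """Power of the mean-centred signal at one period (direct Fourier sum)."""
    n = len(signal)
    mean = sum(signal) / n
    total = sum(
        (x - mean) * cmath.exp(-2j * cmath.pi * k / period)
        for k, x in enumerate(signal)
    )
    return abs(total) ** 2 / n


def strongest_period(seq, periods):
    """Candidate period with the highest power in the dinucleotide signal."""
    signal = dinucleotide_signal(seq)
    return max(periods, key=lambda p: power_at_period(signal, p))


if __name__ == "__main__":
    # Synthetic sequence with an AA dinucleotide every 10 bp.
    seq = ("AA" + "C" * 8) * 20
    print(strongest_period(seq, [7, 8, 9, 10, 11, 12, 13]))
```

The direct sum is quadratic in cost when many periods are tested; for genome-scale fragments a fast Fourier transform would be the usual choice.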

9. Estimation and Comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell containing only necessary and sufficient components has been estimated mostly by the reduction of the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular function instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes in all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, like living cells. Then, the organism-specific pathway maps are detected as those that can synthesize compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimental essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda5 and K. Kinoshita: 5National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step to realize complex biological functions; therefore, understanding of protein-protein interfaces will give us a clue to predict protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, that is, assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi6, M. Saeki6, H. Ohta6, K. Kinoshita: 6Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura7 and K. Kinoshita: 7Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a discrimination method between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, these hydrophilic residues also increased in buried fraction as their occurrence decreased. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately where protein size exceeded 132 residues), except for lysine, which was the most difficult to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important to obtain high-resolution structural information and to understand the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. As a result of our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura8, Kengo Kinoshita, Nobutoshi Ito8: 8Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures and found that alpha amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that has ribonuclease activity and no PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi6, M. Shibaoka6, M. Saeki6, H. Ohta6, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, in GO networks or in tissue networks, we used Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared those trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the whole membrane undulation in addition to decreasing local membrane thickness according to the size of alamethicin's hydrophobic region. On the other hand, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K. and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R. and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K. and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T. and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K. and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y. and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T. and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M. and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H. and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H. and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M. and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R. and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T. and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H. and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H. and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T. and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S. and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H. and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉,木下賢吾,伊藤暢聡.ペプチジルプロリルイソメラーゼの構造機能相関.蛋白質核酸酵素.54:167-172,2009.

木下賢吾.立体構造からのタンパク質機能予測:現状と展望.遺伝子医学MOOK14号.in press.

中井謙太,ポール・ホートン.第3章 3.アミノ酸配列に基づくタンパク質の細胞内局在予測.実験医学増刊.vol. 26:1106-1112,2008.

中井謙太.タンパク質のシステム生物学.猪飼・伏見・卜部・上野川・中村・浜窪編.タンパク質の事典.朝倉書店.575-578,2008.

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and unethical human egg collection, the bioethics law was revised, and the government is seeking stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing in terms of regulations within East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, the IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we examined the current problems that tend to stall a smooth research review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted our questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective in maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We have found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain, encourage and thank participants each time, and participants recall their will to contribute.

To show our appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to sighted people with good eyesight, we are making efforts toward personalized information security to accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim for the sharing of public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in the IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T. and Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝

学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


roda2, Atsushi Takahashi3, Michiaki Kubo4, Sena K. Karachanak5, Irina T. Zaharieva6, Radoslava V. Vazharova5, Ivanka I. Dimova5, Vihra K. Milanova6, Todor Tolev7, George Kirov8, Michael J. Owen8, Michael C. O'Donovan8, Naoyuki Kamatani3, Yusuke Nakamura9 and Draga I. Toncheva5: 1Laboratory for Cardiovascular Diseases, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 2Laboratory for Pharmacogenetics, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 3Laboratory of Statistical Analysis, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 4Laboratory for Genotyping, SNP Research Center, The Institute of Physical and Chemical Research (RIKEN), 5Department of Medical Genetics, Medical Faculty, Medical University, Sofia, Bulgaria, 6Department of Psychiatry, Aleksandrovska Hospital, Medical University, Sofia, Bulgaria, 7Department of Psychiatry, Dr. Georgi Kisiov Hospital, Radnevo, Bulgaria, 8Department of Psychological Medicine, Cardiff University School of Medicine, Henry Wellcome Building, Heath Park, Cardiff, UK, 9Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo

The development of molecular psychiatry in the last few decades has identified a number of candidate genes that could be associated with schizophrenia. Many of these studies have produced controversial and inconclusive results. However, it was determined that each of the implicated candidates would independently have a minor effect on susceptibility to the disease. Herein we report results from our replication study for association using 255 Bulgarian patients with schizophrenia and schizoaffective disorder and 556 Bulgarian healthy controls. We selected from the literature 202 single nucleotide polymorphisms (SNPs) in 59 candidate genes that had previously been implicated in disease susceptibility, and genotyped them. Of the 183 SNPs successfully genotyped, only one SNP, rs6277 (C957T) in the DRD2 gene (P=0.0010, odds ratio=1.76), was considered to be significantly associated with schizophrenia after the replication study using independent sample sets. Our findings support one of the most widely considered hypotheses for schizophrenia etiology, the dopaminergic hypothesis.

Publications

1 Hosono N Kubo M Tsuchiya Y SatoH Kitamoto T Saito S Ohnishi Y andNakamura Y Multiplex PCR-based real-time Invader assay (mPCR-RETINA) anovel SNP-based method for detecting alle-lic asymmetries within copy number vari-ation regions Hum Mutation 29 182-1892008

2 Onouchi Y Gunji T Burns JC ShimizuC Newburger JW Yashiro M Naka-mura Yo Yanagawa H Wakui KFukushima Y Kishi F Hamamoto KTerai M Sato Y Ouchi K Saji T NariaiA Kaburagi Y Yoshikawa T Suzuki KTanaka T Nagai T Cho H Fujino ASekine A Nakamichi R Tsunoda TKawasaki T Nakamura Yu and Hata AA functional polymorphism in ITPKC is as-sociated with Kawasaki disease susceptibil-ity and formation of coronary artery aneu-rysms Nat Genet 40 35-42 2008

3 Silva FP Hamamoto R Kunizaki MTsuge M Nakamura Y and Furukawa YEnhanced methyltransferase activity ofSMYD3 by the cleavage of its N-terminal re-gion in human cancer cells Oncogene 272686-2692 2008

4. Obama K, Satoh S, Hamamoto R, Sakai Y, Nakamura Y and Furukawa Y. Enhanced expression of RAD51AP1 is involved in the growth of intrahepatic cholangiocarcinoma cells. Clin Cancer Res 14: 1333-1339, 2008.

5 M Kato F Miya Y Kanemura T TanakaY Nakamura and T Tsunoda Recombina-tion rates of genes expressed in human tis-sues Hum Mol Genet 17 577-586 2008

6 Leung AAC Wong VCL Yang LCChan PL Daigo Y Nakamura Y Qi RZ Miller L Liu E T-K Wang LD J-LS Law Tsao W and Lung ML Frequentdecreased expression of candidate tumorsuppressor gene DEC1 and its anchorage-independent growth properties and impacton global gene expression in esophageal car-cinoma Int J Cancer 122 587-594 2008

7 Shimo A Tanikawa C Nishidate T Mat-suda K Lin M-L Park J-H Ohta THirata K Fukuda M Nakamura Y andKatagiri T Involvement of KIF2CMCAKoverexpression in mammary carcinogenesisCancer Sci 99 62-70 2008

8. Uemura M, Tamura K, Chung S, Honma S, Okuyama A, Nakamura Y and Nakagawa H. A novel 5α-steroid reductase (SRD5A3, type-3) is overexpressed in hormone-refractory prostate cancer. Cancer Sci 99: 81-86, 2008.

9 Kamatani Y Matsuda K Ohishi T Oht-subo S Yamazaki K Iida A Hosono NKubo M Yumura W Nitta K KatagiriT Kawaguchi Y Kamatani N and Naka-mura Y Identification of a significant asso-ciation of an SNP in TNXB with SLE inJapanese population J Hum Genet 53 64-73 2008

10 Fukukawa C Hanaoka H Nagayama STsunoda T Toguchida J Endo K Naka-mura Y and Katagiri T Radioimmunother-apy of human synovial sarcoma using amonoclonal antibody against FZD10 CancerSci 99 432-440 2008

11 Brunet J Pfaff AW Abidi A Unoki MNakamura Y Guinard M Klein J-PCandolfi E and Mousli M Toxoplasmagondii exploits UHRF1 and induces host cellcycle arrest at G2 to enable its proliferationCell Microbiol 10 908-920 2008

12 Kato N Miyata T Tabara Y Katsuya TYanai K Hanada H Kamide K NakuraJ Kohara K Takeuchi F Mano H Yasu-nami M Kimura A Kita Y Ueshima HNakayama T Soma M Hata A FujiokaA Kawano Y Nakao K Sekine AYoshida T Nakamura Y Saruta T Ogi-hara T Sugano S Miki T and TomoikeH High-Density Association Study andNomination of Susceptibility Genes for Hy-pertension in the Japanese National ProjectHum Mol Genet 17 617-627 2008

13 Oishi T Iida A Otsubo S Kamatani YUsami M Takei T Uchida K TsuchiyaK Saito S Ohnishi Y Tokunaga KNitta K Kawaguchi Y Kamatani N Ko-chi Y Shimane K Yamamoto K Naka-mura Y Yumura W and Matsuda KAfunctional SNP in the NKX25-binding siteof ITPR3 promoter is associated with sus-ceptibility to Systemic Lupus Erythematosusin Japanese population J Hum Genet 53151-162 2008

14 Daigo Y and Nakamura Y From cancergenomics to thoracic oncology discovery ofnew biomarkers and therapeutic targets forlung and esophageal carcinoma (ReviewArticle) General Thoracic and Cardiovascu-lar Surgery 56 43-53 2008

15 Kiyotani K Mushiroda T Kubo M Zem-butsu H Sugiyama Y and Nakamura YAssociation of genetic polymorphisms inSLCO1B3 and ABCC2 with docetaxel-induced leukopenia Cancer Sci 99 967-9722008

16. Kiyotani K, Mushiroda T, Sasa M, Bando Y, Sumitomo I, Hosono N, Kubo M, Nakamura Y and Zembutsu H. Impact of CYP2D6*10 on recurrence-free survival in breast cancer patients receiving adjuvant tamoxifen therapy. Cancer Sci 99: 995-999, 2008.

17 Kato T Sato N Takano A MiyamotoM Nishimura H Tsuchiya E Kondo SNakamura Y and Daigo Y Activation ofPlacenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcomein Primary Lung Cancer Patients Clin Can-cer Res 14 2363-2370 2008

18. Tenesa A, Farrington SM, Prendergast JG, Porteous ME, Walker M, Haq N, Barnetson RA, Theodoratou E, Cetnarskyj R, Cartwright N, Semple C, Clark AJ, Reid FJ, Smith LA, Kavoussanakis K, Koessler T, Pharoah PD, Buch S, Schafmayer C, Tepel J, Schreiber S, Völzke H, Schmidt CO, Hampe J, Chang-Claude J, Hoffmeister M, Brenner H, Wilkening S, Canzian F, Capella G, Moreno V, Deary IJ, Starr JM, Tomlinson IP, Kemp Z, Howarth K, Carvajal-Carmona L, Webb E, Broderick P, Vijayakrishnan J, Houlston RS, Rennert G, Ballinger D, Rozek L, Gruber SB, Matsuda K, Kidokoro T, Nakamura Y, Zanke BW, Greenwood CM, Rangrej J, Kustra R, Montpetit A, Hudson TJ, Gallinger S, Campbell H and Dunlop MG. Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21. Nat Genet 40: 631-637, 2008.

19 Mototani H Iida A Nakajima M Fu-ruichi T Miyamoto Y Tsunoda T SudoA Kotani A Uchida K Ozaki KTanaka Y Nakamura Y Tanaka T No-toya K and Ikegawa SA functional SNP inEDG2 increases susceptibility to knee os-teoarthritis in Japanese Hum Mol Genet17 1790-1797 2008

20 Mizukami Y Kono K Daigo Y TakanoA Tsunoda T Kawaguchi Y NakamuraY and Fujii H Detection of novel Cancer-Testis antigen-specific T-cell responses inTIL regional lymph nodes and PBL in pa-tients with esophageal squamous cell carci-noma Cancer Sci 99 1448-1454 2008

21 Mushiroda T Wattanapokayakit S Taka-hashi A Nukiwa T Kudoh S Ogura TTaniguchi H Pirfenidone Clinical StudyGroup Kubo M Kamatani N and Naka-mura YA genome-wide association studyidentifies an association of a common vari-ant in TERT with susceptibility to idiopathicpulmonary fibrosis J Med Genet 45 654-656 2008

22. Hosokawa M, Kashiwaya K, Furihara M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Imai K, Nakamura Y and Nakagawa H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci 99: 1626-1632, 2008.

23 Study Group of Millennium Genome Projectfor Cancer Sakamoto H Yoshimura KSaeki N Katai H Shimoda T MatsunoY Saito D Sugimura H Tanioka FKato S Matsukura N Matsuda N Naka-mura T Hyodo I Nishina T Yasui WHirose H Hayashi M Toshiro EOhnami S Sekine A Sato Y Totsuka HAndo M Takemura R Takahashi Y Oh-daira M Aoki K Honmyo I Chiku SAoyagi K Sasaki H Ohnami S Yanagi-hara K Yoon KA Kook MC Lee YSPark SR Kim CG Choi IJ Yoshida TNakamura Y and Hirohashi S Geneticvariation in PSCA is associated with suscep-tibility to diffuse-type gastric cancer NatGenet 40 730-740 2008

24 Ueki T Nishidate T Park JH Lin MLShimo A Hirata K Nakamura Y andKatagiri T Involvement of elevated expres-sion of multiple cell-cycle regulator DTLRAMP (denticlelessRA-regulated nuclearmatrix associated protein) in the growth ofbreast cancer cells Oncogene 27 5672-56832008

25 Miyamoto Y Shi D Nakajima M OzakiK Sudo A Kotani A Uchida A TanakaT Fukui N Tsunoda T Takahashi ANakamura Y Jiang Q and Ikegawa SCommon variants in DVWA on chromo-some 3p243 are associated with susceptibil-ity to knee osteoarthritis Nat Genet 40 994-998 2008

26 Unoki H Takahashi A Kawaguchi THara K Horikoshi M Andersen G NgDP Holmkvist J Borch-Johnsen KJorgensen T Sandbaek A Lauritzen THansen T Nurbaya S Tsunoda T KuboM Babazono T Hirose H Hayashi MIwamoto Y Kashiwagi A Kaku KKawamori R Tai ES Pedersen O Ka-matani N Kadowaki T Kikkawa RNakamura Y and Maeda S SNPs inKCNQ1 are associated with susceptibility totype 2 diabetes in East Asian and Europeanpopulations Nat Genet 40 1098-1102 2008

27. Harao M, Hirata S, Irie A, Senju S, Nakatsura T, Komori H, Ikuta Y, Yokomine K, Imai K, Inoue M, Harada K, Mori T, Tsunoda T, Nakatsuru S, Daigo Y, Nomori H, Nakamura Y, Baba H and Nishimura Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int J Cancer 123: 2616-2625, 2008.

28 Imai K Hirata S Irie A Senju S IkutaY Yokomine K Harao M Inoue MTsunoda T Nakatsuru S Nakagawa HNakamura Y Baba H and Nishimura YIdentification of a novel tumor-associatedantigen cadherin 3P-cadherin as a possibletarget for immunotherapy of pancreatic gas-tric and colorectal cancers Clin Cancer Res14 6487-6495 2008

29 Nikolova DN Zembutsu H Sechanov TVidinov K Kee LS Ivanova R BechevaE Kocova M Toncheva D and Naka-mura Y Identification of molecular targetsfor treatment of thyroid carcinoma OncolRep 20 105-121 2008

30 Nakamura Y Pharmacogenomics and drugtoxicity (Editorial) New Eng J Med 359856-858 2008

31 Arita K Ariyoshi M Tochio H Naka-mura Y and Shirakawa M Hemi-methylated DNA recognition by the SRAprotein Np95 via a base flipping mecha-nism Nature 455 818-821 2008

32 Inoue H Iga M Nabeta H Yokoo TSuehiro Y Okano S Inoue M Kinoh HKatagiri T Takayama K Yonemitsu YHasegawa M Nakamura Y Nakanishi Yand Tani K Non-transmissible SeV encod-ing GM-CSF is a novel and potent vectorsystem to produce autologous tumor vac-cines Cancer Sci 99 2315-2326 2008

33 Konda R Sugimura J Sohma F Katagiri TNakamura Y Fujioka T Over expression ofhypoxia-inducible protein 2 hypoxia-inducible factor-1αand nuclear factor κBis putatively involved in acquired renal cystformation and subsequent tumor transfor-mation in patients with end stage renal fail-ure J Urol 180 481-485 2008

34 Hotta K Nakata Y Matsuo T KamoharaS Kotani K Komatsu R Itoh N MineoI Wada J Masuzaki H Yoneda MNakajima A Miyazaki S Tokunaga KKawamoto M Funahashi T HamaguchiK Yamada K Hanafusa T Oikawa SYoshimatsu H Nakao K Sakata T Mat-suzawa Y Tanaka K Kamatani N andNakamura Y Variations in the FTO gene areassociated with severe obesity in the Japa-nese J Hum Genet 53 546-553 2008

35 Kato M Nakamura Y and Tsunoda T Analgorithm for inferring complex haplotypesin a region of copy-number variation Am JHum Genet 83 157-169 2008

36. Kato M, Nakamura Y and Tsunoda T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37 Yasuda K Miyake K Horikawa Y HaraK Osawa H Furuta H Hirota Y MoriH Jonsson A Sato Y Yamagata K Hi-nokio Y Wang HY Tanahashi T Naka-mura N Oka Y Iwasaki N Iwamoto YYamada Y Seino Y Maegawa H Kashi-wagi A Takeda J Maeda E Shin HDCho YM Park KS Lee HK Ng MCMa RC So WY Chan JC Lyssenko VTuomi T Nilsson P Groop L KamataniN Sekine A Nakamura Y Yamamoto KYoshida T Tokunaga K Itakura M Mak-ino H Nanjo K Kadowaki T and KasugaM Variants in KCNQ1 are associated withsusceptibility to type 2 diabetes mellitusNat Genet 40 1092-1097 2008

38 Yamaguchi-Kabata Y Nakazono K Taka-hashi A Saito S Hosono N Kubo MNakamura Y and Kamatani N Japanesepopulation structure based on SNP geno-types from 7003 individuals compared toother ethnic groups Effects on population-based association studies Am J HumGenet 83 445-456 2008

39 Okada Y Mori M Yamada R Suzuki AKobayashi K Kubo M Nakamura Y andYamamoto K SLC22A4 polymorphism andrheumatoid arthritis susceptibility A replica-tion study in a Japanese population and ametaanalysis J Rheumatol 35 1723-17282008

40 Omori S Tanaka Y Takahashi A HiroseH Kashiwagi A Kaku K Kawamori RNakamura Y and Maeda S Association ofCDKAL1 IGF2BP2 CDKN2AB HHEXSLC30A8 and KCNJ11 with susceptibility oftype 2 diabetes in a Japanese populationDiabetes 57 791-795 2008

41 Misawa K Fujii S Yamazaki T Taka-hashi A Takasaki J Yanagisawa M Oh-nishi Y Nakamura Y and Kamatani NNew correction algorithms for multiple com-parisons in case-control multilocus associa-tion studies based on haplotypes and diplo-type configurations J Hum Genet 53 789-801 2008

42 Chantarangsu S Mushiroda T Mahasiri-mongkol S Kiertiburanakul S Sungkanu-parph S Manosuthi W Tantisiriwat WCharoenyingwattana A Sura T Chan-tratita W and Nakamura Y HLA-B 3505allele is a strong predictor for nevirapine-induced skin adverse drug reactions in ThaiHIV-infected patients Pharmacogenet Genomics 19 139-146 2009

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45 Nikolova DN Doganov N Dimitrov RAngelov K Kee LS Dimova I TonchevaD Nakamura Y and Zembutsu HGenome-wide gene expression profiles ofovarian carcinoma identification of molecu-lar targets for treatment of ovarian carci-noma Mol Med Rep in press 2008

46 Hotta K Nakamura M Nakata Y Mat-suo T Kamohara S Kotani K KomatsuR Itoh N Mineo I Wada J MasuzakiH Yoneda M Nakajima A Miyazaki STokunaga K Kawamoto M Funahashi THamaguchi K Yamada K Hanafusa TOikawa S Yoshimatsu H Nakao KSakata T Matsuzawa Y Tanaka K Ka-matani N and Nakamura Y INSIG2 geners7566605 polymorphism is associated withsevere obesity in Japanese J Hum Genet53 857-862 2008

47 Iwahori K Osaki T Serada S FujimotoM Suzuki H Kishi Y Yokoyama A Ha-mada H Fujii Y Yamaguchi KHirashima T Matsui K Tachibana INakamura Y Kawase I and Naka TMegakaryocyte potentiating factor as a tu-mor maker of malignant pleural mesothe-lioma Evaluation in comparison with meso-thelin Lung Cancer 62 45-54 2008

48 Hirota T Harada M Sakashita M DoiS Miyatake A Fujita K Enomoto TEbisawa M Yoshihara S Noguchi ESaito H Nakamura Y and Tamari M Ge-netic polymorphism regulating ORM1-like 3(Saccharomyces cerevisiae) expression is as-sociated with childhood atopic asthma in aJapanese population J Allergy Clin Immu-nol 121 769-770 2008

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51 Hirata D Yamabuki T Miki D Ito TTsuchiya E Fujita M Hosokawa MChayama K Nakamura Y and Daigo YInvolvement of epithelial cell transformingsequence-2 oncoantigen in lung and esopha-geal cancer progression Clin Cancer Res15 256-266 2009

52 Dobashi S Katagiri T Hirota E AshidaS Daigo Y Shuin T Fujioka T Miki Tand Nakamura Y Involvement of TMEM22overexpression in the growth of renal cellcarcinoma cells Oncol Rep 21 305-3122009

53 Zembutsu H Suzuki Y Sasaki ATsunoda T Okazaki M Yoshimoto MHasegawa T Hirata K and Nakamura YPredicting response to Docetaxel neoadju-vant chemotherapy for advanced breast can-cers through genome-wide gene expressionprofiling Int J Oncol 34 361-370 2009

54 Nakamura Y DNA variations in humanand medical genetics 25 years of my experi-ence (review) J Hum Genet 54 1-8 2009

55 Ozaki K Sato H Inoue K Tsunoda TSakata Y Mizuno H Lin T-H Mi-yamoto Y Aoki A Onouchi Y Sheu S-H Ikegawa S Odashiro K NobuyoshiM Juo S-H H Hori M Nakamura Yand Tanaka TA functional variation inBRAP confers risk of myocardial infarctionin Asian populations Nat Genet in press2009

56 Kashiwaya K Hosokawa M Eguchi HOhigashi H Ishikawa O Shinomura YNakamura Y and Nakagawa H Identifica-tion of C2orf18 Termed ANT2BP (ANT2-binding protein) as one of key molecules in-volved in pancreatic carcinogenesis CancerSci 100 457-464 2009

57 Nagayama S Yamada E Kohno YAoyama T Fukukawa C Kubo HWatanabe G Katagiri T Nakamura YSakai Y and Toguchida J Inverse correla-tion of the upregulation of FZD10 expres-sion and the activation of β-catenin in syn-chronous colorectal tumors Cancer Sci inpress 2009

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59 The International Warfarin Pharmacogenet-ics Consortium Improved warfarin dosingwith a global pharmacogenetic algorithm NEngl J Med 360 753-764 2009

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61 Fukukawa C Nagayama S Tsunoda TToguchida J Nakamura Y and Katagiri TActivation of non-canonical Dvl-Rac1-JNKpathway by Frizzled-homologue 10 (FZD10)in human synovial sarcoma Oncogene inpress 2009

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63 Kamatani Y Wattanapokayakit S OchiH Kawaguchi T Takahashi A HosonoN Kubo M Tsunoda T Kamatani NKumada H Puseenam A Sura T DaigoY Chayama K Chantratita W Naka-mura Y and Matsuda K Identification ofassociation of genetic variations in HLA-DPlocus with chronic hepatitis B in Asianpopulation through genome-wide associa-tion study Nat Genet in press 2009

64 Tamura K Furihata M Chung S Ue-mura M Yoshioka H Iiyama T AshidaS Nasu Y Fujioka T Shuin T Naka-mura Y and Nakagawa H Stanniocalcin 2( STC 2 ) over-expression in castration-resistant prostate cancer and aggressiveprostate cancer Cancer Sci in press 2009

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y and Chayama K, Hiroshima Liver Study Group, Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret the heterogeneity and to map the disease-associated loci, and collaborate with research groups for data-mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models and study design. We first characterize each problem and propose a solution to it individually, and also attempt to improve the interpretation of GWA study data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies on complex genetic traits target relatively weak factors, which means that their sample sizes should be in the thousands or more; this in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One of the methods to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can also correct Fisher's exact test statistics.
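The conventional genomic control correction referred to above can be sketched as follows. This is a minimal illustration of the standard chi-square version with 1 degree of freedom, not the modified method for Fisher's exact test described in this section; the function names and the toy statistics are assumptions made for the example.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Estimate the genomic control inflation factor (lambda).

    lambda = median(observed statistics) / median of chi-square(1 df).
    Values below 1 are clamped to 1 so that statistics are never inflated.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    return max(lam, 1.0)


def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for 1-df chi-square tests."""
    lam = inflation_factor(chi2_stats)
    return lam, [x / lam for x in chi2_stats]


# Toy data (hypothetical): null-like statistics uniformly inflated by 1.5.
null_like = [0.1, 0.2, 0.4549364, 1.0, 3.0]
observed = [1.5 * x for x in null_like]
lam, corrected = genomic_control(observed)
print(f"lambda = {lam:.3f}")
```

Dividing every statistic by a single factor assumes that the inflation is uniform across markers, which is the simplifying assumption of the conventional method.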

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリー・マーク・ラスロップ
准教授 医学博士 山田 亮

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
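To make the discrepancy between the two tests concrete, the sketch below computes both p-values for one small 2×3 table (rows: cases and controls; columns: the three SNP genotypes). It assumes the Pearson chi-square test with 2 degrees of freedom as the asymptotic test and a probability-ordered exact test with fixed margins; the report does not specify which variants were studied, and the function names and the example table are hypothetical.

```python
from math import comb, exp


def chi2_stat(table):
    """Pearson chi-square statistic for a 2x3 table given as two rows."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = rows[i] * cols[j] / n
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def asymptotic_p(table):
    """Chi-square with 2 df has survival function exp(-x / 2)."""
    return exp(-chi2_stat(table) / 2.0)


def table_prob(first_row, cols, r1, n):
    """Multivariate hypergeometric probability of the first row."""
    num = 1
    for a, c in zip(first_row, cols):
        num *= comb(c, a)
    return num / comb(n, r1)


def exact_p(table):
    """Sum the probabilities of all tables with the observed margins
    that are no more probable than the observed table."""
    r1 = sum(table[0])
    cols = [sum(c) for c in zip(*table)]
    n = sum(cols)
    p_obs = table_prob(table[0], cols, r1, n)
    total = 0.0
    for a0 in range(min(cols[0], r1) + 1):
        for a1 in range(min(cols[1], r1 - a0) + 1):
            a2 = r1 - a0 - a1
            if a2 <= cols[2]:
                p = table_prob((a0, a1, a2), cols, r1, n)
                if p <= p_obs * (1 + 1e-9):  # tolerance for float ties
                    total += p
    return total


# Hypothetical small-sample table where the two tests disagree slightly.
example = [[3, 1, 0], [0, 1, 3]]
print(asymptotic_p(example), exact_p(example))
```

For this table the asymptotic p-value is exp(-3) while the exact p-value is 4/70, which places the two on opposite sides of a 0.05 threshold; such gaps are expected to matter most when cell counts are small.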

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry, characterize the various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposite forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population together with LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, for use in various research based on genetic heterogeneity.
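For reference, the conventional pairwise measures that the paragraph contrasts with multi-marker quantitation can be computed as in the sketch below. It shows the standard D, D' and r² for two diallelic markers, not the new method under development; the function names and the haplotype counts are assumptions for illustration.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Pairwise LD between two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1
          and allele B at marker 2
    p_a, p_b: frequencies of allele A and allele B
    Returns (D, D', r^2).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


def ld_from_counts(n_ab, n_a_b2, n_a2_b, n_a2_b2):
    """Compute LD from counts of the four haplotypes AB, Ab, aB, ab."""
    n = n_ab + n_a_b2 + n_a2_b + n_a2_b2
    p_ab = n_ab / n
    p_a = (n_ab + n_a_b2) / n
    p_b = (n_ab + n_a2_b) / n
    return pairwise_ld(p_ab, p_a, p_b)


# Hypothetical haplotype counts: AB=40, Ab=10, aB=10, ab=40.
d, d_prime, r2 = ld_from_counts(40, 10, 10, 40)
print(d, d_prime, r2)
```

Each of these values summarizes one pair of markers only, which is the limitation the paragraph points out: dependence among three or more markers is not captured by any collection of pairwise values alone.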

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorics, and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and The Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. The number of genetic epidemiology study groups is much larger than the number of groups working on genetic statistics, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.


Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井 謙太
准教授 理学博士 木下 賢吾

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here we focus not only on the presence of TFBSs, but also on their orientation and positioning with regard to the transcription start site, and also between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4 Characterization and definition of promo-ter-associated CpG islands in ascidiangenomes

Kohji Okamura Riu Yamashita Koki Nishit-suji2 Yutaka Suzuki1 Takehiro Kusakabe2 andKenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental result with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
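As an illustration of how G+C and CpG richness around a TSS can be quantified, the following minimal Python sketch computes the G+C fraction and the commonly used observed/expected CpG ratio in sliding windows. The window size, step, and function names are assumptions made for this example; the criteria actually used in the study are not stated here.

```python
def gc_content(seq: str) -> float:
    """Fraction of G or C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    cpg = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return cpg * len(seq) / (c * g)


def sliding_profile(seq, window=100, step=10):
    """Yield (start, G+C fraction, CpG o/e) for each window along seq.

    The window and step sizes are illustrative defaults only.
    """
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        yield start, gc_content(w), cpg_obs_exp(w)
```

Plotting such a profile for sequences aligned at their TSSs would show how wide the CpG-rich range is around the start site.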

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3, and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from distinguishing direct from indirect regulation. In this study, we are computationally detecting the direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we are predicting the binding sites by examining the orthologues. The regulatory network predicted for the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make the building blocks of the body.
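The idea of scanning promoters with a weight matrix and choosing a score threshold can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy matrix `TOY_PWM`, the function names, and the smoothed hit-rate ratio used in place of the ORI (whose definition is not given here) are all assumptions.

```python
# Hypothetical log-odds PWM for a 4-bp motif "TATA": +1 for the preferred
# base at each position, -1 otherwise. Real matrices come from binding data.
TOY_PWM = [{b: (1.0 if b == pref else -1.0) for b in "ACGT"} for pref in "TATA"]


def best_score(pwm, seq):
    """Highest PWM score over all windows of seq (forward strand, ACGT only)."""
    width = len(pwm)
    scores = [sum(pwm[i][seq[s + i]] for i in range(width))
              for s in range(len(seq) - width + 1)]
    return max(scores) if scores else float("-inf")


def hits(pwm, promoters, threshold):
    """Number of promoters with at least one window scoring >= threshold."""
    return sum(best_score(pwm, p) >= threshold for p in promoters)


def pick_threshold(pwm, foreground, background, candidates):
    """Return the candidate threshold with the highest smoothed hit-rate ratio.

    The ratio (foreground hit rate / background hit rate, add-one smoothed)
    is a simple stand-in for the ORI, not its actual definition.
    """
    def enrichment(t):
        fg = (hits(pwm, foreground, t) + 1) / (len(foreground) + 1)
        bg = (hits(pwm, background, t) + 1) / (len(background) + 1)
        return fg / bg
    return max(candidates, key=enrichment)
```

Here `foreground` would hold promoters of candidate target genes and `background` a reference set; a full scan would also score the reverse strand.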

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith4, and Kenta Nakai: 4CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated for each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
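The role of the pseudocount can be made concrete with a short sketch. The function below builds a log-likelihood-ratio PWM from aligned binding sites, adding the pseudocount to each position in proportion to the background composition as described above. The function name, the uniform background, and the example sites are assumptions for illustration; the default of 0.8 follows the representative value suggested in the text.

```python
import math

BASES = "ACGT"


def pwm_with_pseudocount(sites, pseudocount=0.8, background=None):
    """Build a log2-likelihood-ratio PWM from equal-length aligned sites.

    Each element is log2(f / bg), where
    f = (count + pseudocount * bg) / (number of sites + pseudocount).
    """
    if background is None:
        background = {b: 0.25 for b in BASES}  # assumed uniform background
    n = len(sites)
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        column = [site[pos] for site in sites]
        scores = {}
        for b in BASES:
            freq = (column.count(b) + pseudocount * background[b]) / (n + pseudocount)
            scores[b] = math.log2(freq / background[b])
        pwm.append(scores)
    return pwm
```

With a pseudocount of zero, a base never observed at a position would have frequency 0 and its log ratio would be undefined, which is one practical reason a positive pseudocount is used for small samples.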

7. Definition and analysis of alternative promoters using a huge number of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we performed an analysis of alternative promoters with these data. From these data, we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the 2nd largest number as the '2nd promoter'. Between different cell types, the average percentage of the discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a 9.6% difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
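The ranking of promoters by tag count can be sketched in a few lines of Python. The data layout (a dictionary keyed by gene and promoter identifier), the function names, and the way the discrepancy is counted (the share of genes whose promoter at a given rank differs between two samples) are assumptions for illustration; the exact definition used in the analysis is not given here.

```python
from collections import defaultdict


def rank_promoters(tag_counts):
    """tag_counts: {(gene, promoter_id): tags}.

    Return {gene: [promoter ids ordered by descending tag count]}, so that
    index 0 is the '1st promoter' and index 1 the '2nd promoter'.
    """
    by_gene = defaultdict(list)
    for (gene, promoter), tags in tag_counts.items():
        by_gene[gene].append((tags, promoter))
    return {gene: [p for _, p in sorted(items, key=lambda x: (-x[0], x[1]))]
            for gene, items in by_gene.items()}


def discrepancy(ranking_a, ranking_b, rank=0):
    """Percentage of shared genes whose promoter at `rank` differs."""
    shared = [g for g in ranking_a if g in ranking_b
              and len(ranking_a[g]) > rank and len(ranking_b[g]) > rank]
    if not shared:
        return 0.0
    differing = sum(ranking_a[g][rank] != ranking_b[g][rank] for g in shared)
    return 100.0 * differing / len(shared)
```

Comparing rankings computed from two cell types, or from one cell type under two conditions, gives the kind of percentage contrasted above.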

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita, and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analyses of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
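A periodicity of this kind can be measured by converting a sequence into a 0/1 signal marking where a dinucleotide occurs and evaluating the Fourier power at the period of interest. The sketch below uses only the Python standard library; the choice of dinucleotide, the normalization, and the function names are assumptions, not the settings of the study.

```python
import cmath
import math


def dinucleotide_signal(seq, dinucleotide="AA"):
    """Return a 0/1 list marking each position where the dinucleotide starts."""
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] == dinucleotide else 0.0
            for i in range(len(seq) - 1)]


def power_at_period(signal, period):
    """Spectral power of the mean-centred signal at the given period (in bp)."""
    n = len(signal)
    mean = sum(signal) / n
    total = sum((x - mean) * cmath.exp(-2j * math.pi * k / period)
                for k, x in enumerate(signal))
    return abs(total) ** 2 / n
```

For example, a sequence in which AA recurs every 10 bp gives a much larger `power_at_period(signal, 10)` than at an unrelated period such as 7; scanning a range of periods in the same way would reveal peaks such as those reported at 84 bp and 167 bp.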

9. Estimation and Comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by reducing the genome of a living cell. But the “minimal gene set” obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular function instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, like living cells. Then, the organism-specific pathway maps are detected as those that can synthesize compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.
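The step of finding pathways that supply required compounds from nutrients can be pictured as a backward search over pathway inputs and outputs. The sketch below is only a schematic under strong assumptions: each pathway is reduced to a pair of input and output compound sets, the first producer in alphabetical order is chosen when several exist, and all names are hypothetical. It is not the procedure used in the study.

```python
def select_supporting_pathways(required, nutrients, pathways):
    """Find pathways needed to make `required` compounds from `nutrients`.

    pathways: {name: (input_compounds, output_compounds)}, both sets.
    Returns (selected pathway names, compounds that no pathway can supply).
    """
    selected, missing = set(), set()
    available = set(nutrients)
    pending = [c for c in required if c not in available]
    seen = set(pending)
    while pending:
        compound = pending.pop()
        producers = [name for name, (_, outputs) in pathways.items()
                     if compound in outputs]
        if not producers:
            missing.add(compound)
            continue
        name = sorted(producers)[0]  # arbitrary but deterministic choice
        selected.add(name)
        # The inputs of the chosen pathway become requirements in turn.
        for c in pathways[name][0]:
            if c not in available and c not in seen:
                seen.add(c)
                pending.append(c)
    return selected, missing
```

In this picture, `required` would be the inputs of the conserved pathway maps, and the returned names would correspond to organism-specific maps.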

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda5 and K. Kinoshita: 5National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue to predict protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, that is, assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi6, M. Saeki6, H. Ohta6, K. Kinoshita: 6Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida, and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify its interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer working with typical web browsers, so that the different binding modes can be checked visually.
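The mapping of binding sites from several complexes onto one protein can be illustrated with a small data-structure sketch. It assumes that interface residues have already been extracted from each complex and renumbered onto a common sequence; the tuple layout, identifiers, and function name are hypothetical and do not describe the actual PiSite schema.

```python
from collections import defaultdict


def merge_interaction_sites(complexes):
    """Combine interface residues observed in several complexes.

    complexes: iterable of (complex_id, protein_id, partner_id, residues),
    where residues are residue numbers of protein_id contacting partner_id.
    Returns {protein_id: {residue_number: set of partner ids}}.
    """
    sites = defaultdict(lambda: defaultdict(set))
    for _complex_id, protein, partner, residues in complexes:
        for res in residues:
            sites[protein][res].add(partner)
    return {protein: dict(by_res) for protein, by_res in sites.items()}
```

A residue that maps to more than one partner in the result is a candidate for a site shared by different binding states, as expected for hub proteins.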

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura7, and K. Kinoshita: 7Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, which could contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida, and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence was reduced. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important to obtain high-resolution structural information and to understand the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura8, Kengo Kinoshita, Nobutoshi Ito8: 8Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, which is an enzyme with ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi6, M. Shibaoka6, M. Saeki6, H. Ohta6, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.
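A list of highly coexpressed genes for a query gene can be sketched with a plain correlation ranking. The Pearson correlation used below is an assumption chosen for illustration, since the coexpression measure of the database is not specified here; the data layout and function names are likewise hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return cov / (sx * sy)


def top_coexpressed(expression, gene, k=3):
    """Return up to k (gene, correlation) pairs most correlated with `gene`.

    expression: {gene: [expression values across the same samples]}.
    """
    target = expression[gene]
    scored = [(pearson(target, profile), other)
              for other, profile in expression.items() if other != gene]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(other, r) for r, other in scored[:k]]
```

Connecting each gene to its top-ranked partners in this way yields the kind of per-gene network listed as type (i) above.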

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida, and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol and compared the trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the whole-membrane undulation in addition to decreasing local membrane thickness according to the size of alamethicin's hydrophobic region. On the other hand, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) genes and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

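Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in press, 2009.

Ikura, T., Kinoshita, K., and Ito, N. Structure-function relationship of peptidyl-prolyl isomerases. Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid and Enzyme) 54: 167-172, 2009 (in Japanese).

Kinoshita, K. Protein function prediction from three-dimensional structures: current status and prospects. Idenshi Igaku MOOK No. 14, in press (in Japanese).

Nakai, K. and Horton, P. Chapter 3-3: Prediction of protein subcellular localization based on amino acid sequences. Jikken Igaku (Experimental Medicine), extra issue, vol. 26: 1106-1112, 2008 (in Japanese).

Nakai, K. Systems biology of proteins. In: Tanpakushitsu no Jiten (Encyclopedia of Proteins), edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura, and Hamakubo (Asakura Shoten), 575-578, 2008 (in Japanese).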


Human Genome Center

Department of Public Policy 公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and “minority-centered” scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the “BioBank Japan” project from ethical, legal, and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study on research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory differences between South Korea and Japan. After the fabrication in Hwang Woo-suk's stem cell cloning research and the unethical collection of human eggs, the bioethics law was revised, and the government seeks stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing in terms of regulations in East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey on past ethical reviews and a comparative study on research ethics review systems in the US, the UK, and South Korea, we examined the current problems that tend to stall a smooth research review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for “bench consulting” regarding consent, research protocols, and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for “BioBank Japan” project

To support the “BioBank Japan” project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted our questionnaire survey of participants of the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for “personalized medicine”, we should think further about the construction of a “personalized consent process”, and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We have found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To appreciate participants' and MCs' contributions to the project, we issued “BioBank newsletters” three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of BioBank newsletters are available only to the sighted with good eyesight, we are making efforts toward personalized information security to accommodate the disabilities of participants.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim for the sharing of public needs through interactive communication between researchers and the public are promoted. As one of such outreach activities, we held our original science café series, called “SciArt Café”, twice in 2008. Our original intent for “SciArt Café” is to promote communication between scientists and those who do not have regular communication with science but love art. The 1st session, called “Rhythm generated by network”, was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called “Doing science, doing art”, was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝

学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


refractory prostate cancer. Cancer Sci. 99: 81-86, 2008.

9. Kamatani, Y., Matsuda, K., Ohishi, T., Ohtsubo, S., Yamazaki, K., Iida, A., Hosono, N., Kubo, M., Yumura, W., Nitta, K., Katagiri, T., Kawaguchi, Y., Kamatani, N., and Nakamura, Y. Identification of a significant association of an SNP in TNXB with SLE in Japanese population. J. Hum. Genet. 53: 64-73, 2008.

10. Fukukawa, C., Hanaoka, H., Nagayama, S., Tsunoda, T., Toguchida, J., Endo, K., Nakamura, Y., and Katagiri, T. Radioimmunotherapy of human synovial sarcoma using a monoclonal antibody against FZD10. Cancer Sci. 99: 432-440, 2008.

11. Brunet, J., Pfaff, A.W., Abidi, A., Unoki, M., Nakamura, Y., Guinard, M., Klein, J.-P., Candolfi, E., and Mousli, M. Toxoplasma gondii exploits UHRF1 and induces host cell cycle arrest at G2 to enable its proliferation. Cell. Microbiol. 10: 908-920, 2008.

12. Kato, N., Miyata, T., Tabara, Y., Katsuya, T., Yanai, K., Hanada, H., Kamide, K., Nakura, J., Kohara, K., Takeuchi, F., Mano, H., Yasunami, M., Kimura, A., Kita, Y., Ueshima, H., Nakayama, T., Soma, M., Hata, A., Fujioka, A., Kawano, Y., Nakao, K., Sekine, A., Yoshida, T., Nakamura, Y., Saruta, T., Ogihara, T., Sugano, S., Miki, T., and Tomoike, H. High-Density Association Study and Nomination of Susceptibility Genes for Hypertension in the Japanese National Project. Hum. Mol. Genet. 17: 617-627, 2008.

13. Oishi, T., Iida, A., Otsubo, S., Kamatani, Y., Usami, M., Takei, T., Uchida, K., Tsuchiya, K., Saito, S., Ohnishi, Y., Tokunaga, K., Nitta, K., Kawaguchi, Y., Kamatani, N., Kochi, Y., Shimane, K., Yamamoto, K., Nakamura, Y., Yumura, W., and Matsuda, K. A functional SNP in the NKX2.5-binding site of ITPR3 promoter is associated with susceptibility to Systemic Lupus Erythematosus in Japanese population. J. Hum. Genet. 53: 151-162, 2008.

14. Daigo, Y. and Nakamura, Y. From cancer genomics to thoracic oncology: discovery of new biomarkers and therapeutic targets for lung and esophageal carcinoma (Review Article). General Thoracic and Cardiovascular Surgery 56: 43-53, 2008.

15. Kiyotani, K., Mushiroda, T., Kubo, M., Zembutsu, H., Sugiyama, Y., and Nakamura, Y. Association of genetic polymorphisms in SLCO1B3 and ABCC2 with docetaxel-induced leukopenia. Cancer Sci. 99: 967-972, 2008.

16 Kiyotani K Mushiroda T Sasa M BandoY Sumitomo I Hosono N Kubo M

Nakamura Y and Zembutsu H Impact ofCYP2D610 on recurrence-free survival inbreast cancer patients receiving adjuvant ta-moxifen therapy Cancer Sci 99 995-9992008

17 Kato T Sato N Takano A MiyamotoM Nishimura H Tsuchiya E Kondo SNakamura Y and Daigo Y Activation ofPlacenta-Specific Transcription Factor Distal-less Homeobox 5 Predicts Clinical Outcomein Primary Lung Cancer Patients Clin Can-cer Res 14 2363-2370 2008

18 Tenesa A Farrington SM Prendergast JG Porteous ME Walker M Haq N Barnetson RA Theodoratou E Cetnarskyj R Cartwright N Semple C Clark AJ Reid FJ Smith LA Kavoussanakis K Koessler T Pharoah PD Buch S Schafmayer C Tepel J Schreiber S Völzke H Schmidt CO Hampe J Chang-Claude J Hoffmeister M Brenner H Wilkening S Canzian F Capella G Moreno V Deary IJ Starr JM Tomlinson IP Kemp Z Howarth K Carvajal-Carmona L Webb E Broderick P Vijayakrishnan J Houlston RS Rennert G Ballinger D Rozek L Gruber SB Matsuda K Kidokoro T Nakamura Y Zanke BW Greenwood CM Rangrej J Kustra R Montpetit A Hudson TJ Gallinger S Campbell H and Dunlop MG Genome-wide association scan identifies a colorectal cancer susceptibility locus on 11q23 and replicates risk loci at 8q24 and 18q21 Nat Genet 40 631-637 2008

19 Mototani H Iida A Nakajima M Fu-ruichi T Miyamoto Y Tsunoda T SudoA Kotani A Uchida K Ozaki KTanaka Y Nakamura Y Tanaka T No-toya K and Ikegawa SA functional SNP inEDG2 increases susceptibility to knee os-teoarthritis in Japanese Hum Mol Genet17 1790-1797 2008

20 Mizukami Y Kono K Daigo Y TakanoA Tsunoda T Kawaguchi Y NakamuraY and Fujii H Detection of novel Cancer-Testis antigen-specific T-cell responses inTIL regional lymph nodes and PBL in pa-tients with esophageal squamous cell carci-noma Cancer Sci 99 1448-1454 2008

21 Mushiroda T Wattanapokayakit S Taka-hashi A Nukiwa T Kudoh S Ogura TTaniguchi H Pirfenidone Clinical StudyGroup Kubo M Kamatani N and Naka-mura YA genome-wide association studyidentifies an association of a common vari-ant in TERT with susceptibility to idiopathicpulmonary fibrosis J Med Genet 45 654-656 2008

22 Hosokawa M Kashiwaya K Furihara M Eguchi H Ohigashi H Ishikawa O Shinomura Y Imai K Nakamura Y and Nakagawa H Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth Cancer Sci 99 1626-1632 2008

23 Study Group of Millennium Genome Projectfor Cancer Sakamoto H Yoshimura KSaeki N Katai H Shimoda T MatsunoY Saito D Sugimura H Tanioka FKato S Matsukura N Matsuda N Naka-mura T Hyodo I Nishina T Yasui WHirose H Hayashi M Toshiro EOhnami S Sekine A Sato Y Totsuka HAndo M Takemura R Takahashi Y Oh-daira M Aoki K Honmyo I Chiku SAoyagi K Sasaki H Ohnami S Yanagi-hara K Yoon KA Kook MC Lee YSPark SR Kim CG Choi IJ Yoshida TNakamura Y and Hirohashi S Geneticvariation in PSCA is associated with suscep-tibility to diffuse-type gastric cancer NatGenet 40 730-740 2008

24 Ueki T Nishidate T Park JH Lin MLShimo A Hirata K Nakamura Y andKatagiri T Involvement of elevated expres-sion of multiple cell-cycle regulator DTLRAMP (denticlelessRA-regulated nuclearmatrix associated protein) in the growth ofbreast cancer cells Oncogene 27 5672-56832008

25 Miyamoto Y Shi D Nakajima M OzakiK Sudo A Kotani A Uchida A TanakaT Fukui N Tsunoda T Takahashi ANakamura Y Jiang Q and Ikegawa SCommon variants in DVWA on chromo-some 3p243 are associated with susceptibil-ity to knee osteoarthritis Nat Genet 40 994-998 2008

26 Unoki H Takahashi A Kawaguchi THara K Horikoshi M Andersen G NgDP Holmkvist J Borch-Johnsen KJorgensen T Sandbaek A Lauritzen THansen T Nurbaya S Tsunoda T KuboM Babazono T Hirose H Hayashi MIwamoto Y Kashiwagi A Kaku KKawamori R Tai ES Pedersen O Ka-matani N Kadowaki T Kikkawa RNakamura Y and Maeda S SNPs inKCNQ1 are associated with susceptibility totype 2 diabetes in East Asian and Europeanpopulations Nat Genet 40 1098-1102 2008

27 Harao M Hirata S Irie A Senju SNakatsura T Komori H Ikuta Y Yok-omine K Imai K Inoue M Harada KMori T Tsunoda T Nakatsuru S DaigoY Nomori H Nakamura Y Baba H andNishimura Y HLA-A2-restricted CTL epi-topes of a novel lung cancer-associated can-

cer testis antigen cell division cycle associ-ated 1 can induce tumor-reactive CTL IntJ Cancer 123 2616-2625 2008

28 Imai K Hirata S Irie A Senju S IkutaY Yokomine K Harao M Inoue MTsunoda T Nakatsuru S Nakagawa HNakamura Y Baba H and Nishimura YIdentification of a novel tumor-associatedantigen cadherin 3P-cadherin as a possibletarget for immunotherapy of pancreatic gas-tric and colorectal cancers Clin Cancer Res14 6487-6495 2008

29 Nikolova DN Zembutsu H Sechanov TVidinov K Kee LS Ivanova R BechevaE Kocova M Toncheva D and Naka-mura Y Identification of molecular targetsfor treatment of thyroid carcinoma OncolRep 20 105-121 2008

30 Nakamura Y Pharmacogenomics and drugtoxicity (Editorial) New Eng J Med 359856-858 2008

31 Arita K Ariyoshi M Tochio H Naka-mura Y and Shirakawa M Hemi-methylated DNA recognition by the SRAprotein Np95 via a base flipping mecha-nism Nature 455 818-821 2008

32 Inoue H Iga M Nabeta H Yokoo TSuehiro Y Okano S Inoue M Kinoh HKatagiri T Takayama K Yonemitsu YHasegawa M Nakamura Y Nakanishi Yand Tani K Non-transmissible SeV encod-ing GM-CSF is a novel and potent vectorsystem to produce autologous tumor vac-cines Cancer Sci 99 2315-2326 2008

33 Konda R Sugimura J Sohma F Katagiri TNakamura Y Fujioka T Over expression ofhypoxia-inducible protein 2 hypoxia-inducible factor-1αand nuclear factor κBis putatively involved in acquired renal cystformation and subsequent tumor transfor-mation in patients with end stage renal fail-ure J Urol 180 481-485 2008

34 Hotta K Nakata Y Matsuo T KamoharaS Kotani K Komatsu R Itoh N MineoI Wada J Masuzaki H Yoneda MNakajima A Miyazaki S Tokunaga KKawamoto M Funahashi T HamaguchiK Yamada K Hanafusa T Oikawa SYoshimatsu H Nakao K Sakata T Mat-suzawa Y Tanaka K Kamatani N andNakamura Y Variations in the FTO gene areassociated with severe obesity in the Japa-nese J Hum Genet 53 546-553 2008

35 Kato M Nakamura Y and Tsunoda T Analgorithm for inferring complex haplotypesin a region of copy-number variation Am JHum Genet 83 157-169 2008

36 Kato M Nakamura Y and Tsunoda T MOCSphaser a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data Bioinformatics 24 1645-1646 2008

37 Yasuda K Miyake K Horikawa Y HaraK Osawa H Furuta H Hirota Y MoriH Jonsson A Sato Y Yamagata K Hi-nokio Y Wang HY Tanahashi T Naka-mura N Oka Y Iwasaki N Iwamoto YYamada Y Seino Y Maegawa H Kashi-wagi A Takeda J Maeda E Shin HDCho YM Park KS Lee HK Ng MCMa RC So WY Chan JC Lyssenko VTuomi T Nilsson P Groop L KamataniN Sekine A Nakamura Y Yamamoto KYoshida T Tokunaga K Itakura M Mak-ino H Nanjo K Kadowaki T and KasugaM Variants in KCNQ1 are associated withsusceptibility to type 2 diabetes mellitusNat Genet 40 1092-1097 2008

38 Yamaguchi-Kabata Y Nakazono K Taka-hashi A Saito S Hosono N Kubo MNakamura Y and Kamatani N Japanesepopulation structure based on SNP geno-types from 7003 individuals compared toother ethnic groups Effects on population-based association studies Am J HumGenet 83 445-456 2008

39 Okada Y Mori M Yamada R Suzuki AKobayashi K Kubo M Nakamura Y andYamamoto K SLC22A4 polymorphism andrheumatoid arthritis susceptibility A replica-tion study in a Japanese population and ametaanalysis J Rheumatol 35 1723-17282008

40 Omori S Tanaka Y Takahashi A HiroseH Kashiwagi A Kaku K Kawamori RNakamura Y and Maeda S Association ofCDKAL1 IGF2BP2 CDKN2AB HHEXSLC30A8 and KCNJ11 with susceptibility oftype 2 diabetes in a Japanese populationDiabetes 57 791-795 2008

41 Misawa K Fujii S Yamazaki T Taka-hashi A Takasaki J Yanagisawa M Oh-nishi Y Nakamura Y and Kamatani NNew correction algorithms for multiple com-parisons in case-control multilocus associa-tion studies based on haplotypes and diplo-type configurations J Hum Genet 53 789-801 2008

42 Chantarangsu S Mushiroda T Mahasiri-mongkol S Kiertiburanakul S Sungkanu-parph S Manosuthi W Tantisiriwat WCharoenyingwattana A Sura T Chan-tratita W and Nakamura Y HLA-B 3505allele is a strong predictor for nevirapine-induced skin adverse drug reactions in ThaiHIV-infected patients Pharmacogenet Genomics 19 139-146 2009

43 Suzuki A Yamada R Kochi Y Sawada

T Okada Y Matsuda K Kamatani YMori M Shimane K Hirabayashi YTakahashi A Tsunoda T Miyatake AKubo M Kamatani N Nakamura Y andYamamoto K Functional SNPs in CD244 in-crease the risk of rheumatoid arthritis in aJapanese population Nat Genet 40 1224-1229 2008

44 Yamazaki K Takahashi A Takazoe M Kubo M Onouchi Y Fujino A Kamatani N Nakamura Y and Hata A Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients Gut 58 228-232 2009

45 Nikolova DN Doganov N Dimitrov RAngelov K Kee LS Dimova I TonchevaD Nakamura Y and Zembutsu HGenome-wide gene expression profiles ofovarian carcinoma identification of molecu-lar targets for treatment of ovarian carci-noma Mol Med Rep in press 2008

46 Hotta K Nakamura M Nakata Y Mat-suo T Kamohara S Kotani K KomatsuR Itoh N Mineo I Wada J MasuzakiH Yoneda M Nakajima A Miyazaki STokunaga K Kawamoto M Funahashi THamaguchi K Yamada K Hanafusa TOikawa S Yoshimatsu H Nakao KSakata T Matsuzawa Y Tanaka K Ka-matani N and Nakamura Y INSIG2 geners7566605 polymorphism is associated withsevere obesity in Japanese J Hum Genet53 857-862 2008

47 Iwahori K Osaki T Serada S FujimotoM Suzuki H Kishi Y Yokoyama A Ha-mada H Fujii Y Yamaguchi KHirashima T Matsui K Tachibana INakamura Y Kawase I and Naka TMegakaryocyte potentiating factor as a tu-mor maker of malignant pleural mesothe-lioma Evaluation in comparison with meso-thelin Lung Cancer 62 45-54 2008

48 Hirota T Harada M Sakashita M DoiS Miyatake A Fujita K Enomoto TEbisawa M Yoshihara S Noguchi ESaito H Nakamura Y and Tamari M Ge-netic polymorphism regulating ORM1-like 3(Saccharomyces cerevisiae) expression is as-sociated with childhood atopic asthma in aJapanese population J Allergy Clin Immu-nol 121 769-770 2008

49 Harada M Hirota T Jodo AI Doi S Kameda M Fujita K Miyatake A Enomoto T Noguchi E Yoshihara S Ebisawa M Saito H Matsumoto K Nakamura Y Ziegler SF and Tamari M Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells Am J Respir Cell Mol Biol 40 368-374 2009

50 Sakashita M Yoshimoto T Hirota T Harada M Okubo K Osawa Y Fujieda S Nakamura Y Yasuda K Nakanishi K and Tamari M Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis Clin Exp Allergy 38 1875-1881 2008

51 Hirata D Yamabuki T Miki D Ito TTsuchiya E Fujita M Hosokawa MChayama K Nakamura Y and Daigo YInvolvement of epithelial cell transformingsequence-2 oncoantigen in lung and esopha-geal cancer progression Clin Cancer Res15 256-266 2009

52 Dobashi S Katagiri T Hirota E AshidaS Daigo Y Shuin T Fujioka T Miki Tand Nakamura Y Involvement of TMEM22overexpression in the growth of renal cellcarcinoma cells Oncol Rep 21 305-3122009

53 Zembutsu H Suzuki Y Sasaki ATsunoda T Okazaki M Yoshimoto MHasegawa T Hirata K and Nakamura YPredicting response to Docetaxel neoadju-vant chemotherapy for advanced breast can-cers through genome-wide gene expressionprofiling Int J Oncol 34 361-370 2009

54 Nakamura Y DNA variations in humanand medical genetics 25 years of my experi-ence (review) J Hum Genet 54 1-8 2009

55 Ozaki K Sato H Inoue K Tsunoda TSakata Y Mizuno H Lin T-H Mi-yamoto Y Aoki A Onouchi Y Sheu S-H Ikegawa S Odashiro K NobuyoshiM Juo S-H H Hori M Nakamura Yand Tanaka TA functional variation inBRAP confers risk of myocardial infarctionin Asian populations Nat Genet in press2009

56 Kashiwaya K Hosokawa M Eguchi HOhigashi H Ishikawa O Shinomura YNakamura Y and Nakagawa H Identifica-tion of C2orf18 Termed ANT2BP (ANT2-binding protein) as one of key molecules in-volved in pancreatic carcinogenesis CancerSci 100 457-464 2009

57 Nagayama S Yamada E Kohno YAoyama T Fukukawa C Kubo HWatanabe G Katagiri T Nakamura YSakai Y and Toguchida J Inverse correla-tion of the upregulation of FZD10 expres-sion and the activation of β-catenin in syn-chronous colorectal tumors Cancer Sci inpress 2009

58 Ueda K Fukase Y Katagiri T Ishikawa N Irie S Sato T Ito H Nakayama H Miyagi Y Tsuchiya E Kohno N Shiwa M Nakamura Y and Daigo Y Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays Proteomics in press 2009

59 The International Warfarin Pharmacogenet-ics Consortium Improved warfarin dosingwith a global pharmacogenetic algorithm NEngl J Med 360 753-764 2009

60 Betcheva ET Mushiroda T Takahashi A Kubo M Karachanak SK Zaharieva IT Vazharova RV Dimova II Milanova VK Tolev T Kirov G Owen MJ O'Donovan MC Kamatani N Nakamura Y and Toncheva DI Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population J Hum Genet 54 98-107 2009

61 Fukukawa C Nagayama S Tsunoda TToguchida J Nakamura Y and Katagiri TActivation of non-canonical Dvl-Rac1-JNKpathway by Frizzled-homologue 10 (FZD10)in human synovial sarcoma Oncogene inpress 2009

62 Yosifova A Mushiroda T Stoianov D Vazharova R Dimova I Karachanak S Zaharieva I Milanova V Madjirova N Gerdjikov I Tolev T Velkova S Kirov G Owen MJ O'Donovan MC Toncheva D and Nakamura Y Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population J Affective Disorders in press 2009

63 Kamatani Y Wattanapokayakit S OchiH Kawaguchi T Takahashi A HosonoN Kubo M Tsunoda T Kamatani NKumada H Puseenam A Sura T DaigoY Chayama K Chantratita W Naka-mura Y and Matsuda K Identification ofassociation of genetic variations in HLA-DPlocus with chronic hepatitis B in Asianpopulation through genome-wide associa-tion study Nat Genet in press 2009

64 Tamura K Furihata M Chung S Ue-mura M Yoshioka H Iiyama T AshidaS Nasu Y Fujioka T Shuin T Naka-mura Y and Nakagawa H Stanniocalcin 2( STC 2 ) over-expression in castration-resistant prostate cancer and aggressiveprostate cancer Cancer Sci in press 2009

65 Tsukada H Ochi H Maekawa T Abe H Fujimoto Y Tsuge M Takahashi H Kumada H Kamatani N Nakamura Y and Chayama K Hiroshima Liver Study Group Toranomon Hospital A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C Gastroenterology in press 2009

66 Dunleavy EM Roche D Tagami H Lacoste N Ray-Gallet D Nakamura Y Daigo Y Nakatani Y and Almouzni-Pettinotti G HJURP a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1 Cell in press 2009


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on developing new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining for their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose a solution to it individually, and we also attempt to improve the interpretation of GWA study data as a whole.

a. Test statistic correction for data from structured populations

Genetic epidemiology studies of complex genetic traits target relatively weak factors, which means that their sample sizes must run to thousands or more, and this in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false-positive results. One method to correct for this increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
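
The conventional genomic control correction for 1-df chi-square statistics can be sketched as follows. This is only an illustration of the standard method, not of the modification for Fisher's exact test described above; the function name `genomic_control` and the floor of 1.0 on the inflation factor are assumptions made for this example.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364231


def genomic_control(chi2_stats):
    """Return (inflation factor lambda, corrected statistics).

    chi2_stats: 1-df chi-square statistics, ideally from markers that are
    assumed to be unassociated with the trait.
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # assumed convention: never inflate the statistics
    return lam, [x / lam for x in chi2_stats]


if __name__ == "__main__":
    lam, corrected = genomic_control([0.1, 0.9098728462, 5.0])
    print(lam, corrected)
```

Dividing every statistic by a single factor keeps the ranking of markers unchanged and only rescales the evidence, which is why the method is cheap to apply genome-wide.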

b. Characterization of the exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリー・マーク・ラスロップ
准教授 医学博士 山田 亮

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in geometric terms, characterize the various tests for 2×3 tables, and define tests suited to biological models by interpreting the tables in the context of geometry.
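
Two conventional asymptotic tests for a 2×3 genotype table can be sketched as follows: the Pearson chi-square test (2 df), which makes no assumption about the genetic model, and the Cochran-Armitage trend test (1 df), which with weights 0, 1, 2 corresponds to an additive model. These are textbook tests shown for orientation, not the geometric formulation described above; the function names and the example counts are hypothetical.

```python
import math


def pearson_2x3(cases, controls):
    """Pearson chi-square statistic and asymptotic p-value (2 df)."""
    cols = [a + b for a, b in zip(cases, controls)]
    n = sum(cols)
    chi2 = 0.0
    for row in (cases, controls):
        row_total = sum(row)
        for observed, col_total in zip(row, cols):
            expected = row_total * col_total / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    # For 2 df the chi-square survival function is exp(-x / 2).
    return chi2, math.exp(-chi2 / 2)


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend statistic and asymptotic p-value (1 df)."""
    cols = [a + b for a, b in zip(cases, controls)]
    n, r = sum(cols), sum(cases)
    s = n - r
    sum_w_cols = sum(w * c for w, c in zip(weights, cols))
    t = n * sum(w * c for w, c in zip(weights, cases)) - r * sum_w_cols
    var = r * s * (n * sum(w * w * c for w, c in zip(weights, cols)) - sum_w_cols ** 2)
    if var == 0:
        return 0.0, 1.0  # degenerate table: no information about a trend
    chi2 = n * t * t / var
    # For 1 df the chi-square survival function is erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, Aa, aa) in cases and controls.
    print(pearson_2x3([20, 30, 50], [50, 30, 20]))
    print(trend_test([20, 30, 50], [50, 30, 20]))
```

The trend test spends its single degree of freedom on one genetic model, so it gains power when that model holds and loses it otherwise; the 2-df test trades some power for model independence.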

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize its heterogeneity. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and by selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which exploits the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only the pairwise dependence is used to quantify genetic heterogeneity and in genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, for use in the various kinds of research that build on genetic heterogeneity.
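
For reference, the conventional pairwise LD measures mentioned above can be computed from the four haplotype frequencies of two diallelic markers. The sketch below shows only these standard pairwise quantities (D, |D'| and r²), not the multi-marker method under development; the function name and argument names are assumptions for illustration.

```python
def pairwise_ld(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, |D'|, r^2) for two diallelic markers.

    Arguments are the frequencies of the four haplotypes; they must sum to 1.
    """
    if abs(p_AB + p_Ab + p_aB + p_ab - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = p_AB + p_Ab  # frequency of allele A at marker 1
    p_B = p_AB + p_aB  # frequency of allele B at marker 2
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    if d > 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    print(pairwise_ld(0.5, 0.0, 0.0, 0.5))      # complete LD
    print(pairwise_ld(0.25, 0.25, 0.25, 0.25))  # linkage equilibrium
```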

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We treat haplotype data from the standpoint of combinatorics and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; the National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp/

Publications

Gotoh N Yamada R Matsuda F Yoshimura N and Iida T Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population Am J Ophthalmol 146 146 2008

Nakayama-Hamada M Suzuki A Furukawa H Yamada R and Yamamoto K Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization J Biochem 144 393-8 2008

Okada Y Mori M Yamada R Suzuki A Kobayashi K Kubo M Nakamura Y and Yamamoto K SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility A Replication Study in a Japanese Population and a Metaanalysis J Rheumatol 35 1273-8 2008

Shimane K Kochi Y Yamada R Okada Y Suzuki A Miyatake A Kubo M Nakamura Y and Yamamoto K A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients Ann Rheum Dis (in press)

Suzuki A Yamada R Kochi Y Sawada T Okada Y Matsuda K Kamatani Y Mori M Shimane K Hirabayashi Y Takahashi A Tsunoda T Miyatake A Kubo M Kamatani N Nakamura Y and Yamamoto K Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population Nat Genet 40 1224-9 2008

Yamada R Primer SNP-associated studies and what they can teach us Nat Clin Pract Rheumatol 4 210-7 2008

Yamada R and Okada Y An optimal dose-effect mode trend test for SNP genotype tables Genet Epidemiol 33 114-27 2009


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role, represented by the networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To investigate this phenomenon further, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008 it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井 謙太
准教授 理学博士 木下 賢吾

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here we focus not only on the presence of TFBSs but also on their orientation and their positioning, both relative to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues on the basis of the structure of their regulatory regions. We are developing the model further, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2 and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
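
The two quantities usually used to describe CpG richness, G+C content and the observed/expected CpG ratio, can be computed for a sequence window as sketched below. This is the commonly used textbook definition, assumed here for illustration; the report does not state which formula or window size was used, and the function name is hypothetical.

```python
def cpg_stats(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence of neighboring bases: c * g / n.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


if __name__ == "__main__":
    print(cpg_stats("CGCG"))
    print(cpg_stats("ATATATAT"))
```

Sliding such a window across the region around each TSS gives the profile needed to judge how narrowly the CpG-rich segment is confined.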

5. Computational verification of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3 and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from showing whether a given regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining those orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that lay out the body plan but also the peripheral components that actually make the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith4 and Kenta Nakai: 4CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
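
How a pseudocount enters a log-odds PWM can be sketched as follows. The pseudocount (0.8 by default, following the suggestion above) is split among the four bases according to the background composition and added to the observed counts before the log likelihood ratio is taken. The function name `build_pwm`, the uniform default background and the toy sites are assumptions made for this example.

```python
import math


def build_pwm(sites, pseudocount=0.8, background=None):
    """Build a log2-odds PWM from aligned binding sites of equal length.

    Returns a list with one {base: score} dict per position.
    """
    bases = "ACGT"
    if background is None:
        background = {b: 0.25 for b in bases}  # assumed uniform background
    n = len(sites)
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        scores = {}
        for b in bases:
            # Each base receives its background share of the pseudocount.
            freq = (column.count(b) + pseudocount * background[b]) / (n + pseudocount)
            scores[b] = math.log2(freq / background[b])
        pwm.append(scores)
    return pwm


if __name__ == "__main__":
    for position in build_pwm(["ACGT", "ACGA", "ACGT", "TCGT"]):
        print({b: round(s, 3) for b, s in position.items()})
```

Without the pseudocount, any base never seen at a position would receive a score of minus infinity, so a single mismatch would veto an otherwise good site.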

7. Definition and analysis of alternative promoters using a huge number of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we performed an analysis of alternative promoters with these data. From these data, we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of the discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed 96 of difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
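The definition of the 1st and 2nd promoters by tag count can be illustrated with a short sketch. The data layout, the function name `rank_promoters`, and the tie-breaking rule (alphabetical promoter identifier) are assumptions for illustration only and are not taken from DBTSS.

```python
from collections import defaultdict


def rank_promoters(tag_counts):
    """Return {gene: (first_promoter, second_promoter_or_None)}.

    tag_counts maps (gene, promoter_id) -> number of TSS tags.
    The promoter with the most tags is the 1st promoter, the runner-up
    is the 2nd promoter; ties are broken by promoter_id (an assumption).
    """
    by_gene = defaultdict(list)
    for (gene, promoter), n_tags in tag_counts.items():
        by_gene[gene].append((n_tags, promoter))

    ranked = {}
    for gene, promoters in by_gene.items():
        promoters.sort(key=lambda item: (-item[0], item[1]))
        first = promoters[0][1]
        second = promoters[1][1] if len(promoters) > 1 else None
        ranked[gene] = (first, second)
    return ranked


# Hypothetical tag counts for two genes.
example = {
    ("GENE1", "p1"): 50,
    ("GENE1", "p2"): 120,
    ("GENE1", "p3"): 5,
    ("GENE2", "p1"): 7,
}
ranking = rank_promoters(example)
```

Running the same ranking on tag counts from two cell types and comparing the resulting pairs gene by gene is one way to count discrepancies of the kind reported above.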

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analyses of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but that two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
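As a rough illustration of how a periodicity can be detected in sequence, the sketch below evaluates a single Fourier component of a dinucleotide indicator signal. It assumes a binary occurrence signal for one dinucleotide and a direct (non-FFT) sum; the function names and the toy sequence are hypothetical, and the procedure actually used in the study may differ.

```python
import cmath


def dinucleotide_signal(sequence, dinucleotide):
    """1.0 where the dinucleotide starts at position i, else 0.0."""
    return [1.0 if sequence[i:i + 2] == dinucleotide else 0.0
            for i in range(len(sequence) - 1)]


def power_at_period(signal, period):
    """Power of the mean-centred signal at the given period (in bp)."""
    n = len(signal)
    mean = sum(signal) / n
    component = sum((x - mean) * cmath.exp(-2j * cmath.pi * i / period)
                    for i, x in enumerate(signal))
    return abs(component) ** 2 / n


# Toy sequence in which "AA" recurs exactly every 10 bp.
toy = ("AA" + "C" * 8) * 20
signal = dinucleotide_signal(toy, "AA")
p10 = power_at_period(signal, 10)
p7 = power_at_period(signal, 7)
```

In this toy case the power at period 10 is far larger than at an unrelated period such as 7; scanning `period` over a range of values gives a simple periodogram.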

9. Estimation and Comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by the reduction of the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular function instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells are. Then, the organism-specific pathway maps are detected as those that can synthesize compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimental essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda5 and K. Kinoshita. 5National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step to realize complex biological functions; therefore, understanding of protein-protein interfaces will give us a clue to predict protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, that is, assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indexes enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indexes could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi6, M. Saeki6, H. Ohta6, K. Kinoshita. 6Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer working with typical web browsers, so that the different binding modes can be checked visually.
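The idea of mapping binding sites from several complexes onto one protein can be sketched as below. This is a toy illustration under stated assumptions: the input format (partner name plus a set of contacting residue numbers per complex), the function name `merge_binding_sites`, and the example data are hypothetical and do not reflect the PiSite schema.

```python
from collections import defaultdict


def merge_binding_sites(complexes):
    """Merge per-complex binding sites of one protein.

    complexes: iterable of (partner_name, residue_numbers) pairs, one per
               complex in which the protein of interest appears.
    Returns {residue_number: set of partners contacting that residue}.
    """
    partners_by_residue = defaultdict(set)
    for partner, residues in complexes:
        for residue in residues:
            partners_by_residue[residue].add(partner)
    return dict(partners_by_residue)


# Hypothetical hub protein observed in three complexes.
observed = [
    ("partnerA", {10, 11, 12}),
    ("partnerB", {12, 13}),
    ("partnerA", {11, 40}),
]
sites = merge_binding_sites(observed)
shared = sorted(r for r, partners in sites.items() if len(partners) > 1)
```

Residues contacted by more than one partner (here collected in `shared`) are the ones a single complex structure would fail to characterize fully.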

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura7 and K. Kinoshita. 7Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures could contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method to discriminate between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately, as protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important to obtain high-resolution structural information and to understand the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura8, Kengo Kinoshita, Nobutoshi Ito8. 8Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, which is an enzyme with ribonuclease activity and no PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi6, M. Shibaoka6, M. Saeki6, H. Ohta6, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.
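A list of highly coexpressed genes for a query gene, network type (i) above, can be sketched as follows. Pearson correlation is used here purely as a familiar stand-in: the text does not state which coexpression measure the database uses, and the function names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def top_coexpressed(profiles, query, k=3):
    """Return the k genes whose profiles correlate best with the query gene."""
    scores = [(pearson(profiles[query], profile), gene)
              for gene, profile in profiles.items() if gene != query]
    scores.sort(reverse=True)
    return [gene for _, gene in scores[:k]]


# Hypothetical expression values across four samples.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
best = top_coexpressed(profiles, "g1", k=1)
```

Applying the same function to orthologous genes in two species and comparing the returned lists gives a crude view of how well coexpression is conserved.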

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol and compared the trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉,木下賢吾,伊藤暢聡.ペプチジルプロリルイソメラーゼの構造機能相関.蛋白質核酸酵素 54:167―172,2009.

木下賢吾.立体構造からのタンパク質機能予測 現状と展望.遺伝子医学MOOK 14号,in press.

中井謙太,ポールホートン.第3章 3 アミノ酸配列に基づくタンパク質の細胞内局在予測.実験医学増刊 vol. 26:1106―1112,2008.

中井謙太.タンパク質のシステム生物学.猪飼,伏見,卜部,上野川,中村,浜窪編,タンパク質の事典,朝倉書店,575―578,2008.


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study on research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We have interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After the fabrication in Hwang Woo-suk's stem cell cloning research and the unethical collection of human eggs, the bioethics law was revised, and the government is seeking stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing in terms of regulations in East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, the IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey on past ethical reviews and a comparative study on research ethics review systems in the US, the UK and South Korea, we examined our current problems, which tend to obstruct a smooth research review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

3. Ethical, legal and social support for "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted our questionnaire survey of participants of the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We have found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain, encourage and thank participants each time, and participants recall their will to contribute.

To appreciate participants' and MCs' contribution to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of BioBank newsletters are available only to sighted people with good eyesight, we are making efforts toward personalized information security to accommodate the disabilities of participants.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim for the sharing of public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in the IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008



Eguchi, H., Ohigashi, H., Ishikawa, O., Shinomura, Y., Imai, K., Nakamura, Y., and Nakagawa, H. Overexpression of cysteine proteinase inhibitor cystatin 6 promotes pancreatic cancer growth. Cancer Sci. 99: 1626-1632, 2008.

23. Study Group of Millennium Genome Project for Cancer, Sakamoto, H., Yoshimura, K., Saeki, N., Katai, H., Shimoda, T., Matsuno, Y., Saito, D., Sugimura, H., Tanioka, F., Kato, S., Matsukura, N., Matsuda, N., Nakamura, T., Hyodo, I., Nishina, T., Yasui, W., Hirose, H., Hayashi, M., Toshiro, E., Ohnami, S., Sekine, A., Sato, Y., Totsuka, H., Ando, M., Takemura, R., Takahashi, Y., Ohdaira, M., Aoki, K., Honmyo, I., Chiku, S., Aoyagi, K., Sasaki, H., Ohnami, S., Yanagihara, K., Yoon, K.A., Kook, M.C., Lee, Y.S., Park, S.R., Kim, C.G., Choi, I.J., Yoshida, T., Nakamura, Y., and Hirohashi, S. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat. Genet. 40: 730-740, 2008.

24. Ueki, T., Nishidate, T., Park, J.H., Lin, M.L., Shimo, A., Hirata, K., Nakamura, Y., and Katagiri, T. Involvement of elevated expression of multiple cell-cycle regulator, DTL/RAMP (denticleless/RA-regulated nuclear matrix associated protein), in the growth of breast cancer cells. Oncogene 27: 5672-5683, 2008.

25. Miyamoto, Y., Shi, D., Nakajima, M., Ozaki, K., Sudo, A., Kotani, A., Uchida, A., Tanaka, T., Fukui, N., Tsunoda, T., Takahashi, A., Nakamura, Y., Jiang, Q., and Ikegawa, S. Common variants in DVWA on chromosome 3p24.3 are associated with susceptibility to knee osteoarthritis. Nat. Genet. 40: 994-998, 2008.

26. Unoki, H., Takahashi, A., Kawaguchi, T., Hara, K., Horikoshi, M., Andersen, G., Ng, D.P., Holmkvist, J., Borch-Johnsen, K., Jorgensen, T., Sandbaek, A., Lauritzen, T., Hansen, T., Nurbaya, S., Tsunoda, T., Kubo, M., Babazono, T., Hirose, H., Hayashi, M., Iwamoto, Y., Kashiwagi, A., Kaku, K., Kawamori, R., Tai, E.S., Pedersen, O., Kamatani, N., Kadowaki, T., Kikkawa, R., Nakamura, Y., and Maeda, S. SNPs in KCNQ1 are associated with susceptibility to type 2 diabetes in East Asian and European populations. Nat. Genet. 40: 1098-1102, 2008.

27. Harao, M., Hirata, S., Irie, A., Senju, S., Nakatsura, T., Komori, H., Ikuta, Y., Yokomine, K., Imai, K., Inoue, M., Harada, K., Mori, T., Tsunoda, T., Nakatsuru, S., Daigo, Y., Nomori, H., Nakamura, Y., Baba, H., and Nishimura, Y. HLA-A2-restricted CTL epitopes of a novel lung cancer-associated cancer testis antigen, cell division cycle associated 1, can induce tumor-reactive CTL. Int. J. Cancer 123: 2616-2625, 2008.

28. Imai, K., Hirata, S., Irie, A., Senju, S., Ikuta, Y., Yokomine, K., Harao, M., Inoue, M., Tsunoda, T., Nakatsuru, S., Nakagawa, H., Nakamura, Y., Baba, H., and Nishimura, Y. Identification of a novel tumor-associated antigen, cadherin 3/P-cadherin, as a possible target for immunotherapy of pancreatic, gastric, and colorectal cancers. Clin. Cancer Res. 14: 6487-6495, 2008.

29. Nikolova, D.N., Zembutsu, H., Sechanov, T., Vidinov, K., Kee, L.S., Ivanova, R., Becheva, E., Kocova, M., Toncheva, D., and Nakamura, Y. Identification of molecular targets for treatment of thyroid carcinoma. Oncol. Rep. 20: 105-121, 2008.

30. Nakamura, Y. Pharmacogenomics and drug toxicity (Editorial). New Eng. J. Med. 359: 856-858, 2008.

31. Arita, K., Ariyoshi, M., Tochio, H., Nakamura, Y., and Shirakawa, M. Hemi-methylated DNA recognition by the SRA protein Np95 via a base flipping mechanism. Nature 455: 818-821, 2008.

32. Inoue, H., Iga, M., Nabeta, H., Yokoo, T., Suehiro, Y., Okano, S., Inoue, M., Kinoh, H., Katagiri, T., Takayama, K., Yonemitsu, Y., Hasegawa, M., Nakamura, Y., Nakanishi, Y., and Tani, K. Non-transmissible SeV encoding GM-CSF is a novel and potent vector system to produce autologous tumor vaccines. Cancer Sci. 99: 2315-2326, 2008.

33. Konda, R., Sugimura, J., Sohma, F., Katagiri, T., Nakamura, Y., Fujioka, T. Over expression of hypoxia-inducible protein 2, hypoxia-inducible factor-1α, and nuclear factor κB is putatively involved in acquired renal cyst formation and subsequent tumor transformation in patients with end stage renal failure. J. Urol. 180: 481-485, 2008.

34. Hotta, K., Nakata, Y., Matsuo, T., Kamohara, S., Kotani, K., Komatsu, R., Itoh, N., Mineo, I., Wada, J., Masuzaki, H., Yoneda, M., Nakajima, A., Miyazaki, S., Tokunaga, K., Kawamoto, M., Funahashi, T., Hamaguchi, K., Yamada, K., Hanafusa, T., Oikawa, S., Yoshimatsu, H., Nakao, K., Sakata, T., Matsuzawa, Y., Tanaka, K., Kamatani, N., and Nakamura, Y. Variations in the FTO gene are associated with severe obesity in the Japanese. J. Hum. Genet. 53: 546-553, 2008.

35. Kato, M., Nakamura, Y., and Tsunoda, T. An algorithm for inferring complex haplotypes in a region of copy-number variation. Am. J. Hum. Genet. 83: 157-169, 2008.

36. Kato, M., Nakamura, Y., and Tsunoda, T. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24: 1645-1646, 2008.

37. Yasuda K, Miyake K, Horikawa Y, Hara K, Osawa H, Furuta H, Hirota Y, Mori H, Jonsson A, Sato Y, Yamagata K, Hinokio Y, Wang HY, Tanahashi T, Nakamura N, Oka Y, Iwasaki N, Iwamoto Y, Yamada Y, Seino Y, Maegawa H, Kashiwagi A, Takeda J, Maeda E, Shin HD, Cho YM, Park KS, Lee HK, Ng MC, Ma RC, So WY, Chan JC, Lyssenko V, Tuomi T, Nilsson P, Groop L, Kamatani N, Sekine A, Nakamura Y, Yamamoto K, Yoshida T, Tokunaga K, Itakura M, Makino H, Nanjo K, Kadowaki T, and Kasuga M. Variants in KCNQ1 are associated with susceptibility to type 2 diabetes mellitus. Nat Genet 40: 1092-1097, 2008.

38. Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, and Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: Effects on population-based association studies. Am J Hum Genet 83: 445-456, 2008.

39. Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 polymorphism and rheumatoid arthritis susceptibility: A replication study in a Japanese population and a metaanalysis. J Rheumatol 35: 1723-1728, 2008.

40. Omori S, Tanaka Y, Takahashi A, Hirose H, Kashiwagi A, Kaku K, Kawamori R, Nakamura Y, and Maeda S. Association of CDKAL1, IGF2BP2, CDKN2A/B, HHEX, SLC30A8, and KCNJ11 with susceptibility to type 2 diabetes in a Japanese population. Diabetes 57: 791-795, 2008.

41. Misawa K, Fujii S, Yamazaki T, Takahashi A, Takasaki J, Yanagisawa M, Ohnishi Y, Nakamura Y, and Kamatani N. New correction algorithms for multiple comparisons in case-control multilocus association studies based on haplotypes and diplotype configurations. J Hum Genet 53: 789-801, 2008.

42. Chantarangsu S, Mushiroda T, Mahasirimongkol S, Kiertiburanakul S, Sungkanuparph S, Manosuthi W, Tantisiriwat W, Charoenyingwattana A, Sura T, Chantratita W, and Nakamura Y. HLA-B*3505 allele is a strong predictor for nevirapine-induced skin adverse drug reactions in Thai HIV-infected patients. Pharmacogenet Genomics 19: 139-146, 2009.

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44. Yamazaki K, Takahashi A, Takazoe M, Kubo M, Onouchi Y, Fujino A, Kamatani N, Nakamura Y, and Hata A. Positive association of genetic variants in the upstream region of NKX2-3 with Crohn's disease in Japanese patients. Gut 58: 228-232, 2009.

45. Nikolova DN, Doganov N, Dimitrov R, Angelov K, Kee LS, Dimova I, Toncheva D, Nakamura Y, and Zembutsu H. Genome-wide gene expression profiles of ovarian carcinoma: identification of molecular targets for treatment of ovarian carcinoma. Mol Med Rep, in press, 2008.

46. Hotta K, Nakamura M, Nakata Y, Matsuo T, Kamohara S, Kotani K, Komatsu R, Itoh N, Mineo I, Wada J, Masuzaki H, Yoneda M, Nakajima A, Miyazaki S, Tokunaga K, Kawamoto M, Funahashi T, Hamaguchi K, Yamada K, Hanafusa T, Oikawa S, Yoshimatsu H, Nakao K, Sakata T, Matsuzawa Y, Tanaka K, Kamatani N, and Nakamura Y. INSIG2 gene rs7566605 polymorphism is associated with severe obesity in Japanese. J Hum Genet 53: 857-862, 2008.

47. Iwahori K, Osaki T, Serada S, Fujimoto M, Suzuki H, Kishi Y, Yokoyama A, Hamada H, Fujii Y, Yamaguchi K, Hirashima T, Matsui K, Tachibana I, Nakamura Y, Kawase I, and Naka T. Megakaryocyte potentiating factor as a tumor marker of malignant pleural mesothelioma: Evaluation in comparison with mesothelin. Lung Cancer 62: 45-54, 2008.

48. Hirota T, Harada M, Sakashita M, Doi S, Miyatake A, Fujita K, Enomoto T, Ebisawa M, Yoshihara S, Noguchi E, Saito H, Nakamura Y, and Tamari M. Genetic polymorphism regulating ORM1-like 3 (Saccharomyces cerevisiae) expression is associated with childhood atopic asthma in a Japanese population. J Allergy Clin Immunol 121: 769-770, 2008.

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF, and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K, and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51. Hirata D, Yamabuki T, Miki D, Ito T, Tsuchiya E, Fujita M, Hosokawa M, Chayama K, Nakamura Y, and Daigo Y. Involvement of epithelial cell transforming sequence-2 oncoantigen in lung and esophageal cancer progression. Clin Cancer Res 15: 256-266, 2009.

52. Dobashi S, Katagiri T, Hirota E, Ashida S, Daigo Y, Shuin T, Fujioka T, Miki T, and Nakamura Y. Involvement of TMEM22 overexpression in the growth of renal cell carcinoma cells. Oncol Rep 21: 305-312, 2009.

53. Zembutsu H, Suzuki Y, Sasaki A, Tsunoda T, Okazaki M, Yoshimoto M, Hasegawa T, Hirata K, and Nakamura Y. Predicting response to Docetaxel neoadjuvant chemotherapy for advanced breast cancers through genome-wide gene expression profiling. Int J Oncol 34: 361-370, 2009.

54. Nakamura Y. DNA variations in human and medical genetics: 25 years of my experience (review). J Hum Genet 54: 1-8, 2009.

55. Ozaki K, Sato H, Inoue K, Tsunoda T, Sakata Y, Mizuno H, Lin T-H, Miyamoto Y, Aoki A, Onouchi Y, Sheu S-H, Ikegawa S, Odashiro K, Nobuyoshi M, Juo S-H H, Hori M, Nakamura Y, and Tanaka T. A functional variation in BRAP confers risk of myocardial infarction in Asian populations. Nat Genet, in press, 2009.

56. Kashiwaya K, Hosokawa M, Eguchi H, Ohigashi H, Ishikawa O, Shinomura Y, Nakamura Y, and Nakagawa H. Identification of C2orf18, termed ANT2BP (ANT2-binding protein), as one of key molecules involved in pancreatic carcinogenesis. Cancer Sci 100: 457-464, 2009.

57. Nagayama S, Yamada E, Kohno Y, Aoyama T, Fukukawa C, Kubo H, Watanabe G, Katagiri T, Nakamura Y, Sakai Y, and Toguchida J. Inverse correlation of the upregulation of FZD10 expression and the activation of β-catenin in synchronous colorectal tumors. Cancer Sci, in press, 2009.

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y, and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59. The International Warfarin Pharmacogenetics Consortium. Improved warfarin dosing with a global pharmacogenetic algorithm. N Engl J Med 360: 753-764, 2009.

60. Betcheva ET, Mushiroda T, Takahashi A, Kubo M, Karachanak SK, Zaharieva IT, Vazharova RV, Dimova II, Milanova VK, Tolev T, Kirov G, Owen MJ, O'Donovan MC, Kamatani N, Nakamura Y, and Toncheva DI. Case-control association study of 59 candidate genes reveals the DRD2 SNP rs6277 (C957T) as the only susceptibility factor for schizophrenia in Bulgarian population. J Hum Genet 54: 98-107, 2009.

61. Fukukawa C, Nagayama S, Tsunoda T, Toguchida J, Nakamura Y, and Katagiri T. Activation of non-canonical Dvl-Rac1-JNK pathway by Frizzled-homologue 10 (FZD10) in human synovial sarcoma. Oncogene, in press, 2009.

62. Yosifova A, Mushiroda T, Stoianov D, Vazharova R, Dimova I, Karachanak S, Zaharieva I, Milanova V, Madjirova N, Gerdjikov I, Tolev T, Velkova S, Kirov G, Owen MJ, O'Donovan MC, Toncheva D, and Nakamura Y. Case-control association study of 65 candidate genes revealed a possible association of a SNP of HTR5A to be a factor susceptible to bipolar disease in Bulgarian population. J Affective Disorders, in press, 2009.

63. Kamatani Y, Wattanapokayakit S, Ochi H, Kawaguchi T, Takahashi A, Hosono N, Kubo M, Tsunoda T, Kamatani N, Kumada H, Puseenam A, Sura T, Daigo Y, Chayama K, Chantratita W, Nakamura Y, and Matsuda K. Identification of association of genetic variations in HLA-DP locus with chronic hepatitis B in Asian population through genome-wide association study. Nat Genet, in press, 2009.

64. Tamura K, Furihata M, Chung S, Uemura M, Yoshioka H, Iiyama T, Ashida S, Nasu Y, Fujioka T, Shuin T, Nakamura Y, and Nakagawa H. Stanniocalcin 2 (STC2) over-expression in castration-resistant prostate cancer and aggressive prostate cancer. Cancer Sci, in press, 2009.

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y, and Chayama K; Hiroshima Liver Study Group; Toranomon Hospital. A Polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y, and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We first decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design, characterize each problem and propose solutions to the individual problems, and then attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies on complex genetic traits target relatively weak factors, which means that their sample sizes should exceed several thousand; this in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One of the methods to correct the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
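For orientation, the standard genomic control correction for 1-df chi-square statistics can be sketched as below. This is only the conventional method, not the extension to Fisher's exact test described above, whose details are not given in this report. The function names and the decision to cap the inflation factor at 1 are assumptions made for illustration.

```python
import math
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364231195724


def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for a list of 1-df chi-square values."""
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # assumption: never inflate statistics when lambda < 1
    return lam, [x / lam for x in chi2_stats]


def chi2_1df_pvalue(x):
    """Upper-tail p-value of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(x / 2.0))
```

The inflation factor is estimated from the median because the median is robust to the small number of truly associated markers expected in a genome-wide scan.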

Human Genome Center

Laboratory of Functional Genomics

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
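The discrepancy between the two tests can be seen on a small table with the following sketch. It compares the Pearson chi-square p-value (2 degrees of freedom) with an exact p-value obtained by enumerating all tables with the same margins. The probability-ordered exact test used here (Fisher-Freeman-Halton style) is an assumption; the report does not state which exact test is meant, and the function names are hypothetical.

```python
import math


def chi2_stat(table):
    """Pearson chi-square statistic of a 2x3 table [[a, b, c], [d, e, f]]."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            expected = r * c / n
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


def asymptotic_p(table):
    """Asymptotic p-value; the 2-df chi-square upper tail is exp(-x/2)."""
    return math.exp(-chi2_stat(table) / 2.0)


def exact_p(table):
    """Exact p-value: total probability of tables no more likely than the observed one."""
    (a, b, c), _ = table
    rows = [sum(r) for r in table]
    cols = [sum(col) for col in zip(*table)]
    n = sum(rows)

    def prob(x0, x1, x2):
        # Multivariate hypergeometric probability of the first row.
        ways = math.comb(cols[0], x0) * math.comb(cols[1], x1) * math.comb(cols[2], x2)
        return ways / math.comb(n, rows[0])

    p_obs = prob(a, b, c)
    total = 0.0
    for x0 in range(min(cols[0], rows[0]) + 1):
        for x1 in range(min(cols[1], rows[0] - x0) + 1):
            x2 = rows[0] - x0 - x1
            if 0 <= x2 <= cols[2]:
                p = prob(x0, x1, x2)
                if p <= p_obs * (1 + 1e-9):  # tolerance for floating-point ties
                    total += p
    return total
```

For the tiny table `[[2, 0, 0], [0, 1, 1]]` the asymptotic p-value is exp(-2), about 0.135, while the exact p-value is 1/3, which shows how far apart the two can be when counts are small.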

c. Geometric evaluation of SNP contingency table tests

We describe 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which uses the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pair-wise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, for use in various studies based on genetic heterogeneity.
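As a point of reference, the conventional pair-wise measures mentioned above (D, D' and r²) can be computed from phased two-marker haplotypes as in the sketch below. It does not show the multi-marker method under development. The 0/1 allele encoding and the function name are assumptions for illustration.

```python
def pairwise_ld(haplotypes):
    """Return (D, D', r^2) for a list of (allele_at_marker_A, allele_at_marker_B) pairs.

    Alleles are encoded as 0 or 1 (an assumed encoding).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2
```

Two perfectly coupled markers give D' = r² = 1, whereas markers whose alleles combine freely give D = 0.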

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to handle haplotype data from the standpoint of combinatorial theory and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp/

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.


Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1723-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles contain some shared structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus our attention not only on the presence of TFBSs but also on their orientation and their positioning with regard to the transcription start site and relative to each other in pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them to make a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe², and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental result with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
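The two quantities referred to above, G+C content and CpG richness, are commonly measured per sequence window as the G+C fraction and the observed/expected CpG ratio. The sketch below computes both; the function name is hypothetical, and the thresholds used to call an island in this study are not given in the report, so none are applied here.

```python
def cpg_stats(window):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA window."""
    seq = window.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_content, obs_exp
```

Applying this function to windows centered at increasing distances from a TSS would give the kind of positional profile needed to judge how narrow the CpG-rich range is.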

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³, and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent the blueprint from giving further information on whether regulation is direct or indirect. In this study, we are computationally detecting direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we are predicting the binding sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establishes the body plan but also the peripheral components that actually make the building blocks of the body.
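The scanning step described above can be sketched as follows: every window of a promoter sequence is scored against a position weight matrix, and windows scoring at or above a threshold are reported as candidate binding sites. The ORI-based choice of the threshold is not reproduced here; the threshold is simply a parameter, and the matrix layout (a list of per-position dictionaries) and function names are assumptions.

```python
def score_window(pwm, window):
    """Sum the per-position weights; return None if the window has a non-ACGT base."""
    total = 0.0
    for column, base in zip(pwm, window):
        weight = column.get(base)
        if weight is None:
            return None
        total += weight
    return total


def scan_promoter(pwm, sequence, threshold):
    """Return (position, score) for every window scoring at or above the threshold."""
    seq = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        score = score_window(pwm, seq[start:start + width])
        if score is not None and score >= threshold:
            hits.append((start, score))
    return hits
```

A gene would then be treated as a candidate direct target of a factor when its promoter contains at least one hit; a full analysis would also scan the reverse strand, which is omitted here for brevity.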

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴, and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated for each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
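The way a pseudocount enters a PWM can be illustrated with the sketch below, which follows the description above: the pseudocount is split across the four bases according to the background composition and added to the observed counts before taking the log ratio. The uniform background, the base-2 logarithm, and the function name are assumptions for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=0.8, background=None):
    """Build a log-odds PWM from aligned binding sites of equal length."""
    bg = background or {b: 0.25 for b in BASES}  # assumed uniform background
    n = len(sites)
    width = len(sites[0])
    pwm = []
    for i in range(width):
        column = {}
        for b in BASES:
            count = sum(1 for s in sites if s[i] == b)
            # The pseudocount is shared out in proportion to the background.
            freq = (count + pseudocount * bg[b]) / (n + pseudocount)
            column[b] = math.log2(freq / bg[b])
        pwm.append(column)
    return pwm
```

With a pseudocount of zero, any base never seen at a position would receive a weight of minus infinity; the pseudocount keeps such weights finite, and its size controls how strongly unseen bases are penalized.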

7. Definition and analysis of alternative promoters using a large amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, and Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75918 promoters. These promoters could be classified into 36251 in gene regions and 39667 in intergenic regions. The former, intragenic promoters corresponded to 14307 genes, of which 5428 have one promoter and 8879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed 96 of difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in downstream regions of 1st promoters.
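The definition of 1st and 2nd promoters by tag count can be sketched as below, together with one possible reading of "discrepancy" between two samples: the percentage of genes whose ordered (1st, 2nd) promoter pair differs. That reading, the data layout, and the function names are assumptions; the report does not define the measure precisely.

```python
def top_two(tag_counts):
    """Return the (1st, 2nd) promoter ids of one gene, ranked by tag count.

    tag_counts maps promoter id -> number of tags; ties are broken by id.
    """
    ranked = sorted(tag_counts, key=lambda p: (-tag_counts[p], p))
    return tuple(ranked[:2])


def discrepancy_percent(sample_a, sample_b):
    """Percentage of shared multi-promoter genes whose (1st, 2nd) pair differs.

    Each sample maps gene id -> {promoter id: tag count}.
    """
    genes = [g for g in sample_a
             if g in sample_b and len(sample_a[g]) >= 2 and len(sample_b[g]) >= 2]
    if not genes:
        return 0.0
    differing = sum(1 for g in genes if top_two(sample_a[g]) != top_two(sample_b[g]))
    return 100.0 * differing / len(genes)
```

Comparing this percentage between different cell types and between conditions of the same cell type corresponds to the contrast drawn in the paragraph above.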

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita, and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
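A minimal sketch of how a dinucleotide periodicity can be detected by Fourier analysis is given below. The sequence is converted to a 0/1 signal marking where a chosen dinucleotide starts, and the spectral power at a given period is computed directly. The choice of dinucleotides, the normalization, and the function name are assumptions, not the settings used in this study.

```python
import cmath
import math


def periodicity_power(sequence, dinucleotides, period):
    """Spectral power of the dinucleotide occurrence signal at the given period (bp)."""
    seq = sequence.upper()
    signal = [1.0 if seq[i:i + 2] in dinucleotides else 0.0
              for i in range(len(seq) - 1)]
    mean = sum(signal) / len(signal)
    # Single-frequency discrete Fourier transform of the mean-centered signal.
    total = sum((v - mean) * cmath.exp(-2j * math.pi * k / period)
                for k, v in enumerate(signal))
    return abs(total) ** 2 / len(signal)
```

Evaluating this function over a range of periods for a genome fragment, before and after masking repeats, gives a spectrum in which peaks such as those at 10, 84 or 167 bp could be looked for.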

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by the reduction of the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular function instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells are. Then, the organism-specific pathway maps are detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, and K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.
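To make the idea of a coexpression measure concrete, the sketch below computes the Pearson correlation between expression profiles and, on top of it, a rank-based score: the geometric mean of the rank of gene B among the partners of gene A and the rank of A among the partners of B. The report does not name the new measure, so treating it as such a mutual rank is an assumption, as are the data layout and function names.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def mutual_rank(expression, gene_a, gene_b):
    """Geometric mean of the two directional correlation ranks (1 = strongest)."""

    def rank_of(target, reference):
        scored = sorted(
            ((pearson(expression[reference], expression[g]), g)
             for g in expression if g != reference),
            key=lambda item: (-item[0], item[1]),
        )
        return [g for _, g in scored].index(target) + 1

    return math.sqrt(rank_of(gene_b, gene_a) * rank_of(gene_a, gene_b))
```

A rank-based score of this kind depends less on the overall distribution of correlation values for each gene than the raw correlation does, which is one reason such scores are used when drawing gene networks.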

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida, and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, taking multiple complexes into consideration at the same time by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷, and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting crystal structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method to discriminate between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities of hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida, and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased along with this reduction. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below SVR = 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures in the absence of their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important both for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. We developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for it. The method predicts the disorder tendency of each residue using support vector machines that take as input the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.
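As an illustration of the meta approach described above, the following minimal Python sketch stacks the per-residue outputs of several predictors into one feature vector per residue and scores each vector with a linear decision function of the form w·x + b, which is the form a linear support vector machine yields after training. The predictor names, weights, bias and threshold are hypothetical placeholders, not those of the published method; a real meta-predictor would learn them from annotated proteins.

```python
from typing import Dict, List, Sequence


def build_feature_vectors(predictions: Dict[str, Sequence[float]]) -> List[List[float]]:
    """Stack per-residue scores from several predictors into one vector per residue."""
    names = sorted(predictions)  # fix the predictor order so vectors are comparable
    lengths = {len(predictions[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("all predictors must score the same number of residues")
    n_residues = lengths.pop()
    return [[float(predictions[name][i]) for name in names] for i in range(n_residues)]


def linear_decision(features: List[List[float]],
                    weights: Sequence[float],
                    bias: float) -> List[float]:
    """Evaluate w.x + b per residue (hypothetical weights stand in for a trained SVM)."""
    return [sum(w * x for w, x in zip(weights, row)) + bias for row in features]


def call_disorder(scores: Sequence[float], threshold: float = 0.0) -> List[bool]:
    """Label a residue as disordered when its decision value exceeds the threshold."""
    return [score > threshold for score in scores]


if __name__ == "__main__":
    # Seven hypothetical predictors, each scoring the same four residues in [0, 1].
    preds = {f"predictor_{k}": [0.9, 0.8, 0.2, 0.1] for k in range(1, 8)}
    vectors = build_feature_vectors(preds)
    # Equal weights and a bias of -0.5 reduce to "mean score above 0.5".
    decision = linear_decision(vectors, [1 / 7] * 7, -0.5)
    print(call_disorder(decision))
```

With equal weights the combiner reduces to a simple average of the predictors; the point of training a classifier on top of them is to replace these uniform weights with learned ones.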

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita and Nobutoshi Ito⁸ (⁸Tokyo Medical and Dental University)

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configuration of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not previously been considered. In addition, we created a similar hole in barnase, a ribonuclease that has no PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing human and mouse coexpression patterns to estimate the conservation between the two species.
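The idea of listing highly coexpressed genes for a query gene can be sketched as follows. This is a minimal illustration that assumes the Pearson correlation coefficient between expression profiles as the coexpression measure; the gene names and expression values are hypothetical, and the sketch is not a description of how COXPRESdb itself is implemented.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2 samples)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no coexpression signal
    return sxy / math.sqrt(sxx * syy)


def top_coexpressed(query: str,
                    profiles: Dict[str, Sequence[float]],
                    k: int = 3) -> List[Tuple[str, float]]:
    """Return the k genes whose profiles correlate most strongly with the query gene."""
    ranked = sorted(
        ((gene, pearson(profiles[query], profile))
         for gene, profile in profiles.items() if gene != query),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical expression values across four samples.
    profiles = {
        "geneA": [1.0, 2.0, 3.0, 4.0],
        "geneB": [2.0, 4.0, 6.0, 8.0],
        "geneC": [4.0, 3.0, 2.0, 1.0],
        "geneD": [1.0, 3.0, 2.0, 4.0],
    }
    for gene, r in top_coexpressed("geneA", profiles, k=2):
        print(gene, round(r, 3))
```

Repeating the ranking for every gene gives a list of coexpressed partners per gene, from which a network can be drawn by linking each gene to its top-ranked partners.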

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. Through these interactions, suitable configurations of membrane molecules can generate heterogeneity in the membrane, such as lipid rafts and transportsome regions. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba H Yamashita R Kinoshita K andNakai K Weak correlation between sequenceconservation in promoter regions and inprotein-coding regions of human-mouseorthologous gene pairs BMC Genomics 9 1522008

Genome Information Integration Project and H-invitational 2 Consortium The H-InvitationalDatabase (H-InvDB) a comprehensive annota-tion resource for human genes and tran-scripts Nucl Acids Res 36 D793-D799 2008

Hatada I Morita S Kimura M Horii TYamashita R and Nakai K Genome-widedemethylation during neural differentiation ofP19 embryonal carcinoma cells J HumanGenet 53 (2) 185-191 2008

Hatanaka Y Nagasaki M Yamaguchi RObayashi T Numata K Imoto S Shima-mura T Kinoshita K Nakai K and Miy-ano S A novel strategy to search concertedtranscription factor activities using gene ex-pression profile and genomic data Genome In-formatics 20 212-221 2008

Higurashi M Ishida T and Kinoshita KPiSite a database of protein interaction sitesusing multiple binding states in the PDB Nu-cleic Acids Res 37 D360-364 2009

Ikura T Kinoshita K and Ito N A cavity withan appropriate size is the basis of the PPIaseactivity Protein Eng Des Sel 21 83-89 2008

Ishida T and Kinoshita K Prediction of disor-dered protein regions based on meta-approach Bioinformatics 24 1344-1348 2008

Maeda M and Kinoshita K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J Mol Graph Mod 27: 706-711, 2009.

Miura K Toh H Hirakawa H Sugii M Mu-rata M Nakai K Tashiro K Kuhara SAzuma Y and Shirai M Genome-wideanalysis of Chlamydophila pneumoniae gene ex-pression at the late stage of infection DNARes 15 (2) 83-91 2008

Murakami K Imanishi T Gojobori T andNakai K Two different classes of co-occurring motif pairs found by a novel visu-alization method in human promoter regionsBMC Genomics 9 (1) 112 2008

Nishida K Frith M and Nakai K Pseudo-counts for transcription factor binding sitesNucl Acids Res 37 939-944 2009 publishedonline on December 23 2008

Obayashi T Hayashi S Shibaoka M SaekiM Ohta H and Kinoshita K COXPRESdb adatabase of coexpressed gene networks inmammals Nucleic Acids Res 36 D77-82 2008

Obayashi T Hayashi S Saeki M Ohta Hand Kinoshita K ATTED-II provides coex-pressed gene networks for Arabidopsis Nu-cleic Acids Res 37 D987-991 2009

Okamura K and Nakai K Retrotranspositionas a source of new promoters Mol Biol Evol 25 (6) 1231-1238 2008

Sierro N Makita Y de Hoon M and NakaiK DBTBS a database of transcriptional regu-lation in Bacillus subtilis containing upstreamintergenic conservation information Nucl Ac-ids Res 36 D93-D96 2008

Sierro N, Li S, Suzuki Y, Yamashita R and Nakai K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota M Ishida T and Kinoshita K Effectsof surface-to-volume ratio of proteins on hy-drophilic residues decrease in occurrence andincrease in buried fraction Protein Sci 171596-1602 2008

Tsuchihara K Suzuki Y Wakaguri H IrieT Tanimoto K Hashimoto S MatsushimaK Mizushima-Sugano J Yamashita RNakai K Bentley D Esumi H and SuganoS Massive transcriptional start site analysis ofhuman genes in hypoxia cells Nucl Acids Resin press

Tsuchiya Y Nakamura H and Kinoshita KDiscrimination between biological interfacesand crystal-packing contacts Compt Biol Chem 1 99-113 2008

Vandenbon A Miyamoto Y Takimoto NKusakabe T and Nakai K Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction DNARes 15 (1) 3-11 2008

Vandenbon A and Nakai K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics. Edited by Arthur J and Ng S-K (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri H Yamashita R Suzuki YSugano S and Nakai K DBTSS DataBase ofTranscription Start Sites progress report 2008Nucl Acids Res 36 D97-D101 2008

Yamashita R Suzuki Y Takeuchi N Wak-aguri H Ueda T Sugano S and Nakai KComprehensive detection of human terminaloligo-pyrimidine (TOP) gene and analysis oftheir characteristics Nucl Acids Res 36 (11)3707-3715 2008

Kinoshita K, Kono H and Yura K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki J (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉，木下賢吾，伊藤暢聡．ペプチジルプロリルイソメラーゼの構造機能相関．蛋白質核酸酵素 54: 167-172, 2009.

木下賢吾．立体構造からのタンパク質機能予測：現状と展望．遺伝子医学MOOK 14号, in press.

中井謙太，ポール・ホートン．第3章 3. アミノ酸配列に基づくタンパク質の細胞内局在予測．実験医学増刊 vol. 26: 1106-1112, 2008.

中井謙太．タンパク質のシステム生物学．猪飼，伏見，卜部，上野川，中村，浜窪編，タンパク質の事典．朝倉書店，575-578, 2008.


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas concerning the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and unethical human egg collection, the bioethics law in South Korea was revised, and the government seeks stricter regulation of life science and healthcare. We have found some correlations among East Asian countries between political options on stem cell research and on genetic testing in terms of regulation.

2. Establishment of the Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we identified current problems that tend to stall a smooth review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy (公共政策研究分野)

Associate Professor Kaori Muto, Ph.D. (Health Sciences)
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, Master of Laws

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project, led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly value the MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We have found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotype data to each participant, but a certain number of participants responded that they now want to see their own genotype data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even when participants forget the fact that they gave consent for research, MCs explain, encourage and thank participants each time, and participants recall their will to contribute.

To show appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are accessible only to sighted people with good eyesight, we are making efforts toward personalized information security that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. The original intent of "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama I, Nagai A, Muto K, Tamakoshi A, Kokado M, Mimura K, Tanzawa T and Yamagata Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


from a mixture of copy number variationand single nucleotide polymorphism dataBioinformatics 24 1645-1646 2008

37 Yasuda K Miyake K Horikawa Y HaraK Osawa H Furuta H Hirota Y MoriH Jonsson A Sato Y Yamagata K Hi-nokio Y Wang HY Tanahashi T Naka-mura N Oka Y Iwasaki N Iwamoto YYamada Y Seino Y Maegawa H Kashi-wagi A Takeda J Maeda E Shin HDCho YM Park KS Lee HK Ng MCMa RC So WY Chan JC Lyssenko VTuomi T Nilsson P Groop L KamataniN Sekine A Nakamura Y Yamamoto KYoshida T Tokunaga K Itakura M Mak-ino H Nanjo K Kadowaki T and KasugaM Variants in KCNQ1 are associated withsusceptibility to type 2 diabetes mellitusNat Genet 40 1092-1097 2008

38 Yamaguchi-Kabata Y Nakazono K Taka-hashi A Saito S Hosono N Kubo MNakamura Y and Kamatani N Japanesepopulation structure based on SNP geno-types from 7003 individuals compared toother ethnic groups Effects on population-based association studies Am J HumGenet 83 445-456 2008

39 Okada Y Mori M Yamada R Suzuki AKobayashi K Kubo M Nakamura Y andYamamoto K SLC22A4 polymorphism andrheumatoid arthritis susceptibility A replica-tion study in a Japanese population and ametaanalysis J Rheumatol 35 1723-17282008

40 Omori S Tanaka Y Takahashi A HiroseH Kashiwagi A Kaku K Kawamori RNakamura Y and Maeda S Association ofCDKAL1 IGF2BP2 CDKN2AB HHEXSLC30A8 and KCNJ11 with susceptibility oftype 2 diabetes in a Japanese populationDiabetes 57 791-795 2008

41 Misawa K Fujii S Yamazaki T Taka-hashi A Takasaki J Yanagisawa M Oh-nishi Y Nakamura Y and Kamatani NNew correction algorithms for multiple com-parisons in case-control multilocus associa-tion studies based on haplotypes and diplo-type configurations J Hum Genet 53 789-801 2008

42 Chantarangsu S Mushiroda T Mahasiri-mongkol S Kiertiburanakul S Sungkanu-parph S Manosuthi W Tantisiriwat WCharoenyingwattana A Sura T Chan-tratita W and Nakamura Y HLA-B 3505allele is a strong predictor for nevirapine-induced skin adverse drug reactions in ThaiHIV-infected patients Pharmacogenet Genomics 19 139-146 2009

43. Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-1229, 2008.

44 Yamazaki K Takahashi A Takazoe MKubo M Onouchi Y Fujino A KamataniN Nakamura Y and Hata A Positive asso-ciation of genetic variants in the upstreamregion of NXT2-3 with Crohnrsquos disease inJapanese patients Gut 58 228-232 2009

45 Nikolova DN Doganov N Dimitrov RAngelov K Kee LS Dimova I TonchevaD Nakamura Y and Zembutsu HGenome-wide gene expression profiles ofovarian carcinoma identification of molecu-lar targets for treatment of ovarian carci-noma Mol Med Rep in press 2008

46 Hotta K Nakamura M Nakata Y Mat-suo T Kamohara S Kotani K KomatsuR Itoh N Mineo I Wada J MasuzakiH Yoneda M Nakajima A Miyazaki STokunaga K Kawamoto M Funahashi THamaguchi K Yamada K Hanafusa TOikawa S Yoshimatsu H Nakao KSakata T Matsuzawa Y Tanaka K Ka-matani N and Nakamura Y INSIG2 geners7566605 polymorphism is associated withsevere obesity in Japanese J Hum Genet53 857-862 2008

47 Iwahori K Osaki T Serada S FujimotoM Suzuki H Kishi Y Yokoyama A Ha-mada H Fujii Y Yamaguchi KHirashima T Matsui K Tachibana INakamura Y Kawase I and Naka TMegakaryocyte potentiating factor as a tu-mor maker of malignant pleural mesothe-lioma Evaluation in comparison with meso-thelin Lung Cancer 62 45-54 2008

48 Hirota T Harada M Sakashita M DoiS Miyatake A Fujita K Enomoto TEbisawa M Yoshihara S Noguchi ESaito H Nakamura Y and Tamari M Ge-netic polymorphism regulating ORM1-like 3(Saccharomyces cerevisiae) expression is as-sociated with childhood atopic asthma in aJapanese population J Allergy Clin Immu-nol 121 769-770 2008

49. Harada M, Hirota T, Jodo AI, Doi S, Kameda M, Fujita K, Miyatake A, Enomoto T, Noguchi E, Yoshihara S, Ebisawa M, Saito H, Matsumoto K, Nakamura Y, Ziegler SF and Tamari M. Functional analysis of the Thymic Stromal Lymphopoietin Variants in Human Bronchial Epithelial Cells. Am J Respir Cell Mol Biol 40: 368-374, 2009.

50. Sakashita M, Yoshimoto T, Hirota T, Harada M, Okubo K, Osawa Y, Fujieda S, Nakamura Y, Yasuda K, Nakanishi K and Tamari M. Association of serum IL-33 level and the IL-33 genetic variant with Japanese cedar pollinosis. Clin Exp Allergy 38: 1875-1881, 2008.

51 Hirata D Yamabuki T Miki D Ito TTsuchiya E Fujita M Hosokawa MChayama K Nakamura Y and Daigo YInvolvement of epithelial cell transformingsequence-2 oncoantigen in lung and esopha-geal cancer progression Clin Cancer Res15 256-266 2009

52 Dobashi S Katagiri T Hirota E AshidaS Daigo Y Shuin T Fujioka T Miki Tand Nakamura Y Involvement of TMEM22overexpression in the growth of renal cellcarcinoma cells Oncol Rep 21 305-3122009

53 Zembutsu H Suzuki Y Sasaki ATsunoda T Okazaki M Yoshimoto MHasegawa T Hirata K and Nakamura YPredicting response to Docetaxel neoadju-vant chemotherapy for advanced breast can-cers through genome-wide gene expressionprofiling Int J Oncol 34 361-370 2009

54 Nakamura Y DNA variations in humanand medical genetics 25 years of my experi-ence (review) J Hum Genet 54 1-8 2009

55 Ozaki K Sato H Inoue K Tsunoda TSakata Y Mizuno H Lin T-H Mi-yamoto Y Aoki A Onouchi Y Sheu S-H Ikegawa S Odashiro K NobuyoshiM Juo S-H H Hori M Nakamura Yand Tanaka TA functional variation inBRAP confers risk of myocardial infarctionin Asian populations Nat Genet in press2009

56 Kashiwaya K Hosokawa M Eguchi HOhigashi H Ishikawa O Shinomura YNakamura Y and Nakagawa H Identifica-tion of C2orf18 Termed ANT2BP (ANT2-binding protein) as one of key molecules in-volved in pancreatic carcinogenesis CancerSci 100 457-464 2009

57 Nagayama S Yamada E Kohno YAoyama T Fukukawa C Kubo HWatanabe G Katagiri T Nakamura YSakai Y and Toguchida J Inverse correla-tion of the upregulation of FZD10 expres-sion and the activation of β-catenin in syn-chronous colorectal tumors Cancer Sci inpress 2009

58. Ueda K, Fukase Y, Katagiri T, Ishikawa N, Irie S, Sato T, Ito H, Nakayama H, Miyagi Y, Tsuchiya E, Kohno N, Shiwa M, Nakamura Y and Daigo Y. Targeted glycoproteomics for the discovery of lung cancer-associated glycosylation disorders using lectin-coupled ProteinChip arrays. Proteomics, in press, 2009.

59 The International Warfarin Pharmacogenet-ics Consortium Improved warfarin dosingwith a global pharmacogenetic algorithm NEngl J Med 360 753-764 2009

60 Betcheva ET Mushiroda T Takahashi AKubo M Karachanak SK Zaharieva ITVazharova RV Dimova II Milanova VK Tolev T Kirov G Owenm MJOrsquoDonovanm MC Kamatanim N Naka-mura Y and Toncheva DI Case-control as-sociation study of 59 candidate genes re-veals the DRD2 SNP rs6277 (C957T) as theonly susceptibility factor for schizophreniain Bulgarian population J Hum Genet 5498-107 2009

61 Fukukawa C Nagayama S Tsunoda TToguchida J Nakamura Y and Katagiri TActivation of non-canonical Dvl-Rac1-JNKpathway by Frizzled-homologue 10 (FZD10)in human synovial sarcoma Oncogene inpress 2009

62 Yosifova A Mushiroda T Stoianov DVazharova R Dimova I Karachanak SZaharieva I Milanova V Madjirova NGerdjikov I Tolev T Velkova S KirovG Owen MJ OrsquoDonovan MC TonchevaD and Nakamura Y Case-control associa-tion study of 65 candidate genes revealed apossible association of a SNP of HTR5A tobe a factor susceptible to bipolar disease inBulgarian population J Affective Disordersin press 2009

63 Kamatani Y Wattanapokayakit S OchiH Kawaguchi T Takahashi A HosonoN Kubo M Tsunoda T Kamatani NKumada H Puseenam A Sura T DaigoY Chayama K Chantratita W Naka-mura Y and Matsuda K Identification ofassociation of genetic variations in HLA-DPlocus with chronic hepatitis B in Asianpopulation through genome-wide associa-tion study Nat Genet in press 2009

64 Tamura K Furihata M Chung S Ue-mura M Yoshioka H Iiyama T AshidaS Nasu Y Fujioka T Shuin T Naka-mura Y and Nakagawa H Stanniocalcin 2( STC 2 ) over-expression in castration-resistant prostate cancer and aggressiveprostate cancer Cancer Sci in press 2009

65. Tsukada H, Ochi H, Maekawa T, Abe H, Fujimoto Y, Tsuge M, Takahashi H, Kumada H, Kamatani N, Nakamura Y and Chayama K; Hiroshima Liver Study Group; Toranomon Hospital. A polymorphism in MAPKAPK3 affects response to interferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy EM, Roche D, Tagami H, Lacoste N, Ray-Gallet D, Nakamura Y, Daigo Y, Nakatani Y and Almouzni-Pettinotti G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.


The genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining in their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample sizes. These problems, which complicate the interpretation of data from GWA studies, are mutually related, and there is no straightforward solution to all of them together. We first decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models and study design, characterize each problem and propose solutions to the individual problems, and we also attempt to improve the interpretation of data from GWA studies as a whole.

a. Test statistics correction for data of structured population

Because genetic epidemiology studies on complex genetic traits target relatively weak factors, their sample sizes should be in the thousands or more, which in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One of the methods to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
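For orientation, the standard genomic control correction for chi-square statistics, which is the starting point mentioned above, can be sketched as follows. This is a minimal illustration of the conventional method only, assuming 1-degree-of-freedom chi-square statistics; it does not reproduce the modification for Fisher's exact test developed in this work, and the function names and input values are hypothetical.

```python
import statistics
from typing import List, Sequence

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats: Sequence[float]) -> float:
    """Estimate the genomic control inflation factor (lambda) from genome-wide statistics."""
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    # By convention the statistics are never inflated by the correction itself.
    return max(lam, 1.0)


def genomic_control(chi2_stats: Sequence[float]) -> List[float]:
    """Divide every chi-square statistic by the estimated inflation factor."""
    lam = inflation_factor(chi2_stats)
    return [stat / lam for stat in chi2_stats]


if __name__ == "__main__":
    # Hypothetical statistics from five markers; the median is twice the null median.
    observed = [0.1, 0.5, 0.9098728, 2.0, 5.0]
    print(inflation_factor(observed))
    print(genomic_control(observed))
```

The median is used because it is robust to the small number of truly associated markers, so the estimate mainly reflects the inflation shared by all markers due to population structure.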

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics (ゲノム機能解析分野)

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

c. Geometric evaluation of SNP contingency table tests

We describe 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.
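As a concrete reference point for the 2×3 table tests discussed above, the asymptotic (Pearson chi-square) test for a case-control genotype table can be sketched as follows. This is a minimal illustration of the conventional asymptotic test only, not of the exact test or of the geometric formulation; the genotype counts are hypothetical. With 2 degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math
from typing import Sequence, Tuple


def chi2_2x3(table: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Pearson chi-square test for a 2x3 table: rows = case/control, columns = genotypes."""
    if len(table) != 2 or any(len(row) != 3 for row in table):
        raise ValueError("expected a 2x3 table")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("every row and column must contain at least one observation")
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, Aa, aa) for cases and controls.
    cases = [30, 20, 10]
    controls = [10, 20, 30]
    stat, p = chi2_2x3([cases, controls])
    print(f"chi-square = {stat:.2f}, p = {p:.2e}")
```

The asymptotic p-value relies on large expected counts; when some cells are small, it can deviate from the exact test, which is the discrepancy examined in item b.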

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements of reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and LD mapping, which uses the genetic history of the population, is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is used for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, to support various studies based on genetic heterogeneity.
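The conventional pairwise measures of inter-marker dependence referred to above can be illustrated with the following minimal Python sketch, which computes D, D' and r² for two diallelic markers from the four haplotype frequencies. It shows only the standard pairwise definitions, not the multi-marker method under development; the function name and the example frequencies are hypothetical.

```python
from typing import Tuple


def pairwise_ld(p_ab: float, p_aB: float, p_Ab: float, p_AB: float) -> Tuple[float, float, float]:
    """Return (D, D', r^2) from frequencies of haplotypes ab, aB, Ab and AB."""
    total = p_ab + p_aB + p_Ab + p_AB
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = p_AB + p_Ab  # frequency of allele A at the first marker
    p_B = p_AB + p_aB  # frequency of allele B at the second marker
    d = p_AB - p_A * p_B  # deviation from independence
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical haplotype frequencies: ab = 0.4, aB = 0.1, Ab = 0.1, AB = 0.4.
    d, d_prime, r2 = pairwise_ld(0.4, 0.1, 0.1, 0.4)
    print(round(d, 3), round(d_prime, 3), round(r2, 3))
```

Both D' and r² summarize a single pair of markers, which is why they do not capture dependence among larger sets of markers.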

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorial theory, and have investigated the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both in the world and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp
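The permutation test named above can be illustrated with a small case-control example. This is a generic sketch under stated assumptions, not the distributed tool: alleles are coded 0/1, the test statistic is the absolute difference in allele frequency between cases and controls, and the names `allele_freq_diff` and `permutation_p_value` are hypothetical.

```python
import random


def allele_freq_diff(case_alleles, control_alleles):
    """Absolute difference in the frequency of allele 1 between two groups."""
    case_freq = sum(case_alleles) / len(case_alleles)
    control_freq = sum(control_alleles) / len(control_alleles)
    return abs(case_freq - control_freq)


def permutation_p_value(case_alleles, control_alleles, n_perm=1000, seed=0):
    """Estimate a p-value by shuffling the case/control labels n_perm times."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = allele_freq_diff(case_alleles, control_alleles)
    pooled = list(case_alleles) + list(control_alleles)
    n_case = len(case_alleles)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = allele_freq_diff(pooled[:n_case], pooled[n_case:])
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            hits += 1
    # The +1 correction keeps the estimate from ever being exactly zero.
    return (hits + 1) / (n_perm + 1)
```

Because the null distribution is built from the data themselves, the same scheme applies to other statistics whose sampling distribution is hard to derive analytically.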

Publications

Gotoh, N., Yamada, R., Matsuda, F., Yoshimura, N. and Iida, T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am. J. Ophthalmol. 146: 146, 2008.

Nakayama-Hamada, M., Suzuki, A., Furukawa, H., Yamada, R. and Yamamoto, K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J. Biochem. 144: 393-8, 2008.

Okada, Y., Mori, M., Yamada, R., Suzuki, A., Kobayashi, K., Kubo, M., Nakamura, Y. and Yamamoto, K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J. Rheumatol. 35: 1273-8, 2008.

Shimane, K., Kochi, Y., Yamada, R., Okada, Y., Suzuki, A., Miyatake, A., Kubo, M., Nakamura, Y. and Yamamoto, K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann. Rheum. Dis. (in press).

Suzuki, A., Yamada, R., Kochi, Y., Sawada, T., Okada, Y., Matsuda, K., Kamatani, Y., Mori, M., Shimane, K., Hirabayashi, Y., Takahashi, A., Tsunoda, T., Miyatake, A., Kubo, M., Kamatani, N., Nakamura, Y. and Yamamoto, K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat. Genet. 40: 1224-9, 2008.

Yamada, R. Primer: SNP-associated studies and what they can teach us. Nat. Clin. Pract. Rheumatol. 4: 210-7, 2008.

Yamada, R. and Okada, Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet. Epidemiol. 33: 114-27, 2009.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai (中井 謙太), Ph.D.
Associate Professor Kengo Kinoshita (木下 賢吾), Ph.D.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role, represented by the networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole genome alignments between Ciona intestinalis and Ciona savignyi. The description of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, is now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and positioning, both with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe² and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental result with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
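G+C- and CpG-richness of a sequence window are commonly quantified by the G+C fraction and the observed/expected CpG ratio. The sketch below uses that widely used ratio (number of CpG × length / (number of C × number of G)); whether the study above used exactly this definition is an assumption, and the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: n_CpG * length / (n_C * n_G)."""
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0  # no CpG dinucleotide is possible
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    n_cpg = seq.count("CG")
    return n_cpg * len(seq) / (n_c * n_g)
```

Applying both functions to a sliding window around each TSS would give a profile from which the width of the G+C- and CpG-rich region can be read off.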

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent the blueprint from indicating whether a regulation is direct or indirect. In this study, we are computationally detecting the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we are predicting the binding sites by examining the orthologues. The predicted regulation network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establishes the body plan but also the peripheral components that actually make the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated for each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
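The way a pseudocount enters a PWM, as described above, can be sketched as follows. This is a minimal illustration assuming a uniform background, base-2 logarithms and aligned sites of equal length; the function names `log_odds_pwm` and `score` are hypothetical, and only the default value of 0.8 is taken from the text.

```python
import math

BASES = "ACGT"


def log_odds_pwm(sites, pseudocount=0.8, background=None):
    """Build a log-odds PWM from aligned binding sites of equal length.

    At each position the pseudocount is split among the four bases in
    proportion to the background composition and added to the counts.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}  # assumption: uniform background
    if not sites:
        raise ValueError("no binding sites given")
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    n = len(sites)
    pwm = []
    for pos in range(width):
        column = [s[pos].upper() for s in sites]
        row = {}
        for b in BASES:
            freq = (column.count(b) + pseudocount * background[b]) / (n + pseudocount)
            row[b] = math.log2(freq / background[b])
        pwm.append(row)
    return pwm


def score(pwm, seq):
    """Sum of the log-odds elements selected by the bases of seq."""
    return sum(row[b] for row, b in zip(pwm, seq.upper()))
```

Without the pseudocount, any base never observed at a position would receive a score of minus infinity; with it, every element stays finite even for very small samples.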

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹ and Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data. From these data, we obtained 75918 promoters. These promoters could be classified into 36251 in gene regions and 39667 in intergenic regions. The former, intragenic promoters corresponded to 14307 genes, of which 5428 have one promoter and 8879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of the discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed 96 of difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
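The definition of the 1st and 2nd promoters by tag count can be sketched as below. The input layout (gene, promoter, tag count) and both function names are hypothetical, and the discrepancy measure shown, the fraction of shared genes whose (1st, 2nd) assignment differs between two samples, is only one plausible reading of the comparison described above.

```python
from collections import defaultdict


def rank_promoters(tag_counts):
    """Map each gene to (1st promoter, 2nd promoter) by decreasing tag count.

    tag_counts is an iterable of (gene, promoter_id, n_tags) tuples.
    The 2nd promoter is None for genes with a single promoter.
    """
    by_gene = defaultdict(list)
    for gene, promoter, n_tags in tag_counts:
        by_gene[gene].append((n_tags, promoter))
    ranked = {}
    for gene, items in by_gene.items():
        # Sort by tag count (descending); ties are broken by promoter id.
        items.sort(key=lambda item: (-item[0], item[1]))
        first = items[0][1]
        second = items[1][1] if len(items) > 1 else None
        ranked[gene] = (first, second)
    return ranked


def discrepancy_rate(ranked_a, ranked_b):
    """Fraction of genes present in both samples whose ranking differs."""
    shared = set(ranked_a) & set(ranked_b)
    if not shared:
        raise ValueError("no genes in common")
    differing = sum(1 for gene in shared if ranked_a[gene] != ranked_b[gene])
    return differing / len(shared)
```

Running `rank_promoters` once per cell type or condition and comparing the results pairwise with `discrepancy_rate` yields one value per pair of samples.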

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but that two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
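The idea of detecting a dinucleotide periodicity by Fourier analysis can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: the sequence is turned into a 0/1 indicator of where a chosen dinucleotide starts, and the power of that signal is evaluated at a single period. The function name `periodicity_power` is hypothetical.

```python
import cmath


def periodicity_power(seq, dinucleotide, period):
    """Power of the dinucleotide occurrence signal at the given period (in bp)."""
    seq = seq.upper()
    dinucleotide = dinucleotide.upper()
    # Indicator signal: 1 where the dinucleotide starts, 0 elsewhere.
    signal = [1.0 if seq[i:i + 2] == dinucleotide else 0.0
              for i in range(len(seq) - 1)]
    n = len(signal)
    if n == 0:
        raise ValueError("sequence too short")
    mean = sum(signal) / n
    # Single Fourier coefficient of the mean-centred signal at frequency 1/period.
    coeff = sum((x - mean) * cmath.exp(-2j * cmath.pi * i / period)
                for i, x in enumerate(signal))
    return abs(coeff) ** 2 / n
```

Evaluating the power over a range of periods, with and without masking repeats such as Alu elements, shows which peaks depend on the masked sequences.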

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has mostly been estimated by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells are. Therefore, the organism-specific pathway maps are detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 914.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita and Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme with ribonuclease activity that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19 777 and 21 036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.
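A list of highly coexpressed genes for a query gene, network type (i) above, can be sketched with the Pearson correlation of expression profiles. The text does not state which coexpression measure COXPRESdb uses, so Pearson correlation is an assumption here, chosen because it is a common one; the function names and the toy profile layout are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant profile has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def top_coexpressed(profiles, query, k=3):
    """Return the k genes most correlated with the query as (gene, r) pairs.

    profiles maps a gene name to its expression values across samples.
    """
    scored = [(pearson(profiles[query], values), gene)
              for gene, values in profiles.items() if gene != query]
    # Highest correlation first; ties are broken by gene name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(gene, r) for r, gene in scored[:k]]
```

Connecting each gene to its top-ranked partners in this way gives a coexpression network of the kind listed in point (i).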

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. Conversely, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K. and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R. and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K. and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T. and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K. and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y. and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T. and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M. and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H. and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H. and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M. and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R. and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T. and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H. and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H. and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T. and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S. and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H. and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

Ikura, T., Kinoshita, K. and Ito, N. Structure-function relationship of peptidyl-prolyl isomerase (in Japanese). Tanpakushitsu Kakusan Koso (蛋白質核酸酵素) 54: 167-172, 2009.

Kinoshita, K. Protein function prediction from three-dimensional structures: current status and prospects (in Japanese). Idenshi Igaku MOOK (遺伝子医学MOOK) No. 14, in press.

Nakai, K. and Horton, P. Chapter 3-3: Prediction of the subcellular localization of proteins based on amino acid sequences (in Japanese). Jikken Igaku (実験医学), extra issue, vol. 26: 1106-1112, 2008.

Nakai, K. Systems biology of proteins (in Japanese). In: Encyclopedia of Proteins (タンパク質の事典), edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura and Hamakubo (Asakura Shoten), 575-578, 2008.

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto (武藤 香織), Ph.D.
Project Assistant Professor Hyongoo Hong (洪 賢秀), Ph.D.
Project Assistant Professor Ayako Kamisato (神里 彩子), Master of Laws

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted comparative political studies on stem cell research and on homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We have interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After the fabrication scandal over Hwang Woo-suk's stem cell cloning and the unethical collection of human eggs, the bioethics law was revised and the government now seeks stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing, in terms of regulations, across East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, the IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK and South Korea, we identified current problems that tend to hold up a smooth research review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review of the research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for the “BioBank Japan” project

To support the “BioBank Japan” project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for “personalized medicine”, we should think further about constructing a “personalized consent process”, and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective in maintaining incentives for participation and in preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly value the MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotype data to each participant, but a certain number of participants responded that they now want to see their own genotype data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even when participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To show our appreciation for participants' and MCs' contributions to the project, we issued “BioBank newsletters” three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to sighted people with good eyesight, we are making efforts to secure personalized information provision that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim at sharing public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called “SciArt Café”, twice in 2008. Our original intent for “SciArt Café” is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called “Rhythm generated by network”, was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called “Doing science, doing art”, was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.




Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. Inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample sizes. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design. We first characterize each problem and propose solutions to the individual problems, and we also attempt to improve the interpretation of GWA study data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, which means that their sample sizes should be in the thousands or more. This in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One method to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
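As a point of reference, the conventional genomic control correction for chi-square statistics with 1 degree of freedom can be sketched as follows. This is only the standard form of the method, not the modified version for Fisher's exact test described above; the function names and the example statistics are hypothetical.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Estimate the inflation factor lambda from observed 1-df chi-square statistics."""
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    # By convention lambda is not allowed to fall below 1,
    # so statistics are never inflated by the correction.
    return max(lam, 1.0)


def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics), dividing each statistic by lambda."""
    lam = inflation_factor(chi2_stats)
    return lam, [s / lam for s in chi2_stats]


# Hypothetical statistics from five markers (illustration only).
stats = [0.2, 0.5, 0.91, 1.3, 4.0]
lam, corrected = genomic_control(stats)
print(lam, corrected)
```

Dividing by a single genome-wide factor assumes that population structure inflates all markers to a similar degree, which is the usual working assumption of the method.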

Human Genome Center

Laboratory of Functional Genomics

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

b. Characterization of the exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.
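To make the comparison concrete, the sketch below computes both p-values for one 2×3 genotype table. It assumes a Pearson chi-square test as the asymptotic test and a probability-ordered exact test over all tables with the same margins as the exact test; these choices, the function names, and the example counts are illustrative assumptions, not the laboratory's implementation. All column totals are assumed to be non-zero.

```python
from math import comb, exp


def chi2_pvalue_2x3(table):
    """Asymptotic Pearson chi-square test for a 2x3 table (2 degrees of freedom)."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    chi2 = 0.0
    for i in range(2):
        for j in range(3):
            expected = rows[i] * cols[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    return chi2, exp(-chi2 / 2)


def exact_pvalue_2x3(table):
    """Exact test: total probability of all tables with the observed margins
    that are no more probable than the observed table."""
    r0 = sum(table[0])
    c = [sum(col) for col in zip(*table)]
    n = sum(c)

    def prob(a, b):
        # Multivariate hypergeometric probability of first row (a, b, r0 - a - b).
        return comb(c[0], a) * comb(c[1], b) * comb(c[2], r0 - a - b) / comb(n, r0)

    p_obs = prob(table[0][0], table[0][1])
    total = 0.0
    for a in range(min(c[0], r0) + 1):
        for b in range(min(c[1], r0 - a) + 1):
            if r0 - a - b > c[2]:
                continue  # third cell would exceed its column total
            p = prob(a, b)
            if p <= p_obs * (1 + 1e-9):  # small tolerance for float ties
                total += p
    return total


# Hypothetical case/control genotype counts (rows: cases, controls).
observed = [[5, 3, 2], [1, 4, 5]]
chi2, p_asymptotic = chi2_pvalue_2x3(observed)
p_exact = exact_pvalue_2x3(observed)
print(chi2, p_asymptotic, p_exact)
```

With small cell counts such as these, the two p-values can differ noticeably, which is the kind of discrepancy the paragraph above refers to.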

c. Geometric evaluation of SNP contingency table tests

We describe 2×3 SNP contingency table tests in the context of geometry, characterize various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.

2. The development of new methods to interpret genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population together with LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is used to quantify genetic heterogeneity and in genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in support of various studies based on genetic heterogeneity.
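For reference, the conventional pairwise LD measures mentioned above (D, D′ and r²) can be computed from two-SNP haplotype counts as in the following sketch. It does not implement the multi-marker method under development; the function name and the example counts are illustrative assumptions.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two diallelic markers from haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at marker 1
    p_B = (n_AB + n_aB) / n  # frequency of allele B at marker 2

    d = p_AB - p_A * p_B

    # D' rescales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical haplotype counts: complete LD and no LD.
print(pairwise_ld(50, 0, 0, 50))
print(pairwise_ld(25, 25, 25, 25))
```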

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorial theory and investigate the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMS-UT on the interpretation of genetic epidemiology data with conventional statistical methods. These groups include Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web site URL: http://func-gen.hgc.jp


Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

The mission of our laboratory is to conduct computational (“in silico”) studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly across tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in “gonad” to 1953 in “endostyle”, and during development they increased from 168 at the “egg” stage to 755 at the “juvenile” stage. We hypothesize that this enrichment of trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in “gonad”. To investigate this phenomenon further, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing all whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and on their positioning with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are developing our model further, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe² and Kenta Nakai

While CpG islands are often linked to promoters in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation patterns between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
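The two quantities behind the terms “G+C-rich” and “CpG-rich” can be illustrated with a short sketch. It uses the commonly used observed/expected CpG ratio; whether the study used exactly this formula is an assumption, and the function names and example sequences are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count gives the exact CpG count.
    return seq.count("CG") * len(seq) / (c * g)


# Hypothetical windows around a TSS (illustration only).
print(gc_content("CGCGCGCG"), cpg_obs_exp("CGCGCGCG"))
print(gc_content("ATGCATAT"), cpg_obs_exp("ATGCATAT"))
```

Computing both values in a sliding window around each TSS would show how wide the G+C- and CpG-rich range is for a given promoter.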

5. Computational verification of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from giving further information on whether regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data from a newly performed ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that lays out the body plan but also the peripheral components that actually make the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as the log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it, determined by the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database. The generated matrix with an added pseudocount value was then compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
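How a pseudocount enters the matrix can be shown with a minimal sketch. It assumes base-2 log likelihood ratios and a uniform background unless one is supplied; the function name and the example counts are hypothetical, and only the default of 0.8 comes from the text above.

```python
from math import log2

BASES = "ACGT"


def pwm_from_counts(counts, pseudocount=0.8, background=None):
    """Build a log-likelihood-ratio PWM from per-position base counts.

    The pseudocount is split among the bases according to the background
    composition; it must be positive so that no frequency becomes zero.
    """
    background = background or {b: 0.25 for b in BASES}
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES)
        scores = {}
        for b in BASES:
            freq = (column.get(b, 0) + pseudocount * background[b]) / (total + pseudocount)
            scores[b] = log2(freq / background[b])
        pwm.append(scores)
    return pwm


# Hypothetical counts from four aligned binding sites at two positions.
example_counts = [{"A": 4}, {"A": 1, "C": 1, "G": 1, "T": 1}]
pwm = pwm_from_counts(example_counts)
print(pwm)
```

In the first column the unobserved bases receive a finite negative score instead of minus infinity, which is the practical purpose of the pseudocount.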

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹ and Kenta Nakai

To support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a large number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one. For each gene, we defined the promoter with the largest number of tags as the ‘1st promoter’ and the promoter with the second largest number as the ‘2nd promoter’. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed a difference of 96 for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
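The definition of the 1st and 2nd promoters can be expressed as a small ranking step. In this sketch the input layout, the gene and promoter identifiers, and the tie-breaking rule (alphabetical order of promoter identifiers) are assumptions made for illustration.

```python
def rank_promoters(tag_counts):
    """Map each gene to (1st promoter, 2nd promoter) by descending tag count.

    tag_counts: {gene: {promoter_id: number_of_tags}}.
    The 2nd promoter is None for genes with a single promoter.
    """
    ranked = {}
    for gene, promoters in tag_counts.items():
        # Sort by tag count (largest first); ties are broken by promoter id.
        order = sorted(promoters, key=lambda p: (-promoters[p], p))
        first = order[0] if order else None
        second = order[1] if len(order) > 1 else None
        ranked[gene] = (first, second)
    return ranked


# Hypothetical tag counts for two genes (illustration only).
example = {
    "geneA": {"P1": 120, "P2": 340, "P3": 15},
    "geneB": {"P1": 57},
}
ranking = rank_promoters(example)
print(ranking)
```

Running the same ranking on tag counts from two cell types and comparing the results gene by gene is one way to look for the kind of discrepancy discussed above.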

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for analyses of transcription and replication. It has previously been reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common among 16 eukaryotes, but that two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the overall chromatin composition of primate genomes.
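The idea of detecting a dinucleotide periodicity by Fourier analysis can be sketched with a single-frequency discrete Fourier transform. The indicator-signal construction, the normalization, and the synthetic test sequence are assumptions for illustration and are not taken from the study.

```python
import cmath
import math


def periodicity_power(sequence, dinucleotide, period):
    """Spectral power of a dinucleotide's occurrence signal at the given period (in bp)."""
    sequence = sequence.upper()
    # Indicator signal: 1 where the dinucleotide starts, 0 elsewhere.
    signal = [1.0 if sequence[i:i + 2] == dinucleotide else 0.0
              for i in range(len(sequence) - 1)]
    mean = sum(signal) / len(signal)
    # Single-frequency DFT of the mean-centred signal.
    coeff = sum((x - mean) * cmath.exp(-2j * math.pi * i / period)
                for i, x in enumerate(signal))
    return abs(coeff) ** 2 / len(signal)


# Synthetic sequence with AA placed every 10 bp (illustration only).
synthetic = ("AA" + "C" * 8) * 20
power_10 = periodicity_power(synthetic, "AA", 10)
power_7 = periodicity_power(synthetic, "AA", 7)
print(power_10, power_7)
```

Evaluating the same function over a range of periods, before and after masking repeats, would give a power spectrum of the kind that reveals peaks such as those at 84 bp and 167 bp.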

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has mostly been estimated by reducing the genome of a living cell. But the “minimal gene set” obtained by this approach may be inaccurate because of the effects of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of conserved pathway maps and organism-specific pathway maps. The conserved pathway maps are those that, among all pathway maps, contain more orthologous genes, and they are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from external nutrients alone, as living cells do. The organism-specific pathway maps are therefore detected as those that can synthesize, from nutrients, the compounds required for the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions. Therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, using the Delaunay tessellation technique. The first two indices enable us to evaluate how well protein interfaces are built up, and the third descriptor quantifies the complexity of protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices elucidated different aspects of protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data now makes it possible to observe many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time by mapping the binding sites of all complexes in the PDB that contain the same protein. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.
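The idea of mapping binding sites from all complexes onto one protein can be sketched as follows. This is only an illustration under simplifying assumptions: each complex is reduced to a hypothetical (protein, partner, interface residues) record, and none of the names below come from PiSite itself.

```python
from collections import defaultdict


def merge_interaction_sites(complexes):
    """Collect, per protein and residue, the set of partners seen at that residue.

    complexes is a list of (protein_id, partner_id, residue_numbers) records,
    one per observed complex.
    """
    sites = defaultdict(lambda: defaultdict(set))
    for protein, partner, residues in complexes:
        for residue in residues:
            sites[protein][residue].add(partner)
    return sites


def residues_with_multiple_partners(protein_sites):
    """Residues that contact more than one distinct partner across complexes."""
    return sorted(r for r, partners in protein_sites.items() if len(partners) > 1)


if __name__ == "__main__":
    # Hypothetical records: protein P1 observed with partners A and B.
    records = [
        ("P1", "A", {10, 11, 12}),
        ("P1", "B", {12, 13}),
    ]
    merged = merge_interaction_sites(records)
    print(residues_with_multiple_partners(merged["P1"]))
```

Residues shared by several partners stand out only when the complexes are merged, which is why a single complex is not enough for hub-like proteins.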

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, which can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method to discriminate between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately where protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.
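The size dependence of SVR can be seen with a simple geometric idealization. The sketch below assumes a perfect sphere, for which SVR = 3/r; real proteins are not spheres and their accessible surface area is computed with a probe, so the numbers are only indicative. The function names are hypothetical.

```python
import math


def surface_to_volume_ratio(surface_area, volume):
    """SVR in 1/Å when the area is given in Å^2 and the volume in Å^3."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return surface_area / volume


def sphere_svr(radius):
    """SVR of an ideal sphere of the given radius (Å); equals 3 / radius."""
    area = 4.0 * math.pi * radius ** 2
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return surface_to_volume_ratio(area, volume)


if __name__ == "__main__":
    for radius in (5.0, 10.0, 20.0):
        print(f"radius {radius:5.1f} A -> SVR {sphere_svr(radius):.3f} 1/A")
```

Under this idealization SVR falls as the radius grows, which is the geometric reason larger proteins expose relatively less surface.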

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important both for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines applied to the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not previously been considered. In addition, we created a similar hole in barnase, a ribonuclease that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.
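Network type (i), the highly coexpressed genes for a given gene, can be illustrated with a small ranking example. The sketch assumes Pearson correlation over expression profiles as the coexpression measure; the text above does not state which measure the database uses, and the gene names and values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def top_coexpressed(query, profiles, k=3):
    """Return the k genes most correlated with the query, best first."""
    scores = [
        (pearson(profiles[query], profile), gene)
        for gene, profile in profiles.items()
        if gene != query
    ]
    scores.sort(reverse=True)
    return [(gene, r) for r, gene in scores[:k]]


if __name__ == "__main__":
    # Hypothetical expression profiles across four samples.
    toy_profiles = {
        "geneA": [1.0, 2.0, 3.0, 4.0],
        "geneB": [2.0, 4.0, 6.0, 8.0],
        "geneC": [4.0, 3.0, 2.0, 1.0],
        "geneD": [1.0, 3.0, 2.0, 4.0],
    }
    for gene, r in top_coexpressed("geneA", toy_profiles, k=2):
        print(gene, round(r, 3))
```

A coexpression network follows from the same calculation by drawing an edge from each gene to its top-ranked partners.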

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. Through these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉, 木下賢吾, 伊藤暢聡. ペプチジルプロリルイソメラーゼの構造機能相関. 蛋白質核酸酵素 54: 167―172, 2009.

木下賢吾. 立体構造からのタンパク質機能予測 現状と展望. 遺伝子医学MOOK 14号, in press.

中井謙太, ポールホートン. 第3章 3 アミノ酸配列に基づくタンパク質の細胞内局在予測. 実験医学増刊 vol. 26: 1106―1112, 2008.

中井謙太. タンパク質のシステム生物学. 猪飼, 伏見, 卜部, 上野川, 中村, 浜窪 編, タンパク質の事典, 朝倉書店, 575―578, 2008.


The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal, and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas concerning the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and the unethical collection of human eggs, the bioethics law was revised, and the government now seeks stricter regulation of life science and healthcare. We have found some correlations between political options on stem cell research and on genetic testing in terms of regulations across East Asia.

2. Establishment of Office of Research Ethics (ORE)

Following the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK, and South Korea, we identified the current problems that tend to stall a smooth research review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review of the research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

Human Genome Center

Department of Public Policy 公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

3. Ethical, legal, and social support for the "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government's ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives to participate and for preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate the MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten why they donated their DNA and serum, as well as the experience of watching the DVD or reading the leaflet about the project overview. We have found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To show our appreciation of the participants' and MCs' contribution to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to sighted people with good eyesight, we are working on personalized information accessibility to accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim to share public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing the 3rd session for early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


terferon therapy for chronic hepatitis C. Gastroenterology, in press, 2009.

66. Dunleavy, E.M., Roche, D., Tagami, H., Lacoste, N., Ray-Gallet, D., Nakamura, Y., Daigo, Y., Nakatani, Y., and Almouzni-Pettinotti, G. HJURP, a key CENP-A-partner for maintenance and deposition of CENP-A at centromeres at late telophase/G1. Cell, in press, 2009.


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining of their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample sizes. These problems, which complicate the interpretation of GWA data, are mutually related, and there is no straightforward solution to all of them together. We decompose the difficulty into parts (i.e., linkage disequilibrium (LD), population structure, multiple genetic models, and study design), first characterize each problem and propose solutions to the individual problems, and also attempt to improve the interpretation of GWA data as a whole.

a. Test statistics correction for data of structured population

Genetic epidemiology studies of complex genetic traits target relatively weak factors, so their sample sizes must be in the thousands or more, which makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous (in other words, structured) population tend to give false positive results. One method for correcting the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
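For reference, the standard genomic control correction for 1-df chi-square statistics can be sketched as below. This shows only the conventional method, not the modification for Fisher's exact test described above; the function names and the toy statistics are assumptions for illustration.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic control lambda: observed median over the expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Divide each statistic by lambda; lambda below 1 is treated as 1."""
    lam = max(1.0, inflation_factor(chi2_stats))
    return [stat / lam for stat in chi2_stats]


if __name__ == "__main__":
    # Made-up 1-df chi-square statistics from null markers.
    observed = [0.2, 0.9098728, 3.0]
    print("lambda:", inflation_factor(observed))
    print("corrected:", genomic_control(observed))
```

The median is used rather than the mean so that a few truly associated markers have little influence on the estimated inflation.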

b. Characterization of exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリーマークラスロップ
准教授 医学博士 山田亮

c. Geometric evaluation of SNP contingency table tests

The 2×3 SNP contingency table tests are described in the context of geometry. We characterize various tests for 2×3 tables and define tests fit for biological models by interpreting tables in the context of geometry.
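Two of the tests for 2×3 case-control genotype tables discussed in this section, the asymptotic chi-square test and an exact test, can be compared with a short sketch. It assumes the Pearson chi-square statistic with 2 degrees of freedom (whose tail probability is exp(-x/2)) and an exact test that enumerates all tables with the observed margins and orders them by the same statistic. Other orderings are possible, and this is not necessarily the formulation used by the authors.

```python
from math import comb, exp


def chi2_stat(table):
    """Pearson chi-square statistic of a 2x3 table [[a, b, c], [d, e, f]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def asymptotic_p(table):
    """Tail probability of the chi-square distribution with 2 df."""
    return exp(-chi2_stat(table) / 2.0)


def exact_p(table):
    """Sum the hypergeometric probabilities of all tables at least as extreme."""
    cases = sum(table[0])
    cols = [top + bottom for top, bottom in zip(*table)]
    total = comb(sum(cols), cases)
    observed = chi2_stat(table)
    p_value = 0.0
    for a in range(min(cols[0], cases) + 1):
        for b in range(min(cols[1], cases - a) + 1):
            c = cases - a - b
            if c > cols[2]:
                continue
            candidate = [[a, b, c], [cols[0] - a, cols[1] - b, cols[2] - c]]
            prob = comb(cols[0], a) * comb(cols[1], b) * comb(cols[2], c) / total
            if chi2_stat(candidate) >= observed - 1e-9:
                p_value += prob
    return p_value


if __name__ == "__main__":
    # Made-up genotype counts: rows are cases and controls.
    genotype_table = [[5, 1, 0], [0, 1, 5]]
    print("asymptotic p:", asymptotic_p(genotype_table))
    print("exact p:", exact_p(genotype_table))
```

With small counts such as these the two p-values differ noticeably, which is the kind of discrepancy item b sets out to characterize.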

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited due to the requirements for reproduction and as a result of selection and genetic drift, against which opposite forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population and LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, so that it can support various studies based on genetic heterogeneity.
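The conventional pairwise measures referred to above can be computed directly from phased haplotypes. The sketch below calculates D, |D'|, and r² for two diallelic markers; it illustrates only the standard pairwise case, not the multi-marker method under development, and the 0/1 haplotype encoding is an assumption.

```python
def pairwise_ld(haplotypes):
    """Return (D, |D'|, r^2) for two diallelic markers.

    haplotypes is a list of (allele_at_marker_1, allele_at_marker_2) pairs,
    with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n
    p_b = sum(h[1] for h in haplotypes) / n
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


if __name__ == "__main__":
    complete_ld = [(1, 1)] * 5 + [(0, 0)] * 5
    independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
    print(pairwise_ld(complete_ld))
    print(pairwise_ld(independent))
```

Because these quantities are defined for pairs only, dependence among three or more markers is not captured by them, which motivates a multi-marker measure.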

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combination theory, and we have investigated the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; the Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies have been changing, the analysis tools have to be updated continually. The number of genetic epidemiology study groups far exceeds the number of groups working on genetic statistics, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web site URL: http://func-gen.hgc.jp

Publications

Gotoh, N., Yamada, R., Matsuda, F., Yoshimura, N., and Iida, T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am. J. Ophthalmol. 146: 146, 2008.

Nakayama-Hamada, M., Suzuki, A., Furukawa, H., Yamada, R., and Yamamoto, K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J. Biochem. 144: 393-8, 2008.


Okada, Y., Mori, M., Yamada, R., Suzuki, A., Kobayashi, K., Kubo, M., Nakamura, Y., and Yamamoto, K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J. Rheumatol. 35: 1273-8, 2008.

Shimane, K., Kochi, Y., Yamada, R., Okada, Y., Suzuki, A., Miyatake, A., Kubo, M., Nakamura, Y., and Yamamoto, K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann. Rheum. Dis. (in press).

Suzuki, A., Yamada, R., Kochi, Y., Sawada, T., Okada, Y., Matsuda, K., Kamatani, Y., Mori, M., Shimane, K., Hirabayashi, Y., Takahashi, A., Tsunoda, T., Miyatake, A., Kubo, M., Kamatani, N., Nakamura, Y., and Yamamoto, K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat. Genet. 40: 1224-9, 2008.

Yamada, R. Primer: SNP-associated studies and what they can teach us. Nat. Clin. Pract. Rheumatol. 4: 210-7, 2008.

Yamada, R. and Okada, Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet. Epidemiol. 33: 114-27, 2009.


The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 2.53 in "gonad" to 19.53 in "endostyle", and during development they increased from 1.68 at the "egg" stage to 7.55 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2 Improvement of the database of tunicategene regulation

Nicolas Sierro Takehiro Kusakabe2 YutakaSuzuki1 Riu Yamashita and Kenta Nakai 2

University of Hyogo

The database of tunicate gene regulationDBTGR was first released in 2006 as a small da-tabase summarizing published informationabout tunicate promoters and cis-regulatory re-gions In 2008 it was extended to include geneexpression reporter constructs as well as a newgenome browser providing all whole genomealignments between Ciona intestinalis and Cionasavignyi The description of 81 gene expressionreporter vectors as well as sample images of theexpression observed with them in Ciona is nowavailable and the database provides users withcontact information to the owners of these con-structs With the new flexible genome browserbuilt in DBTGR users have now access to twodifferent genome alignments between C intesti-nalis and C savignyi obtained with different al-gorithms In addition predicted binding sites forthe JASPAR core matrices as well as regulatory

Human Genome Center

Laboratory of Functional Analysis In Silico機能解析インシリコ分野

Professor Kenta Nakai PhDAssociate Professor Kengo Kinoshita PhD

教 授 理学博士 中 井 謙 太准教授 理学博士 木 下 賢 吾

145

elements and binding sites reported in literatureare also directly available DBTGR is accessibleat httpdbtgrhgcjp

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and positioning, both with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe² and Kenta Nakai

While CpG islands are often linked to promoters in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters: they tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from telling us whether a given regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set; we found the optimum threshold by maximizing an over-representation index (ORI). For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining those orthologues. The regulatory network predicted for the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make up the building blocks of the body.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it, proportional to the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and the generated matrix, with an added pseudocount value, was then compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
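The pseudocount scheme described above can be illustrated with a minimal Python sketch. It assumes the common formulation in which each PWM element is the log2 ratio of the pseudocount-adjusted base frequency to the background frequency; the function name `pwm_with_pseudocount`, the uniform background, the example counts, and the reading of the suggested value as 0.8 are illustrative assumptions, not the authors' code.

```python
import math

def pwm_with_pseudocount(counts, background, pseudocount=0.8):
    """Build a log-odds PWM from per-position base counts.

    counts      -- list of dicts (one per position) mapping base -> observed count
    background  -- dict mapping base -> background frequency (should sum to 1)
    pseudocount -- total pseudocount per position, shared among the bases
                   in proportion to the background composition
    """
    pwm = []
    for column in counts:
        total = sum(column.values())
        scores = {}
        for base, bg in background.items():
            # Adjusted frequency: observed count plus this base's share of the pseudocount.
            freq = (column.get(base, 0) + pseudocount * bg) / (total + pseudocount)
            scores[base] = math.log2(freq / bg)
        pwm.append(scores)
    return pwm

# Hypothetical example: 8 aligned binding sites, two positions.
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
counts = [
    {"A": 8, "C": 0, "G": 0, "T": 0},  # fully conserved position
    {"A": 2, "C": 2, "G": 2, "T": 2},  # uninformative position
]
pwm = pwm_with_pseudocount(counts, background)
for position, scores in enumerate(pwm):
    print(position, {base: round(score, 3) for base, score in scores.items()})
```

Without the pseudocount, the zero counts at the conserved position would give a score of minus infinity; with it, unobserved bases receive a finite penalty, and the uninformative position scores approximately zero for every base.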

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹ and Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) obtained using a Solexa sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, genic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for the 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell type under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
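As a rough illustration of the 1st/2nd promoter definition, the following Python sketch ranks the promoters of each gene by tag count and compares the ranking between two samples. The data layout, the function names `rank_promoters` and `discrepancy`, the tie-breaking rule, and in particular the way discrepancy is computed (fraction of shared multi-promoter genes whose 1st/2nd pair differs) are assumptions for illustration; the text does not specify the exact procedure used.

```python
def rank_promoters(tag_counts):
    """Return gene -> (1st promoter, 2nd promoter) ranked by tag count.

    tag_counts -- dict mapping gene -> dict mapping promoter id -> number of tags.
    The 2nd promoter is None for genes with a single promoter.
    Ties are broken by promoter id (an arbitrary choice made for this sketch).
    """
    ranked = {}
    for gene, promoters in tag_counts.items():
        ordered = sorted(promoters, key=lambda p: (-promoters[p], p))
        first = ordered[0] if ordered else None
        second = ordered[1] if len(ordered) > 1 else None
        ranked[gene] = (first, second)
    return ranked

def discrepancy(ranked_a, ranked_b):
    """Fraction of genes, with two ranked promoters in both samples,
    whose (1st, 2nd) promoter pair differs between the samples."""
    shared = [
        gene for gene in ranked_a
        if gene in ranked_b
        and None not in ranked_a[gene]
        and None not in ranked_b[gene]
    ]
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if ranked_a[gene] != ranked_b[gene])
    return differing / len(shared)

# Hypothetical tag counts for two cell types.
tags_cell_a = {"geneX": {"p1": 120, "p2": 30, "p3": 5},
               "geneY": {"p1": 10, "p2": 40},
               "geneZ": {"p1": 7}}
tags_cell_b = {"geneX": {"p1": 80, "p2": 95},
               "geneY": {"p1": 3, "p2": 60},
               "geneZ": {"p1": 9}}

ranked_a = rank_promoters(tags_cell_a)
ranked_b = rank_promoters(tags_cell_b)
print(ranked_a["geneX"], ranked_b["geneX"])
print(discrepancy(ranked_a, ranked_b))
```

In this toy data, geneX swaps its 1st and 2nd promoters between the two samples while geneY does not, and geneZ is excluded because it has only one promoter.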

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome-sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has previously been reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common among 16 eukaryotes, but that two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent, recently published large-scale sets of nucleosome mapping data, we found that (1) there are one or two fixed slots for nucleosome positioning within the Alu element, and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
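To make the idea of detecting a dinucleotide periodicity by Fourier analysis concrete, here is a minimal Python sketch. It evaluates the spectral power of a dinucleotide indicator signal at one chosen period; the function name `periodicity_power`, the normalization, and the synthetic test sequence are assumptions for illustration, and the actual analysis pipeline used in the study is not described in the text.

```python
import cmath
import math

def periodicity_power(sequence, dinucleotide, period):
    """Spectral power of a dinucleotide's occurrence signal at a given period (bp).

    The signal is 1 at every position where the dinucleotide starts and 0 elsewhere.
    The power is normalized by the number of occurrences, so a perfectly periodic
    signal with n occurrences gives a value of n.
    """
    seq = sequence.upper()
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinucleotide]
    if not positions:
        return 0.0
    # Single-frequency discrete Fourier sum over the occurrence positions.
    total = sum(cmath.exp(-2j * math.pi * pos / period) for pos in positions)
    return abs(total) ** 2 / len(positions)

# Synthetic sequence in which "AA" recurs exactly every 10 bp.
synthetic = "AACGTCGTCG" * 20

power_at_10 = periodicity_power(synthetic, "AA", 10)
power_at_7 = periodicity_power(synthetic, "AA", 7)
print(round(power_at_10, 3), round(power_at_7, 3))
```

Scanning `period` over a range of values (for example 2 to 200 bp) and comparing the profile before and after masking a class of repeats would be one way to see whether a peak is attributable to those repeats.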

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has mostly been estimated by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effects of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of conserved pathway maps and organism-specific pathway maps. The conserved pathway maps are those that contain more orthologous genes among all pathway maps, and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized so as to sustain life from external nutrients alone, as living cells do. Therefore, the organism-specific pathway maps are detected as those that can synthesize, from nutrients, the compounds required by the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific rather than conserved, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us clues for predicting the structures of protein complexes. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes of the same protein in the PDB. Consequently, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, taking multiple complexes into consideration at the same time by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works in typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting crystal structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarity of hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect on amino acid composition of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, as their occurrence decreased, the buried fraction of these hydrophilic residues increased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately where protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this method. The method predicts the disorder tendency of each residue using support vector machines that take as input the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita and Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configuration of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, a ribonuclease that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental to both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also under way.

Publications

Chiba, H., Yamashita, R., Kinoshita, K. and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R. and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K. and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T. and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K. and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y. and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T. and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M. and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H. and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H. and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M. and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R. and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T. and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H. and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H. and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T. and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S. and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H. and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉, 木下賢吾, 伊藤暢聡. ペプチジルプロリルイソメラーゼの構造機能相関. 蛋白質核酸酵素 54: 167-172, 2009.

木下賢吾. 立体構造からのタンパク質機能予測: 現状と展望. 遺伝子医学MOOK 14号, in press.

中井謙太, ポール・ホートン. 第3章 3. アミノ酸配列に基づくタンパク質の細胞内局在予測. 実験医学増刊 vol. 26: 1106-1112, 2008.

中井謙太. タンパク質のシステム生物学. 猪飼・伏見・卜部・上野川・中村・浜窪編, タンパク質の事典, 朝倉書店, 575-578, 2008.

Human Genome Center

Department of Public Policy

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, LL.M.

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted comparative political studies on stem cell research and on homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held the SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas surrounding the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulatory direction between South Korea and Japan. In South Korea, after Hwang Woo-suk's fabrication of stem cell cloning results and the unethical collection of human eggs came to light, the bioethics law was revised and the government has sought stricter regulation of life science and healthcare. We have found some correlations among East Asian countries between political options on stem cell research and on genetic testing in terms of regulation.

2. Establishment of Office of Research Ethics (ORE)

Following the Dean's courageous decision, IMSUT established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After surveying past ethical reviews and conducting a comparative study of research ethics review systems in the US, the UK and South Korea, we identified the current problems that tend to stall the review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" on consent and research protocols and for the pre-review of research ethics for all research involving human subjects. We will start communicating with the research ethics review divisions established by other research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been worked out well in advance and fully complies with the government's ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about constructing a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives to participate and for preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, although MCs explain that this project has no plans to disclose personal genotype data to each participant, a certain number of participants responded that they now want to see their own genotype data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even when participants forget the fact that they gave consent for research, MCs explain, encourage and thank participants each time, and participants recall their will to contribute.

To show our appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current form of the BioBank newsletter is accessible only to sighted people with good eyesight, we are making efforts toward personalized information accessibility that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim to share public needs through interactive communication between researchers and the public are to be promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our intent for "SciArt Café" is to promote communication between scientists and those who do not regularly engage with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in the IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing the 3rd session for early summer 2009.

Publications

1. Ishiyama I, Nagai A, Muto K, Tamakoshi A, Kokado M, Mimura K, Tanzawa T, Yamagata Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


Genetic heterogeneity of human beings is one of the most important targets of post-genomic research. Genome-wide association studies are being actively carried out using genetic polymorphism markers to identify disease-related loci. We focus on the development of new methods to interpret this heterogeneity and to map disease-associated loci, and we collaborate with research groups on data mining in their genetic epidemiology studies.

1. The development of new methods to map disease-associated loci with genetic polymorphisms

Ryo Yamada

Genome-wide association (GWA) studies are yielding many useful findings. The scale of such studies is increasing along with rapid progress in genotyping technology. This increase in scale necessarily increases the degree of dependence among individual tests in GWA studies. The inter-test dependence is problematic because almost all conventional statistical methods assume independence among multiple tests. Besides the multiple sources of inter-test dependency, the variable inflation of test statistics due to biased sampling from a structured population is one of the unavoidable consequences of enlarged sample size. These problems, which complicate the interpretation of GWA study data, are mutually related, and there is no straightforward solution to all of them together. We first decompose the difficulty into parts, i.e., the problems of linkage disequilibrium (LD), population structure, multiple genetic models, and study design, characterize each problem, and propose solutions to the individual problems; we then attempt to improve the interpretation of GWA study data as a whole.

a. Test statistics correction for data from a structured population

Genetic epidemiology studies on complex genetic traits target relatively weak factors, which means that their sample sizes should be in the thousands or more; this in turn makes ideal random sampling from a homogeneous population impossible. Test statistics from studies in a heterogeneous, in other words structured, population tend to give false positive results. One method to correct for the increase in false positives is the genomic control method for the chi-square distribution. We modify the genomic control method so that it can correct Fisher's exact test statistics.
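As an illustration of the conventional genomic control method mentioned above, the following is a minimal sketch for 1-degree-of-freedom chi-square statistics. It is not the modified method for Fisher's exact test developed in this project; the function names and the choice to cap the inflation factor at 1 are assumptions made for this example.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Estimate the genomic control inflation factor (lambda).

    Lambda is the observed median statistic divided by the median
    expected under the null. Values below 1 are raised to 1 so that
    statistics are never inflated (an assumption of this sketch).
    """
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    return max(lam, 1.0)


def genomic_control(chi2_stats):
    """Divide every statistic by the estimated inflation factor."""
    lam = inflation_factor(chi2_stats)
    return [x / lam for x in chi2_stats]
```

In practice the inflation factor would be estimated from a large number of markers assumed to be unassociated with the trait.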

b. Characterization of the exact 2×3 test for SNP case-control association test data

The 2×3 contingency table test of SNP data is the basic unit of genome-wide association studies. We investigate the factors that affect the discrepancy between the asymptotic test and the exact test for 2×3 contingency tables.

Human Genome Center

Laboratory of Functional Genomics
ゲノム機能解析分野

Visiting Professor Gregory Mark Lathrop, Ph.D.
Associate Professor Ryo Yamada, M.D., Ph.D.

客員教授 理学博士 グレゴリー・マーク・ラスロップ
准教授 医学博士 山田 亮

c. Geometric evaluation of SNP contingency table tests

We describe the 2×3 SNP contingency table tests in the context of geometry, characterize the various tests for 2×3 tables, and define tests that fit biological models by interpreting the tables geometrically.
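For readers unfamiliar with the 2×3 table discussed in the sections above, the following minimal sketch computes the asymptotic (Pearson chi-square) test for a case-control genotype table. The table layout (rows: cases and controls; columns: the three SNP genotypes) and the function name are assumptions for illustration; the exact and geometric tests studied here are not reproduced.

```python
import math


def chi2_2x3(table):
    """Pearson chi-square test for a 2x3 genotype count table.

    `table` is [[case counts for the 3 genotypes],
                [control counts for the 3 genotypes]].
    Every row and column total must be non-zero.
    Returns (statistic, asymptotic p-value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With (2-1)*(3-1) = 2 degrees of freedom, the chi-square
    # survival function has the closed form exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value
```

For small expected counts this asymptotic p-value can differ noticeably from the exact one, which is the kind of discrepancy investigated in section b.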

2. The development of new methods to interpret genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize its heterogeneity. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited by the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to LD, and the use of the genetic history of the population together with LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pairwise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, for use in various studies based on genetic heterogeneity.
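The conventional pairwise measures of LD referred to above can be sketched as follows for two diallelic markers. This shows only the standard D, |D'|, and r² statistics, not the multi-marker method under development; the function and parameter names are chosen for this example.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Standard pairwise LD statistics for two diallelic markers.

    p_ab: frequency of the haplotype carrying allele A at marker 1
          and allele B at marker 2
    p_a, p_b: frequencies of alleles A and B (both strictly
              between 0 and 1, i.e. both markers polymorphic)
    Returns (D, |D'|, r^2).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2
```

Both |D'| and r² are 0 when the two markers are independent and 1 when only two of the four possible haplotypes are present with matching allele frequencies.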

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempt to treat haplotype data from the standpoint of combinatorial theory, and we investigated the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides developing new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups inside and outside the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and the Centre National de Génotypage, Evry, France, on the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies keep changing, the analysis tools have to be updated continually. Genetic epidemiology study groups far outnumber genetic statistics groups, both worldwide and in Japan. We opened a web site that distributes a basic tool for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp/

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.

Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1273-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role, represented by networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissue and developmental stage. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井 謙太
准教授 理学博士 木下 賢吾

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles share some structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and positioning, both relative to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable to mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe², and Kenta Nakai

While CpG islands are often linked to promoters in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters: they tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
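The two sequence properties mentioned above, G+C content and CpG richness, are commonly quantified as in the following minimal sketch. The observed/expected CpG ratio used here is the widely used textbook formula; whether the study applied exactly this formula, and over which window sizes, is not stated in this report, so treat it as an assumption.

```python
def gc_content(seq):
    """Fraction of G and C bases in a non-empty DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio of a DNA sequence.

    Computed as (number of CG dinucleotides * length) /
    (number of C * number of G). Returns 0.0 when the sequence
    contains no C or no G.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    cpg_count = seq.count("CG")
    return cpg_count * len(seq) / (c_count * g_count)
```

Evaluating both functions in a sliding window around each TSS would show how far the G+C- and CpG-rich range extends.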

5. Computational verification of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³, and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from indicating whether a regulation is direct or indirect. In this study, we computationally detect the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we utilized position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The resulting regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data from a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establishes the body plan but also the peripheral components that actually make up the building blocks of the body.
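The scanning step described above can be sketched as follows. This is only an illustration of scoring each promoter window against a position weight matrix and keeping hits at or above a threshold; the data layout and function name are hypothetical, only one strand is scanned, and the ORI-based choice of the threshold is not reproduced here.

```python
def scan_promoter(seq, pwm, threshold):
    """Find candidate binding sites on one strand of a promoter.

    seq: promoter sequence (string over A, C, G, T)
    pwm: list with one dict per motif position, mapping each base
         to its log-odds score (hypothetical layout for this sketch)
    threshold: minimum total score for a window to be reported
    Returns a list of (start position, score) tuples.
    """
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        # Bases absent from a column (e.g. N) make the window unscorable.
        score = sum(column.get(base, float("-inf"))
                    for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits
```

A gene would then be treated as a candidate direct target of a factor when its promoter contains at least one hit; scanning the reverse complement as well would be a natural extension.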

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴, and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
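The way a pseudocount enters a PWM, as described above, can be sketched as follows. The pseudocount for each position is split among the bases according to the background composition before the log likelihood ratio is taken. The data layout, the function name, and the use of base-2 logarithms are assumptions for this example; the default of 0.8 follows the value suggested in the text.

```python
import math


def pwm_from_counts(counts, background, pseudocount=0.8):
    """Build a log-odds PWM from per-position base counts.

    counts: list with one dict per motif position, mapping base to
            the number of known binding sites with that base
    background: dict mapping each base to its background frequency
                (all entries must be positive)
    pseudocount: total pseudocount allocated to each position
    Returns a list of dicts mapping base to log2 likelihood ratio.
    """
    pwm = []
    for column in counts:
        n_sites = sum(column.values())
        scores = {}
        for base, bg_freq in background.items():
            # Each base receives its background share of the pseudocount.
            freq = ((column.get(base, 0) + pseudocount * bg_freq)
                    / (n_sites + pseudocount))
            scores[base] = math.log2(freq / bg_freq)
        pwm.append(scores)
    return pwm
```

Without the pseudocount, a base never observed at a position would receive a score of minus infinity, so a single mismatch would exclude an otherwise good site.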

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, and Kenta Nakai

To support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75918 promoters. These promoters could be classified into 36251 in gene regions and 39667 in intergenic regions. The former, intragenic promoters corresponded to 14307 genes, of which 5428 have one promoter and 8879 have more than one. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average discrepancy in the 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
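The definition of 1st and 2nd promoters given above can be sketched as follows. The input layout and function names are hypothetical, and the discrepancy measure shown (the fraction of shared genes whose 1st/2nd assignment differs between two samples) is an assumption for illustration, since the report does not spell out how the percentage was computed.

```python
def rank_promoters(tag_counts):
    """Assign the 1st and 2nd promoter of each gene by tag count.

    tag_counts: dict mapping gene -> dict mapping promoter id ->
                number of tags (each gene has at least one promoter)
    Returns a dict mapping gene -> (1st promoter, 2nd promoter or None).
    """
    ranked = {}
    for gene, promoters in tag_counts.items():
        ordered = sorted(promoters, key=lambda p: promoters[p], reverse=True)
        second = ordered[1] if len(ordered) > 1 else None
        ranked[gene] = (ordered[0], second)
    return ranked


def discrepancy(ranked_a, ranked_b):
    """Fraction of shared genes whose (1st, 2nd) assignment differs."""
    shared = ranked_a.keys() & ranked_b.keys()
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if ranked_a[gene] != ranked_b[gene])
    return differing / len(shared)
```

Comparing every pair of samples in this way and averaging within and between cell types would give figures of the kind quoted above.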

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita, and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but that two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
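The kind of Fourier analysis mentioned above can be illustrated with a minimal sketch that measures how strongly a dinucleotide recurs at a given period in a sequence fragment. The choice of dinucleotide, the mean-centering, and the normalization are assumptions made for this example and are not taken from the study.

```python
import cmath
import math


def dinucleotide_signal(seq, dinucleotide="AA"):
    """Indicator signal: 1.0 where the dinucleotide starts, else 0.0."""
    seq = seq.upper()
    return [1.0 if seq[i:i + 2] == dinucleotide else 0.0
            for i in range(len(seq) - 1)]


def power_at_period(signal, period):
    """Spectral power of a non-empty signal at the given period (in bp).

    The signal is mean-centered, projected onto a complex sinusoid
    with the requested period, and the squared magnitude is divided
    by the signal length.
    """
    n = len(signal)
    mean = sum(signal) / n
    component = sum((x - mean) * cmath.exp(-2j * math.pi * k / period)
                    for k, x in enumerate(signal))
    return abs(component) ** 2 / n
```

Evaluating `power_at_period` over a range of periods gives a spectrum in which a peak near 10 bp, 84 bp, or 167 bp would correspond to the periodicities discussed above.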

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has mostly been estimated by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of conserved pathway maps and organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps, and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized so as to sustain life from external nutrients alone, as living cells do. The organism-specific pathway maps are then detected as those that can synthesize, from nutrients, the compounds required by the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita: ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue to predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.
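To make the notion of a coexpression measure concrete, the following minimal sketch computes the Pearson correlation between expression profiles and a rank-based score derived from it. The report does not name the new measure, so the rank-based score shown here (the geometric mean of the two directional correlation ranks) and all function names are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_ranks(profiles, gene):
    """Rank all other genes by correlation with `gene` (1 = highest)."""
    others = [g for g in profiles if g != gene]
    others.sort(key=lambda g: pearson(profiles[gene], profiles[g]),
                reverse=True)
    return {g: rank for rank, g in enumerate(others, start=1)}


def mutual_rank(profiles, gene_a, gene_b):
    """Geometric mean of the rank of b for a and the rank of a for b."""
    rank_ab = coexpression_ranks(profiles, gene_a)[gene_b]
    rank_ba = coexpression_ranks(profiles, gene_b)[gene_a]
    return math.sqrt(rank_ab * rank_ba)
```

A smaller score indicates a tighter coexpression relationship; a rank-based score of this kind is less sensitive to the overall distribution of correlation values than the raw correlation.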

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida, and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes of the same protein in the PDB. Therefore, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷, and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarity of hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida, and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect on amino acid composition of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased along with this reduction. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately, as protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for it. The method predicts the disorder tendency of each residue using support vector machines, from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configuration of the PPIase active site against known protein structures and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that catalyzes ribonuclease activity and has no PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: a co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19 777 and 21 036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida, and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity in the membrane, such as lipid rafts and transportsome regions. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the whole-membrane undulation in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also under way.

Publications

Chiba H, Yamashita R, Kinoshita K, and Nakai K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl Acids Res 36: D793-D799, 2008.

Hatada I, Morita S, Kimura M, Horii T, Yamashita R, and Nakai K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J Human Genet 53 (2): 185-191, 2008.

Hatanaka Y, Nagasaki M, Yamaguchi R, Obayashi T, Numata K, Imoto S, Shimamura T, Kinoshita K, Nakai K, and Miyano S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi M, Ishida T, and Kinoshita K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res 37: D360-364, 2009.

Ikura T, Kinoshita K, and Ito N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng Des Sel 21: 83-89, 2008.

Ishida T and Kinoshita K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda M and Kinoshita K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J Mol Graph Mod 27: 706-711, 2009.

Miura K, Toh H, Hirakawa H, Sugii M, Murata M, Nakai K, Tashiro K, Kuhara S, Azuma Y, and Shirai M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res 15 (2): 83-91, 2008.

Murakami K, Imanishi T, Gojobori T, and Nakai K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida K, Frith M, and Nakai K. Pseudocounts for transcription factor binding sites. Nucl Acids Res 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi T, Hayashi S, Shibaoka M, Saeki M, Ohta H, and Kinoshita K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res 36: D77-82, 2008.

Obayashi T, Hayashi S, Saeki M, Ohta H, and Kinoshita K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-991, 2009.

Okamura K and Nakai K. Retrotransposition as a source of new promoters. Mol Biol Evol 25 (6): 1231-1238, 2008.

Sierro N, Makita Y, de Hoon M, and Nakai K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl Acids Res 36: D93-D96, 2008.

Sierro N, Li S, Suzuki Y, Yamashita R, and Nakai K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota M, Ishida T, and Kinoshita K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci 17: 1596-1602, 2008.

Tsuchihara K, Suzuki Y, Wakaguri H, Irie T, Tanimoto K, Hashimoto S, Matsushima K, Mizushima-Sugano J, Yamashita R, Nakai K, Bentley D, Esumi H, and Sugano S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl Acids Res, in press.

Tsuchiya Y, Nakamura H, and Kinoshita K. Discrimination between biological interfaces and crystal-packing contacts. Compt Biol Chem 1: 99-113, 2008.

Vandenbon A, Miyamoto Y, Takimoto N, Kusakabe T, and Nakai K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res 15 (1): 3-11, 2008.

Vandenbon A and Nakai K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur J and Ng S-K (Imperial College Press, London), vol 21, pp 188-199, 2008.

Wakaguri H, Yamashita R, Suzuki Y, Sugano S, and Nakai K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl Acids Res 36: D97-D101, 2008.

Yamashita R, Suzuki Y, Takeuchi N, Wakaguri H, Ueda T, Sugano S, and Nakai K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl Acids Res 36 (11): 3707-3715, 2008.

Kinoshita K, Kono H, and Yura K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki J (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉,木下賢吾,伊藤暢聡.ペプチジルプロリルイソメラーゼの構造機能相関.蛋白質核酸酵素 54: 167-172, 2009.

木下賢吾.立体構造からのタンパク質機能予測:現状と展望.遺伝子医学MOOK 14号, in press.

中井謙太,ポール・ホートン.第3章 3 アミノ酸配列に基づくタンパク質の細胞内局在予測.実験医学増刊 vol 26: 1106-1112, 2008.

中井謙太.タンパク質のシステム生物学.猪飼,伏見,卜部,上野川,中村,浜窪編.タンパク質の事典.朝倉書店,575-578, 2008.


Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal, and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory differences between South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and unethical human egg collection, the bioethics law was revised, and the government is seeking stricter regulation of life science and healthcare. We found some correlations among political options on stem cell research and genetic testing in terms of regulation in East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, the IMSUT has established the Office of Research Ethics (ORE) for supporting research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey on past ethical reviews and a comparative study on research ethics review systems in the US, the UK, and South Korea, we examined our current problems, which tend to stall a smooth research review process, so as to secure quality assurance of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs for obtaining free and fully informed consent from participants. We conducted our questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing dropout of participants from the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To show appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of BioBank newsletters are available only to sighted people with good eyesight, we are making efforts toward personalized information security to accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim at sharing public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in the IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama I, Nagai A, Muto K, Tamakoshi A, Kokado M, Mimura K, Tanzawa T, Yamagata Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9. 武藤香織.「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―.日米の医療―制度と倫理.杉田米行編.大阪大学出版会.203-224, 2008.

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008


crepancy between the asymptotic test and the exact test for 2×3 contingency tables.

c. Geometric evaluation of SNP contingency table tests

The 2×3 SNP contingency table tests are described in the context of geometry. We characterize various tests for 2×3 tables and define tests fit for biological models by interpreting tables in the context of geometry.
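For readers unfamiliar with tests on 2×3 SNP genotype tables, the sketch below computes two standard asymptotic statistics: the Pearson chi-square statistic (2 degrees of freedom) and the Cochran-Armitage trend statistic (1 degree of freedom). These are textbook tests chosen for illustration, not the geometric formulation described above; the genotype counts and the additive weights (0, 1, 2) are assumptions.

```python
def chi_square_2x3(table):
    """Pearson chi-square statistic for a 2x3 table [[cases], [controls]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(3)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def trend_test_2x3(table, weights=(0, 1, 2)):
    """Cochran-Armitage trend statistic (chi-square form, 1 df)."""
    cases, controls = table
    r1, r2 = sum(cases), sum(controls)
    n = r1 + r2
    cols = [cases[j] + controls[j] for j in range(3)]
    # T = sum_j w_j * (cases_j * R2 - controls_j * R1)
    t = sum(w * (a * r2 - b * r1) for w, a, b in zip(weights, cases, controls))
    # Var(T) under the null hypothesis of no association
    squares = sum(w * w * c * (n - c) for w, c in zip(weights, cols))
    cross = sum(
        weights[i] * weights[j] * cols[i] * cols[j]
        for i in range(3)
        for j in range(i + 1, 3)
    )
    variance = r1 * r2 / n * (squares - 2 * cross)
    return t * t / variance


# Hypothetical genotype counts (AA, Aa, aa) for cases and controls.
table = [[10, 20, 30], [30, 20, 10]]
print(chi_square_2x3(table), trend_test_2x3(table))
```

Different weight vectors correspond to different genetic models (for example dominant or recessive), which is one way the choice of test relates to the biological model.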

2. The development of new methods to interpret the genetic heterogeneity

Ryo Yamada

As a compound in nature, the DNA sequence is under pressure to maximize the heterogeneity of the sequence. Under the most random condition, all bases of the sequence would be polymorphic, and all bases and all sets of bases would be mutually independent. At the other extreme, under the least random condition, all DNA molecules would be clones. In living organisms, the number of polymorphic sites in the DNA sequence is limited due to the requirements for reproduction and as a result of selection and genetic drift, against which opposing forces (e.g., mutation and recombination) act to increase heterogeneity. A major research target following the completion of the genome sequence is the investigation of intra-species variations, among which diallelic single nucleotide polymorphisms are the most common.

a. Quantitation of linkage disequilibrium of multiple markers

Genetic variations within a population give rise to linkage disequilibrium (LD), and the use of the genetic history of the population and LD mapping is a very promising method for identifying the genetic backgrounds of various phenotypes. LD is a measure of inter-marker dependence. Although inter-marker dependence exists among any set of markers, usually only pair-wise inter-marker dependence is utilized for quantitation of genetic heterogeneity and for genetic epidemiology studies. We are developing a new method to quantify the heterogeneity and complexity of a population of DNA sequences with SNPs, in support of various studies based on genetic heterogeneity.
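The conventional pair-wise measures mentioned above can be sketched as follows. This is a minimal illustration of the standard D, |D'| and r² statistics for two diallelic markers, not the new multi-marker method; the haplotype encoding (0/1 alleles) and the sample data are assumptions.

```python
def pairwise_ld(haplotypes):
    """Return (D, |D'|, r^2) for two diallelic markers.

    haplotypes: list of (allele_at_marker1, allele_at_marker2) with alleles 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n          # frequency of allele 1 at marker 1
    p_b = sum(h[1] for h in haplotypes) / n          # frequency of allele 1 at marker 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both markers must be polymorphic")
    r2 = d * d / denom
    # D is normalized by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d, abs(d) / d_max, r2


# Hypothetical sample of ten haplotypes.
sample = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0), (0, 1)]
print(pairwise_ld(sample))
```

Because these statistics look at only two markers at a time, they cannot capture higher-order dependence among three or more markers, which is the limitation the paragraph above points to.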

b. Geometric expression of haplotype populations

Haplotypes consist of alleles of multiple markers. We attempted to treat haplotype data from the standpoint of combinatorial theory and investigated the utility of polyhedral handling of the combinatorial aspects of haplotypes.

3. Collaboration with genetic epidemiology research groups

Gregory Mark Lathrop and Ryo Yamada

Besides the development of new methods to analyze genetic polymorphism data in the context of population genetics and genetic statistics, we collaborate with multiple research groups in and out of the IMS-UT, including Kyoto University, Kyoto; The University of Tokyo Hospital, Tokyo; Laboratory for Autoimmune Diseases, CGM, RIKEN, Yokohama; National Hospital Organization Sagamihara National Hospital, Sagamihara; and The Centre National de Génotypage, Evry, France, for the interpretation of genetic epidemiology data with conventional statistical methods.

4. Public distribution of population genetics and genetic association study tools

Ryo Yamada

Because the designs of genetic epidemiology studies have been changing, the analysis tools have to be updated all the time. The number of genetic epidemiology study groups is much larger than the number of groups working on genetic statistics, both in the world and in Japan. We opened a web site that distributes basic tools for linkage disequilibrium mapping for public use. This distribution is supported by a grant from the Japan Society for the Promotion of Science on the permutation test.

Web-site URL: http://func-gen.hgc.jp

Publications

Gotoh N, Yamada R, Matsuda F, Yoshimura N, and Iida T. Manganese Superoxide Dismutase Gene (SOD2) Polymorphism and Exudative Age-related Macular Degeneration in the Japanese Population. Am J Ophthalmol 146: 146, 2008.

Nakayama-Hamada M, Suzuki A, Furukawa H, Yamada R, and Yamamoto K. Citrullinated fibrinogen inhibits thrombin-catalyzed fibrin polymerization. J Biochem 144: 393-8, 2008.

Okada Y, Mori M, Yamada R, Suzuki A, Kobayashi K, Kubo M, Nakamura Y, and Yamamoto K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J Rheumatol 35: 1273-8, 2008.

Shimane K, Kochi Y, Yamada R, Okada Y, Suzuki A, Miyatake A, Kubo M, Nakamura Y, and Yamamoto K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann Rheum Dis (in press).

Suzuki A, Yamada R, Kochi Y, Sawada T, Okada Y, Matsuda K, Kamatani Y, Mori M, Shimane K, Hirabayashi Y, Takahashi A, Tsunoda T, Miyatake A, Kubo M, Kamatani N, Nakamura Y, and Yamamoto K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat Genet 40: 1224-9, 2008.

Yamada R. Primer: SNP-associated studies and what they can teach us. Nat Clin Pract Rheumatol 4: 210-7, 2008.

Yamada R and Okada Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet Epidemiol 33: 114-27, 2009.


Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

教授 理学博士 中井謙太
准教授 理学博士 木下賢吾

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to the understanding of its cellular role represented by the networks of inter-gene interaction.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 2.53 in "gonad" to 19.53 in "endostyle", and during development they increased from 1.68 at the "egg" stage to 7.55 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita and Kenta Nakai: 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole genome alignments between Ciona intestinalis and Ciona savignyi. The descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles contain some shared structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus not only on the presence of TFBSs but also on their orientation and positioning with regard to the transcription start site, and also between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them to make a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2 and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by a combination of the oligo-capping method and massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental result with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
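The two sequence properties discussed above, G+C content and CpG richness, are commonly quantified as shown in the sketch below. The observed/expected CpG ratio used here is the widely used textbook definition and is an assumption; the text does not state which exact criteria were applied to ascidian promoters, and the example sequences are hypothetical.

```python
def gc_content(sequence):
    """Fraction of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(sequence):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = sequence.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def sliding_profile(sequence, window, step):
    """G+C content and CpG ratio in windows, e.g. around a TSS."""
    return [
        (start, gc_content(sequence[start:start + window]),
         cpg_obs_exp(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]


# Hypothetical promoter fragment: AT-rich flanks around a CpG-rich core.
fragment = "ATATATTTAA" + "CGCGGCGCGC" + "TTAATATATA"
for start, gc, ratio in sliding_profile(fragment, window=10, step=10):
    print(start, round(gc, 2), round(ratio, 2))
```

Scanning such a profile across positions relative to the TSS is one simple way to see how wide the CpG-rich region is.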

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3 and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiment itself prevent the blueprint from giving further information regarding direct or indirect regulation. In this study, we are computationally detecting direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we utilized position weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we are predicting the binding sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that not only the kernel components of the regulatory network that makes the body plan but also the peripheral components that actually make the building blocks of the body are included.
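The scanning step described above can be sketched as a sliding-window score against a position weight matrix with a fixed threshold. The matrix values, the threshold, and the function name `scan_promoter` are hypothetical, and the sketch does not implement the over-representation index used to choose the threshold, since its definition is not given here.

```python
def scan_promoter(sequence, pwm, threshold):
    """Return (position, score) for every window scoring at or above threshold.

    pwm: list of dicts, one per motif position, mapping base -> log-odds score.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def column(preferred):
    """Toy log-odds column that favors a single base (illustrative values)."""
    return {base: (1.0 if base == preferred else -1.0) for base in "ACGT"}


# Hypothetical three-position matrix favoring the motif ACG.
toy_pwm = [column("A"), column("C"), column("G")]
print(scan_promoter("TTACGTACGA", toy_pwm, threshold=2.0))
```

Lowering the threshold increases the number of predicted target sites at the cost of more false positives, which is why the choice of threshold matters for network reconstruction.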

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith4 and Kenta Nakai: 4CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated for each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
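The construction described above, where a pseudocount is split among the four bases according to the background composition before taking the log likelihood ratio, can be sketched as follows. The count matrix and the uniform background are hypothetical; the default of 0.8 is the value suggested in the text, and base-2 logarithms are an assumption.

```python
import math

UNIFORM_BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds_pwm(counts, pseudocount=0.8, background=UNIFORM_BACKGROUND):
    """Convert per-position base counts into a log-odds PWM.

    counts: list of dicts, one per position, mapping base -> observed count.
    Each base receives pseudocount * background[base] extra counts.
    """
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount
        scores = {}
        for base, bg in background.items():
            probability = (column.get(base, 0) + pseudocount * bg) / total
            scores[base] = math.log2(probability / bg)
        pwm.append(scores)
    return pwm


# Hypothetical counts from four known binding sites at two positions.
counts = [
    {"A": 4, "C": 0, "G": 0, "T": 0},  # fully conserved position
    {"A": 1, "C": 1, "G": 1, "T": 1},  # uninformative position
]
for position in log_odds_pwm(counts):
    print({base: round(score, 3) for base, score in position.items()})
```

Without the pseudocount, any base never observed at a position would receive a score of minus infinity, so a single unseen base would veto an otherwise good match.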

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data. From these data, we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the 2nd largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for 1st and 2nd promoters was 28.3%. On the other hand, we observed a 9.6% difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in downstream regions of 1st promoters.
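The definition of 1st and 2nd promoters by tag count can be sketched as below, together with one possible way to measure the discrepancy between two samples. The discrepancy definition used here (the fraction of multi-promoter genes whose 1st/2nd assignment differs) is an assumption, since the text does not spell it out; the gene names, promoter names, and tag counts are hypothetical.

```python
def top_two_promoters(tag_counts):
    """Return (1st, 2nd) promoter ids ranked by tag count; 2nd may be None."""
    ranked = sorted(tag_counts, key=lambda p: (-tag_counts[p], p))
    first = ranked[0] if ranked else None
    second = ranked[1] if len(ranked) > 1 else None
    return first, second


def discrepancy_rate(sample_a, sample_b):
    """Fraction of shared multi-promoter genes whose 1st/2nd promoters differ."""
    shared = [
        gene for gene in sample_a
        if gene in sample_b and len(sample_a[gene]) > 1 and len(sample_b[gene]) > 1
    ]
    if not shared:
        return 0.0
    differing = sum(
        1 for gene in shared
        if top_two_promoters(sample_a[gene]) != top_two_promoters(sample_b[gene])
    )
    return differing / len(shared)


# Hypothetical tag counts per gene and promoter in two cell types.
cell_a = {"g1": {"p1": 100, "p2": 40}, "g2": {"p1": 10, "p2": 90}, "g3": {"p1": 5}}
cell_b = {"g1": {"p1": 80, "p2": 60}, "g2": {"p1": 70, "p2": 20}, "g3": {"p1": 9}}
print(discrepancy_rate(cell_a, cell_b))
```

Ties in tag count are broken by promoter name here only to make the ranking deterministic.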

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp periodicity is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
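A Fourier-style test for dinucleotide periodicity, as mentioned above, can be sketched by turning the sequence into a 0/1 signal marking where a chosen dinucleotide occurs and evaluating the spectral power at a given period. The choice of dinucleotides, the normalization, and the synthetic sequence are assumptions for illustration, not the procedure actually applied to the genome fragments.

```python
import cmath
import math


def periodicity_power(sequence, dinucleotides, period):
    """Spectral power of dinucleotide occurrences at the given period (in bp)."""
    signal = [
        1.0 if sequence[i:i + 2] in dinucleotides else 0.0
        for i in range(len(sequence) - 1)
    ]
    mean = sum(signal) / len(signal)
    # Discrete Fourier component at frequency 1/period, after removing the mean.
    component = sum(
        (value - mean) * cmath.exp(-2j * math.pi * i / period)
        for i, value in enumerate(signal)
    )
    return abs(component) ** 2 / len(signal)


# Synthetic sequence with an AA dinucleotide every 10 bp.
synthetic = ("AA" + "CGCGCGCG") * 20
for period in (7, 10, 13):
    print(period, round(periodicity_power(synthetic, {"AA", "TT"}, period), 4))
```

On the synthetic sequence the power at 10 bp is far larger than at the other periods; on real genomic fragments one would scan a range of periods and compare the result with and without masked repeats.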

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell containing only necessary and sufficient components has mostly been estimated by the reduction of the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, like living cells. Then, the organism-specific pathway maps are detected as those that can synthesize the compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda5 and K. Kinoshita: 5National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida, and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes in the PDB for the same protein. A single protein complex is therefore not sufficient to identify its interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. We thus developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.
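
The idea of mapping binding sites from several complexes onto one protein can be illustrated with a minimal sketch. This is not the PiSite implementation: the function name, the input format, and the example contact lists below are hypothetical, and residue contacts are assumed to have been computed beforehand.

```python
from collections import defaultdict


def merge_binding_sites(complexes):
    """Map each residue of one protein to the set of partners it contacts.

    `complexes` is a list of (partner_name, contact_residue_numbers) pairs,
    one pair per complex that contains the protein of interest.
    """
    partners_by_residue = defaultdict(set)
    for partner, residues in complexes:
        for residue in residues:
            partners_by_residue[residue].add(partner)
    return dict(partners_by_residue)


# Hypothetical contact lists for one protein observed in three complexes.
example_complexes = [
    ("partner_A", [10, 11, 12, 45]),
    ("partner_B", [45, 46, 47]),
    ("partner_A", [11, 12, 13]),
]

sites = merge_binding_sites(example_complexes)
# Residues contacted by more than one distinct partner.
shared = sorted(r for r, partners in sites.items() if len(partners) > 1)
print(shared)
```

Keeping a set of partners per residue, rather than a single flag, is what lets residues used in several binding modes be told apart from residues seen with only one partner.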

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷, and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and such structures can contain non-biological interactions due to the nature of crystals. Discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is therefore required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of the complementarity of hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. A subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida, and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect on amino acid composition of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Identifying disordered regions is therefore important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. We thus developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this method. The method predicts the disorder tendency of each residue using support vector machines applied to the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we showed experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that catalyzes ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as the targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. We have therefore constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.
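
As an illustration of what a list of "highly coexpressed genes for every gene" involves, the sketch below ranks genes by the Pearson correlation of their expression profiles. The text above does not state which coexpression measure the database uses, so Pearson correlation is an assumption made here for illustration only; the gene names and expression values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def top_coexpressed(target, profiles, k=2):
    """Return the k genes whose profiles correlate best with `target`."""
    scores = [
        (pearson(profiles[target], profile), gene)
        for gene, profile in profiles.items()
        if gene != target
    ]
    scores.sort(reverse=True)
    return [gene for _, gene in scores[:k]]


# Invented expression values across four samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
print(top_coexpressed("geneA", profiles))
```

Running the same ranking for every gene in turn would give one coexpressed gene list per gene, which is the shape of network type (i).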

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida, and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

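Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

Ikura, T., Kinoshita, K., and Ito, N. Structure-function relationship of peptidyl-prolyl isomerases (in Japanese). Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid and Enzyme) 54: 167-172, 2009.

Kinoshita, K. Protein function prediction from three-dimensional structures: current status and prospects (in Japanese). Idenshi Igaku MOOK No. 14, in press.

Nakai, K. and Horton, P. Chapter 3, Section 3: Prediction of protein subcellular localization based on amino acid sequences (in Japanese). Jikken Igaku (Experimental Medicine), extra issue, vol. 26: 1106-1112, 2008.

Nakai, K. Systems biology of proteins (in Japanese). In: Encyclopedia of Proteins, edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura, and Hamakubo (Asakura Shoten), pp. 575-578, 2008.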


Human Genome Center

Department of Public Policy

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, LL.M.

The Department of Public Policy has three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal, and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After the fabrication scandal over Hwang Woo-suk's stem cell cloning research and the unethical collection of human eggs, the bioethics law in South Korea was revised, and the government now seeks stricter regulation of life science and healthcare. We have found some correlations between political options on stem cell research and genetic testing in terms of regulations across East Asia.

2. Establishment of the Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK, and South Korea, we examined the current problems that tend to stall the research review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review of the research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for "BioBank Japan" project

To support the "BioBank Japan" project, led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective in maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate the MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We have found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To show our appreciation of the contributions of participants and MCs to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to sighted people with good eyesight, we are making efforts toward personalized information security that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

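2. Hong, H. "Sexual protection" of children and measures to prevent sex crimes in South Korean society (in Japanese). Hikakuho Kenkyu (Comparative Law Journal) No. 70, 2009, in press.

3. Kamisato, A. and Narisawa, A. (eds. and authors). Assisted Reproductive Technology (Bioethics and Law: Basic Documents 3) (in Japanese). Shinzansha, pp. 21-123, 262-308, 2008.

4. Chang, C.-F. Regulation and practice of assisted reproductive technology in other countries (Taiwan) (in Japanese). In: Assisted Reproductive Technology (Bioethics and Law: Basic Documents 3), edited by Kamisato, A. and Narisawa, A. Shinzansha, pp. 323-334, 2008.

5. Oue, Y., Kamisato, A., and Shiroyama, H. A comparative analysis of the regulation of animal experiments in the UK and the US: implications for the Japanese regulatory system (in Japanese). Shakai Gijutsu Kenkyu Ronbunshu No. 5: 132-142, 2008.

6. Oue, Y., Narihiro, T., Kamisato, A., Shiroyama, H., and Uchikoshi, A. Attitudes of life science researchers in Japan toward animal experiments: analysis of a questionnaire survey on life science experiments and memorial services for animals (in Japanese). Journal of the Society for the Study of Human-Animal Relations No. 20: 66-73, 2008.

7. Oue, Y., Kamisato, A., and Shiroyama, H. The mode of thinking that underpins the regulation of animal experiments in the UK (in Japanese). Kagaku Gijutsu Shakairon Kenkyu No. 5: 84-92, 2008.

8. Watanabe, M. and Ueda, A. Science and technology that fulfill human needs: an analysis of development sites in assistive engineering (in Japanese). Kagaku Gijutsu Shakai Kenkyu: 138-151, 2008.

9. Muto, K. Japanese and US responses to "demedicalized" predictive genetic testing: from hereditary diseases to nutrigenetic testing (in Japanese). In: Health Care in Japan and the US: Systems and Ethics, edited by Sugita, Y. Osaka University Press, pp. 203-224, 2008.

10. Muto, K. Is DNA paternity testing a remedy for "loose" women? (in Japanese). In: Frontiers of Gender Studies, Vol. 4: Techno/Biopolitics: Science, Medicine, and Technology Today, edited by Tachi, K. Sakuhinsha, pp. 238-264, 2008.

11. Hong, H. What is the problem with egg donation for research? Focusing on the Hwang Woo-suk paper fabrication case in South Korea (in Japanese). In: Frontiers of Gender Studies, Vol. 4: Techno/Biopolitics: Science, Medicine, and Technology Today, edited by Tachi, K. Sakuhinsha, pp. 196-214, 2008.

12. Chang, C.-F. Reproductive technology and Taiwanese society (in Japanese). In: Frontiers of Gender Studies, Vol. 4: Techno/Biopolitics: Science, Medicine, and Technology Today, edited by Tachi, K. Sakuhinsha, pp. 215-222, 2008.

13. Mimura, K., Kokado, M., Muto, K., Chang, C.-F., Hong, H., and Tsuge, A. How a "woman-friendly" machine is made: the example of the gynecological examination table (in Japanese). In: Frontiers of Gender Studies, Vol. 4: Techno/Biopolitics: Science, Medicine, and Technology Today, edited by Tachi, K. Sakuhinsha, pp. 223-240, 2008.

14. Kamisato, A. The debate over assisted reproductive technology: retrospect and prospects (in Japanese). In: Reproductive Technology and the Family, edited by Ienaga, N. Waseda University Press, pp. 42-71, 2008.

15. Watanabe, M. and Ueda, A. (eds. and trans.). The Enhancement Debate: Augmentation of Body and Mind and Advanced Science and Technology (in Japanese). Shakai Hyoronsha, 2008.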


Okada, Y., Mori, M., Yamada, R., Suzuki, A., Kobayashi, K., Kubo, M., Nakamura, Y., and Yamamoto, K. SLC22A4 Polymorphism and Rheumatoid Arthritis Susceptibility: A Replication Study in a Japanese Population and a Metaanalysis. J. Rheumatol. 35: 1273-8, 2008.

Shimane, K., Kochi, Y., Yamada, R., Okada, Y., Suzuki, A., Miyatake, A., Kubo, M., Nakamura, Y., and Yamamoto, K. A single nucleotide polymorphism in the IRF5 promoter region is associated with susceptibility to rheumatoid arthritis in the Japanese patients. Ann. Rheum. Dis. (in press).

Suzuki, A., Yamada, R., Kochi, Y., Sawada, T., Okada, Y., Matsuda, K., Kamatani, Y., Mori, M., Shimane, K., Hirabayashi, Y., Takahashi, A., Tsunoda, T., Miyatake, A., Kubo, M., Kamatani, N., Nakamura, Y., and Yamamoto, K. Functional SNPs in CD244 increase the risk of rheumatoid arthritis in a Japanese population. Nat. Genet. 40: 1224-9, 2008.

Yamada, R. Primer: SNP-associated studies and what they can teach us. Nat. Clin. Pract. Rheumatol. 4: 210-7, 2008.

Yamada, R. and Okada, Y. An optimal dose-effect mode trend test for SNP genotype tables. Genet. Epidemiol. 33: 114-27, 2009.


Human Genome Center

Laboratory of Functional Analysis In Silico

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product, as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role, represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ¹Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 2.53 in "gonad" to 19.53 in "endostyle", and during development they increased from 1.68 at the "egg" stage to 7.55 at the "juvenile" stage. We hypothesize that this enrichment of trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe², Yutaka Suzuki¹, Riu Yamashita, and Kenta Nakai: ²University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs, as well as a new genome browser providing whole-genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, together with sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi, obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles contain some shared structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of the presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we focus our attention not only on the presence of TFBSs, but also on their orientation and positioning with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them into a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans, and verified that it is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe², and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.
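
For readers unfamiliar with how G+C and CpG richness are usually quantified, the sketch below computes the G+C fraction and the widely used observed/expected CpG ratio for a sequence. The text above does not state which definitions were used in this study, so the formulas and the toy sequence are illustrative assumptions rather than the authors' criteria.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    return seq.count("CG") * len(seq) / (c_count * g_count)


toy_sequence = "ACGCGCGTAT"  # invented example, not ascidian data
print(gc_content(toy_sequence), cpg_obs_exp(toy_sequence))
```

Evaluating both functions in a sliding window centred on each TSS would show how far from the TSS the enrichment extends, which is the kind of range comparison the abstract describes.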

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³, and Kenta Nakai: ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from giving further information about whether regulation is direct or indirect. In this study, we are computationally detecting the direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we used positional weight matrices, for which threshold values need to be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we are predicting the binding sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes. Thus, not only the kernel components of the regulatory network that make the body plan, but also the peripheral components that actually make the building blocks of the body, are included.

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴, and Kenta Nakai: ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
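
The way a pseudocount enters a PWM can be made concrete with a small sketch. It follows the description above (a pseudocount per position, split among the bases according to the background composition) and uses the suggested value of 0.8 as the default. The function name, the base-2 logarithm, the uniform background, and the example counts are assumptions made for illustration.

```python
import math


def pwm_from_counts(counts, background, pseudocount=0.8):
    """Build a log-likelihood-ratio PWM from per-position base counts.

    Each position receives `pseudocount` extra observations, distributed
    among the bases in proportion to the background frequencies.
    """
    pwm = []
    for column in counts:
        total = sum(column.values())
        scores = {}
        for base, bg in background.items():
            freq = (column.get(base, 0) + pseudocount * bg) / (total + pseudocount)
            scores[base] = math.log2(freq / bg)
        pwm.append(scores)
    return pwm


# Invented counts from eight aligned sites at two positions.
uniform_background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
site_counts = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 2, "C": 2, "G": 3, "T": 1},
]
pwm = pwm_from_counts(site_counts, uniform_background)
print(round(pwm[0]["A"], 3), round(pwm[0]["C"], 3))
```

Without the pseudocount, the bases never observed at the first position would have a frequency of zero and an undefined (negative infinite) score; with it, they receive a finite penalty instead.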

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS; http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, genic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of the discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed 96 of difference for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
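As a rough illustration of the '1st promoter' and '2nd promoter' definition, the sketch below ranks the promoters of each gene by tag count and compares the ranking between two samples. The tuple layout, the promoter identifiers, and the `discrepancy` measure (fraction of shared genes whose 1st/2nd pair differs) are assumptions made for this example; the report does not specify how its discrepancy percentage was computed.

```python
from collections import defaultdict

def rank_promoters(tag_counts):
    """Map each gene to its (1st, 2nd) promoters, ranked by tag count.

    tag_counts: iterable of (gene, promoter_id, n_tags) tuples.
    """
    by_gene = defaultdict(list)
    for gene, promoter, n_tags in tag_counts:
        by_gene[gene].append((n_tags, promoter))
    ranked = {}
    for gene, promoters in by_gene.items():
        # Largest tag count first; ties broken by promoter id for determinism.
        promoters.sort(key=lambda item: (-item[0], item[1]))
        first = promoters[0][1]
        second = promoters[1][1] if len(promoters) > 1 else None
        ranked[gene] = (first, second)
    return ranked

def discrepancy(ranked_a, ranked_b):
    """Fraction of shared genes whose (1st, 2nd) promoter pair differs."""
    shared = [gene for gene in ranked_a if gene in ranked_b]
    if not shared:
        return 0.0
    differing = sum(1 for gene in shared if ranked_a[gene] != ranked_b[gene])
    return differing / len(shared)

# Hypothetical tag counts for two cell types.
cell_a = [("G1", "P1", 120), ("G1", "P2", 30), ("G2", "P1", 10)]
cell_b = [("G1", "P1", 15), ("G1", "P2", 90), ("G2", "P1", 40)]
ranked_a = rank_promoters(cell_a)
ranked_b = rank_promoters(cell_b)
```

In this toy data, gene G1 swaps its 1st and 2nd promoters between the two cell types while G2 does not.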

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common among the 16 eukaryotes examined, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of primate genomes.
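A Fourier-style periodicity check of the kind mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the choice of dinucleotides (AA, TT, TA), the function names, and the synthetic sequence are hypothetical, and the study's actual procedure is not described in the report. The sketch converts a sequence into a 0/1 dinucleotide indicator signal and measures the spectral power at a chosen period.

```python
import cmath
import math

def dinucleotide_signal(seq, dinucleotides=("AA", "TT", "TA")):
    """Return 1.0 where a listed dinucleotide starts, else 0.0."""
    return [1.0 if seq[i:i + 2] in dinucleotides else 0.0
            for i in range(len(seq) - 1)]

def power_at_period(signal, period):
    """Spectral power of the mean-centered signal at one period (in bp)."""
    n = len(signal)
    mean = sum(signal) / n
    total = sum((x - mean) * cmath.exp(-2j * math.pi * i / period)
                for i, x in enumerate(signal))
    return abs(total) ** 2 / n

# Synthetic sequence with an AA dinucleotide every 10 bp.
synthetic = ("AA" + "GCGCGCGC") * 50
signal = dinucleotide_signal(synthetic)
power_10 = power_at_period(signal, 10)
power_7 = power_at_period(signal, 7)
```

For the synthetic sequence, the power at a 10-bp period is far larger than at an unrelated period such as 7 bp; the same function could be evaluated at 84 or 167 bp on real genome fragments.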

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of conserved pathway maps and organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps, and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized so as to sustain life from external nutrients alone, as living cells do. The organism-specific pathway maps are therefore detected as those that can synthesize, from nutrients, the compounds required by the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda5 and K. Kinoshita; 5National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi6, M. Saeki6, H. Ohta6, K. Kinoshita; 6Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura7 and K. Kinoshita; 7Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.
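To make the SVR definition concrete, the sketch below computes the ratio for an idealized sphere. Real proteins are not spheres and the study used accessible surface areas of actual structures, so this is only an assumption-laden illustration of why larger bodies have a smaller surface-to-volume ratio; the function names are hypothetical.

```python
import math

def surface_to_volume_ratio(surface_area, volume):
    """SVR = surface area / volume (units of 1/length)."""
    return surface_area / volume

def sphere_svr(radius):
    """SVR of an ideal sphere, which simplifies to 3 / radius."""
    area = 4.0 * math.pi * radius ** 2
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return surface_to_volume_ratio(area, volume)

# Radii in angstroms; the SVR falls as the sphere grows.
svr_small = sphere_svr(5.0)
svr_large = sphere_svr(10.0)
```

Under this idealization, doubling the radius halves the SVR, so a larger fraction of the material lies away from the surface.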

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach, and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura8, Kengo Kinoshita, Nobutoshi Ito8; 8Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configuration of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that catalyzes ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi6, M. Shibaoka6, M. Saeki6, H. Ohta6, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database; http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.
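The idea of listing "highly coexpressed genes for every gene" can be sketched with a plain Pearson correlation over expression profiles. Pearson correlation is an assumption chosen for illustration; the report does not state which coexpression measure COXPRESdb uses, and the gene names and expression values below are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def top_coexpressed(expression, query, k=2):
    """Return the k genes most strongly correlated with the query gene."""
    scores = [(gene, pearson(expression[query], profile))
              for gene, profile in expression.items() if gene != query]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:k]

# Hypothetical expression profiles across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
neighbours = top_coexpressed(expression, "geneA")
```

Connecting each gene to its top-ranked neighbours in this way yields a coexpression network of the first type listed above.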

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. On the other hand, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

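Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

Ikura, T., Kinoshita, K., and Ito, N. Structure-function relationship of peptidyl-prolyl isomerases (in Japanese). 蛋白質核酸酵素 54: 167-172, 2009.

Kinoshita, K. Protein function prediction from three-dimensional structures: current status and prospects (in Japanese). 遺伝子医学MOOK, No. 14, in press.

Nakai, K. and Horton, P. Chapter 3-3: Prediction of protein subcellular localization based on amino acid sequences (in Japanese). 実験医学増刊, vol. 26, 1106-1112, 2008.

Nakai, K. Systems biology of proteins (in Japanese). In: タンパク質の事典, edited by 猪飼, 伏見, 卜部, 上野川, 中村, and 浜窪 (朝倉書店), 575-578, 2008.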


Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D. (Health Sciences)
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, LL.M.

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal, and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study on research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We've interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory difference between South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and the unethical collection of human eggs, the bioethics law was revised, and the government now seeks stricter regulation of life science and healthcare. We've found some correlations in political options on stem cell research and genetic testing in terms of regulations in East Asia.

2. Establishment of the Office of Research Ethics (ORE)

Under the Dean's courageous decision, the IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey on past ethical reviews and a comparative study on research ethics review systems in the US, the UK, and South Korea, we checked our current problems, which tend to hold up a smooth research review process, so as to secure quality assurance of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we've conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted our questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate the MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, and the experience of watching the DVD or reading the leaflet about the project overview. We've found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project doesn't have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To acknowledge participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current form of the BioBank newsletters is accessible only to people with good eyesight, we are making efforts toward personalized information security to accommodate participants' disabilities.

4. SciArt Café

Under the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who don't have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in the IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

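2. 洪賢秀. "Sexual protection" of children and measures to prevent sex crimes in Korean society. 比較法研究, No. 70, 2009, in press.

3. 神里彩子, 成澤光 (eds.). 生殖補助医療 生命倫理と法―基本資料集3 (Assisted reproductive technology: bioethics and law, basic materials 3). 信山社, 21-123, 262-308, 2008.

4. 張瓊方. Regulation and practice of assisted reproductive technology in other countries (Taiwan). In: 生殖補助医療 生命倫理と法―基本資料集3, 神里彩子, 成澤光 (eds.), 信山社, 323-334, 2008.

5. 大上泰弘, 神里彩子, 城山英明. Comparative analysis of animal experiment regulation in the UK and the US: implications for the Japanese regulatory system. 社会技術研究論文集, No. 5, 132-142, 2008.

6. 大上泰弘, 成廣孝, 神里彩子, 城山英明, 打越綾子. Attitudes of life science researchers in Japan toward animal experiments: analysis of a questionnaire survey on life science experiments and memorial services for animals. ヒトと動物の関係学会誌, No. 20, 66-73, 2008.

7. 大上泰弘, 神里彩子, 城山英明. The mode of thinking that underpins the regulation of animal experiments in the UK. 科学技術社会論研究, No. 5, 84-92, 2008.

8. 渡部麻衣子, 上田昌文. Science and technology that meet people's needs: analysis of development sites in welfare engineering. 科学技術社会研究, 138-151, 2008.

9. 武藤香織. Responses in Japan and the US to "demedicalizing" predictive genetic testing: from genetic diseases to nutrigenetic testing. In: 日米の医療―制度と倫理, 杉田米行 (ed.), 大阪大学出版会, 203-224, 2008.

10. 武藤香織. Is DNA paternity testing a remedy for "loose" women? In: ジェンダー研究のフロンティア 第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる (ed.), 作品社, 238-264, 2008.

11. 洪賢秀. What is the problem with egg donation for research? Focusing on the Hwang Woo-suk paper fabrication case in Korea. In: ジェンダー研究のフロンティア 第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる (ed.), 作品社, 196-214, 2008.

12. 張瓊方. Reproductive technology and Taiwanese society. In: ジェンダー研究のフロンティア 第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる (ed.), 作品社, 215-222, 2008.

13. 三村恭子, 小門穂, 武藤香織, 張瓊方, 洪賢秀, 柘植あづみ. How a woman-friendly machine is made: the example of the gynecological examination table. In: ジェンダー研究のフロンティア 第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる (ed.), 作品社, 223-240, 2008.

14. 神里彩子. The debate over assisted reproductive technology: retrospect and prospects. In: 『生殖技術と家族』, 家永登 (ed.), 早稲田大学出版部, 42-71, 2008.

15. 渡部麻衣子, 上田昌文 (eds. and trans.). エンハンスメント論争 身体精神の増強と先端科学技術 (The enhancement debate: enhancement of body and mind and advanced science and technology). 社会評論社, 2008.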



Human Genome Center

Laboratory of Functional Analysis In Silico
機能解析インシリコ分野

Professor Kenta Nakai, Ph.D.
Associate Professor Kengo Kinoshita, Ph.D.

The mission of our laboratory is to conduct computational ("in silico") studies on the functional aspects of genome information. Roughly speaking, genome information represents what kinds of proteins/RNAs are synthesized under what conditions. Thus, our study includes the structural analysis of the molecular function of each gene product as well as the analysis of its regulatory information, which will lead us to an understanding of its cellular role as represented by networks of inter-gene interactions.

1. Tissue and developmental stage specificity of trans-splicing in C. intestinalis

Nicolas Sierro, Shuang Li, Yutaka Suzuki1, Riu Yamashita, and Kenta Nakai; 1Graduate School of Frontier Sciences, U. Tokyo

Ciona intestinalis is a useful model organism for analyzing chordate development and genetics. However, unlike vertebrates, it shares a unique mechanism called trans-splicing with lower eukaryotes. Our computational analysis of trans-splicing in C. intestinalis showed that, although the numbers of non-trans-spliced and trans-spliced genes are usually equivalent, the expression ratio between the two groups varies significantly with tissues and developmental stages. Among the seven tissues studied, the observed ratios ranged from 253 in "gonad" to 1953 in "endostyle", and during development they increased from 168 at the "egg" stage to 755 at the "juvenile" stage. We hypothesize that this enrichment in trans-spliced mRNAs in early developmental stages might be related to the abundance of trans-spliced mRNAs in "gonad". To further investigate this phenomenon, we are currently analyzing a larger set of short 5'-EST tags obtained from specific tissues and developmental stages.

2. Improvement of the database of tunicate gene regulation

Nicolas Sierro, Takehiro Kusakabe2, Yutaka Suzuki1, Riu Yamashita, and Kenta Nakai; 2University of Hyogo

The database of tunicate gene regulation, DBTGR, was first released in 2006 as a small database summarizing published information about tunicate promoters and cis-regulatory regions. In 2008, it was extended to include gene expression reporter constructs as well as a new genome browser providing whole genome alignments between Ciona intestinalis and Ciona savignyi. Descriptions of 81 gene expression reporter vectors, as well as sample images of the expression observed with them in Ciona, are now available, and the database provides users with contact information for the owners of these constructs. With the new flexible genome browser built into DBTGR, users now have access to two different genome alignments between C. intestinalis and C. savignyi, obtained with different algorithms. In addition, predicted binding sites for the JASPAR core matrices, as well as regulatory elements and binding sites reported in the literature, are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp/.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding to regulatory regions in the neighborhood of genes. We can assume that genes showing similar expression profiles contain some shared structural patterns in their regulatory regions. Until recently, these patterns were considered only at the level of presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we are focusing our attention not only on the presence of TFBSs, but also on their orientation and their positioning with regard to the transcription start site and between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them to make a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans, and verified that our model is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji², Yutaka Suzuki¹, Takehiro Kusakabe² and Kenta Nakai

While CpG islands are often linked to promoters in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the transcription start sites (TSSs) of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters. They tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental results with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo³, Yutaka Satou³ and Kenta Nakai; ³Kyoto University

The ascidian Ciona intestinalis has been useful as a model system for exploring chordate development. Systematic gene knockdown experiments contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from indicating whether a given regulation is direct or indirect. In this study, we are computationally detecting the direct target genes of each transcription factor by scanning all promoter sequences for its binding sites. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we are predicting the binding sites by examining those orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that lays out the body plan but also the peripheral components that actually make up the building blocks of the body.
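
The scanning step can be sketched as follows, assuming a log-odds position weight matrix is already available. The matrix values, the motif, the gene names and the threshold are toy assumptions, and the ORI-based choice of threshold is not reproduced because its definition is not given here.

```python
# Toy log-odds matrix for a hypothetical 4-bp motif "TGCA" (illustrative values).
TOY_PWM = [
    {"A": -2.0, "C": -2.0, "G": -2.0, "T": 2.0},
    {"A": -2.0, "C": -2.0, "G": 2.0, "T": -2.0},
    {"A": -2.0, "C": 2.0, "G": -2.0, "T": -2.0},
    {"A": 2.0, "C": -2.0, "G": -2.0, "T": -2.0},
]


def scan(sequence, pwm, threshold):
    """Return (offset, score) for every window scoring at or above threshold.

    Assumes the sequence contains only the letters A, C, G and T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def predicted_targets(promoters, pwm, threshold):
    """Names of genes whose promoter contains at least one hit."""
    return sorted(gene for gene, seq in promoters.items()
                  if scan(seq, pwm, threshold))


# Hypothetical promoters: only geneA carries the motif.
promoters = {"geneA": "AATGCATT", "geneB": "AAAAAAAA"}
targets = predicted_targets(promoters, TOY_PWM, threshold=6.0)
```

Lowering the threshold admits weaker matches and enlarges the predicted target set, which is why the choice of threshold matters for the resulting network.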

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith⁴ and Kenta Nakai; ⁴CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then compared the generated matrix, with an added pseudocount value, to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
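
The way a pseudocount enters one PWM column can be written out in a few lines of Python. The sketch follows the description above (the pseudocount is split according to the background composition and added to the observed counts); the function name, the base-2 logarithm and the uniform background in the example are assumptions.

```python
import math


def pwm_column(counts, background, pseudocount=0.8):
    """Log2 likelihood-ratio scores for one PWM position.

    counts: observed base counts at this position, e.g. {"A": 4, "C": 0, ...}
    background: background base frequencies that sum to 1
    pseudocount: total pseudocount for the position; each base receives
                 pseudocount * background[base]
    """
    total = sum(counts.get(base, 0) for base in background) + pseudocount
    column = {}
    for base, q in background.items():
        p = (counts.get(base, 0) + pseudocount * q) / total
        column[base] = math.log2(p / q)
    return column


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
column = pwm_column({"A": 4, "C": 0, "G": 0, "T": 0}, uniform)
```

Without a pseudocount, the bases never observed in this small sample would score minus infinity; with it, they receive a finite penalty and the estimated probabilities still sum to one.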

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹ and Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS; http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters using these data, from which we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for the 1st and 2nd promoters was 28.3%. On the other hand, we observed a 9.6% difference for promoters expressed in the same cell type under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
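
The ranking of promoters by tag count can be sketched as below. The input layout and all gene and promoter names are hypothetical, and the discrepancy measure shown (the fraction of shared multi-promoter genes whose 1st and 2nd promoters are not the same pair in the same order) is only one possible reading of the comparison described above.

```python
from collections import defaultdict


def rank_promoters(tag_counts):
    """Order each gene's promoters by decreasing tag count.

    tag_counts: iterable of (gene, promoter_id, number_of_tags).
    Returns {gene: [1st promoter, 2nd promoter, ...]}.
    """
    by_gene = defaultdict(list)
    for gene, promoter, tags in tag_counts:
        by_gene[gene].append((tags, promoter))
    ranking = {}
    for gene, entries in by_gene.items():
        # Ties are broken by promoter id so that the result is deterministic.
        entries.sort(key=lambda e: (-e[0], e[1]))
        ranking[gene] = [promoter for _, promoter in entries]
    return ranking


def discrepancy(ranking_a, ranking_b):
    """Fraction of shared multi-promoter genes whose top two promoters differ."""
    shared = [g for g in ranking_a
              if g in ranking_b
              and len(ranking_a[g]) > 1 and len(ranking_b[g]) > 1]
    if not shared:
        return 0.0
    differing = sum(1 for g in shared if ranking_a[g][:2] != ranking_b[g][:2])
    return differing / len(shared)


# Hypothetical tag counts from two samples.
sample_x = [("g1", "p1", 100), ("g1", "p2", 40), ("g2", "p1", 10), ("g2", "p2", 30)]
sample_y = [("g1", "p1", 80), ("g1", "p2", 90), ("g2", "p1", 5), ("g2", "p2", 50)]
fraction = discrepancy(rank_promoters(sample_x), rank_promoters(sample_y))
```

In this toy input, g1 swaps its 1st and 2nd promoters between the two samples while g2 does not, so half of the shared genes are discrepant.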

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the sequence-specific positioning of nucleosomes in the genome is important for the analysis of transcription and replication. It has previously been reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common to the 16 eukaryotes examined, but that two primate-specific periodicities (84 bp and 167 bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slots for nucleosome positioning within the Alu element, and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the overall chromatin composition of primate genomes.
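
A periodicity of a dinucleotide along a sequence can be detected with a Fourier sum evaluated at a chosen period, as in the sketch below. The indicator encoding, the normalization and the synthetic test sequence are assumptions for illustration and do not reproduce the analysis pipeline of the study.

```python
import cmath


def dinucleotide_indicator(seq, dinucleotides):
    """1.0 where one of the given dinucleotides starts, 0.0 elsewhere."""
    return [1.0 if seq[i:i + 2] in dinucleotides else 0.0
            for i in range(len(seq) - 1)]


def power_at_period(signal, period):
    """Spectral power of the mean-centered signal at the given period (bp)."""
    n = len(signal)
    mean = sum(signal) / n
    total = sum((x - mean) * cmath.exp(-2j * cmath.pi * i / period)
                for i, x in enumerate(signal))
    return abs(total) ** 2 / n


# Synthetic sequence in which "AA" recurs exactly every 10 bp.
sequence = ("AA" + "CGCGCGCG") * 20
signal = dinucleotide_indicator(sequence, {"AA"})
power_10 = power_at_period(signal, 10)
power_7 = power_at_period(signal, 7)
```

For this synthetic sequence, the power at 10 bp is far larger than at an unrelated period such as 7 bp; on real genome fragments the same calculation would be repeated over a range of periods.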

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has mostly been estimated by reducing the genome of a living cell. However, the "minimal gene set" obtained by this approach may be inaccurate owing to the effects of evolution. We therefore tried to detect the minimal set of cellular functions instead, using KEGG pathway maps to represent cellular functions. The minimal pathway maps were detected as a combination of conserved pathway maps and organism-specific pathway maps. The conserved pathway maps are those containing relatively more orthologous genes among all pathway maps, and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized so as to sustain life from external nutrients alone, as living cells do. The organism-specific pathway maps are therefore detected as those that can synthesize, from nutrients, the compounds required for the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita; ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us clues for predicting the structures of protein complexes. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita; ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately; (ii) implementing clickable maps for all gene networks for step-by-step navigation; (iii) applying the Google Maps API to create a single map for a large network; (iv) including information about protein-protein interactions; (v) identifying conserved patterns of coexpression; and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify the interaction sites of a protein, especially for proteins with multiple binding states or different partners, such as hub proteins. We thus developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works in typical web browsers, so that the different binding modes can be checked visually.
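
The idea of pooling binding sites over all complexes that contain the same protein can be sketched in Python as follows. The input layout, the complex identifiers and the partner names are hypothetical placeholders, not the PiSite data model.

```python
from collections import defaultdict


def merge_binding_sites(complexes):
    """Map each residue of one protein to the partners that contact it.

    complexes: list of (complex_id, partner_name, residue_numbers), one entry
    per complex in which the protein of interest is observed.
    """
    residue_partners = defaultdict(set)
    for _complex_id, partner, residues in complexes:
        for residue in residues:
            residue_partners[residue].add(partner)
    return dict(residue_partners)


def multi_partner_residues(residue_partners):
    """Residues contacted by more than one distinct partner."""
    return sorted(r for r, partners in residue_partners.items()
                  if len(partners) > 1)


# Hypothetical observations of the same protein in two complexes.
observations = [
    ("cplx1", "partnerX", [10, 11]),
    ("cplx2", "partnerY", [11, 40]),
]
merged = merge_binding_sites(observations)
shared = multi_partner_residues(merged)
```

Residues that appear with several partners are the ones a single complex structure would characterize incompletely, which is the case described above for hub proteins.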

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita; ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins in order to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures may contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of the complementarity of hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to volume in larger proteins, we examined the effect on amino acid composition of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, as these hydrophilic residues decreased in occurrence, their buried fraction increased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structure without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important both for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. We thus developed a new prediction method for disordered regions in proteins based on the meta approach, and implemented a web server for this method. The method predicts the disorder tendency of each residue using support vector machines applied to the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita and Nobutoshi Ito⁸; ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for enzymatic activity, we carried out a similarity search of the atomic configuration of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, a ribonuclease that does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶ and K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as the targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database; http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene; (ii) genes with the same GO annotation; (iii) genes expressed in the same tissue; and (iv) user-defined gene sets. When the GO or tissue networks became too big for a static picture on the web, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.
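
A list of highly coexpressed genes for a query gene, network type (i) above, can be sketched with a plain Pearson correlation over expression profiles. The coexpression measure actually used by the database is not stated here, so Pearson correlation is an assumption, and the gene names and expression values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpressed_genes(profiles, query, top=3):
    """Genes ranked by decreasing correlation with the query gene."""
    scores = [(pearson(profiles[query], profile), gene)
              for gene, profile in profiles.items() if gene != query]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [gene for _, gene in scores[:top]]


# Hypothetical expression profiles over four samples.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, 3.0, 2.0, 4.0],
}
neighbours = coexpressed_genes(profiles, "g1", top=2)
```

Connecting each gene to its top-ranked neighbours yields a coexpression network of the kind that the database visualizes.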

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental to both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K. and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R. and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K. and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T. and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K. and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y. and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T. and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M. and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H. and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H. and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M. and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R. and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T. and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H. and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H. and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T. and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S. and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S. and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H. and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉, 木下賢吾, 伊藤暢聡. ペプチジルプロリルイソメラーゼの構造機能相関. 蛋白質核酸酵素 54: 167-172, 2009.

木下賢吾. 立体構造からのタンパク質機能予測 現状と展望. 遺伝子医学MOOK 14号, in press.

中井謙太, ポール・ホートン. 第3章 3. アミノ酸配列に基づくタンパク質の細胞内局在予測. 実験医学増刊 vol. 26: 1106-1112, 2008.

中井謙太. タンパク質のシステム生物学. 猪飼, 伏見, 卜部, 上野川, 中村, 浜窪 編, タンパク質の事典, 朝倉書店, 575-578, 2008.

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

The Department of Public Policy pursues three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas concerning the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. In South Korea, after the fabrication in Hwang Woo-suk's stem cell cloning research and the unethical collection of human eggs, the bioethics law was revised and the government has sought stricter regulation of life science and healthcare. We found some correlations among East Asian countries between political options on stem cell research and on genetic testing in terms of regulation.

2. Establishment of the Office of Research Ethics (ORE)

Following the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After surveying past ethical reviews and conducting a comparative study of research ethics review systems in the US, the UK and South Korea, we identified current problems that tend to stall a smooth review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics for all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project, led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses and pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications suggest that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about constructing a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate the MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten why they donated their DNA and serum, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process were not the elderly. Furthermore, MCs explain that this project has no plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even when participants forget that they gave consent for research, MCs explain, encourage and thank them at each visit, and participants recall their willingness to contribute.

To show appreciation for the contributions of participants and MCs to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities for communicating with participants. Because the current BioBank newsletters are accessible only to people with good eyesight, we are making efforts toward personalized provision of information that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim at sharing public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T. and Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2. 洪賢秀. 韓国社会における子どもの「性保護」と性犯罪防止対策. 比較法研究 70号, 2009 (印刷中).

3. 神里彩子, 成澤光 編著. 生殖補助医療 生命倫理と法―基本資料集3. 信山社, 21-123, 262-308, 2008.

4. 張瓊方. 諸外国における生殖補助医療の規制状況と実施状況(台湾). 生殖補助医療 生命倫理と法―基本資料集3, 神里彩子, 成澤光 編, 信山社, 323-334, 2008.

5. 大上泰弘, 神里彩子, 城山英明. イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆. 社会技術研究論文集 5号, 132-142, 2008.

6. 大上泰弘, 成廣孝, 神里彩子, 城山英明, 打越綾子. 日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析. ヒトと動物の関係学会誌 20号, 66-73, 2008.

7. 大上泰弘, 神里彩子, 城山英明. イギリスにおける動物の実験規制を支えている思考様式. 科学技術社会論研究 5号, 84-92, 2008.

8. 渡部麻衣子, 上田昌文. 人の必要を充足する科学技術 福祉工学における開発現場の分析. 科学技術社会研究, 138-151, 2008.

9. 武藤香織. 「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―. 日米の医療―制度と倫理, 杉田米行 編, 大阪大学出版会, 203-224, 2008.

10. 武藤香織. DNA親子鑑定は「ふしだらな」女性にとっての救済策か. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 238-264, 2008.

11. 洪賢秀. 研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 196-214, 2008.

12. 張瓊方. 生殖技術と台湾社会. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 215-222, 2008.

13. 三村恭子, 小門穂, 武藤香織, 張瓊方, 洪賢秀, 柘植あづみ. 女性にやさしい機械のつくられ方―内診台を例にして. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 223-240, 2008.

14. 神里彩子. 生殖補助医療をめぐる議論―その回顧と展望―. 家永登 編『生殖技術と家族』, 早稲田大学出版部, 42-71, 2008.

15. 渡部麻衣子, 上田昌文 編訳. エンハンスメント論争 身体精神の増強と先端科学技術. 社会評論社, 2008.

154

Page 31: Human Genome Center Laboratory of Genome Database … · 2020-06-02 · Cluster) database. We built a system that per-forms automatic update of the ortholog cluster, which can be

elements and binding sites reported in the literature are also directly available. DBTGR is accessible at http://dbtgr.hgc.jp.

3. Promoter architecture analysis and prediction of expression

Alexis Vandenbon and Kenta Nakai

Regulation of transcription is implemented through transcription factors (TFs) binding regulatory regions in the neighborhood of genes. We can make the assumption that genes showing similar expression profiles contain some shared structural patterns in their regulatory regions. Until recently, these patterns were considered only on the level of presence or absence of specific transcription factor binding sites (TFBSs), but there is growing evidence that additional structural patterns exist. Here, we are focusing our attention not only on the presence of TFBSs, but also on their orientation and positioning with regard to the transcription start site, and also between pairs of TFBSs. We developed an approach for extracting such structural motifs from promoter sequences and subsequently combining them to make a promoter structure model. We applied our model to a dataset of promoter sequences of muscle-specific genes of Caenorhabditis elegans and verified that our model is capable of distinguishing muscle-expressed genes from genes not expressed in muscle tissues, based on the structure of their regulatory regions. We are further developing our model, and runs on Mus musculus datasets indicate that the approach is applicable in mammals too.

4. Characterization and definition of promoter-associated CpG islands in ascidian genomes

Kohji Okamura, Riu Yamashita, Koki Nishitsuji2, Yutaka Suzuki1, Takehiro Kusakabe2, and Kenta Nakai

While CpG islands are often linked to a promoter in mammals, their existence in invertebrates is unclear. Since there is a striking difference in DNA methylation pattern between vertebrates and invertebrates, which show global and fractional methylation, respectively, the function of methylation per se in the latter group is also elusive. To address these questions, we determined the TSSs of ascidian genes by combining the oligo-capping method with massive-scale cDNA sequencing. As a result, we found characteristic features of ascidian promoters: they tend to be G+C- and CpG-rich, but over a narrower range around the TSSs. Furthermore, almost all promoters fall into the same category, whereas vertebrate promoters are divided into two classes in terms of CpG content. Comparison of the experimental result with the genome of another ascidian species also supported our finding, leading to the first definition of promoter-associated CpG islands in invertebrate organisms.

5. Computational verifications of gene regulatory networks in ascidian early development

Xuyang Yuan, Atsushi Kubo3, Yutaka Satou3, and Kenta Nakai: 3Kyoto University

The ascidian Ciona intestinalis has been useful as a model system to explore chordate development. Systematic gene knockdown experiments have contributed greatly to the depiction of the gene regulatory network governing ascidian early development. However, limitations of the experiments themselves prevent this blueprint from showing whether a regulation is direct or indirect. In this study, we are computationally detecting direct target genes of each transcription factor by scanning all promoter sequences for its binding site. To represent the sequence specificity of transcription factors, we used position weight matrices, for which threshold values must be set. We maximized an over-representation index (ORI) value to find the optimum threshold. For trans-acting factors whose binding sites are unknown but which have orthologues with known binding sites, we predict the binding sites by examining the orthologues. The regulatory network of the C. intestinalis transcription factor ZicL is consistent with the data of a newly produced ChIP-chip experiment. Using our method together with ChIP-chip data, we further expanded the original network to cover all 16,000 C. intestinalis genes, so that it includes not only the kernel components of the regulatory network that establish the body plan but also the peripheral components that actually make up the building blocks of the body.
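The scanning step described above can be illustrated with a minimal sketch. The count matrix, background frequencies, promoter sequence, and threshold below are all hypothetical values chosen for illustration, and the threshold is simply fixed by hand; the ORI-based threshold optimization used in the study is not reproduced here because its definition is not given in this report.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict per motif position).
COUNTS = [
    {"A": 8, "C": 1, "G": 0, "T": 1},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 8, "G": 1, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
# Assumed uniform background base composition.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def counts_to_pwm(counts, background, pseudocount=1.0):
    """Convert base counts into log2 likelihood ratios against the background."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + pseudocount
        pwm.append({
            base: math.log2(
                (column[base] + pseudocount * background[base]) / total
                / background[base]
            )
            for base in "ACGT"
        })
    return pwm


def scan_promoter(sequence, pwm, threshold):
    """Return (start, site, score) for every window scoring at or above threshold.

    Assumes the sequence contains only the letters A, C, G, and T.
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


pwm = counts_to_pwm(COUNTS, BACKGROUND)
hits = scan_promoter("TTAGCTCCAGCTGG", pwm, threshold=4.0)
for start, site, score in hits:
    print(start, site, round(score, 2))
```

In a genome-wide setting the same scan would be run over every promoter and both strands, and the number of predicted targets would depend strongly on the threshold, which is why a principled way of choosing it matters.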

6. Pseudocounts for transcription factor binding sites

Keishin Nishida, Martin Frith4, and Kenta Nakai: 4CBRC, AIST

To represent the sequence specificity of transcription factors, the position weight matrix (PWM) is widely used. In most cases, each element is defined as a log likelihood ratio of a base appearing at a certain position, which is estimated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated to each position, and a fraction of it, according to the background base composition, is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results differed somewhat between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical use.
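As an illustration of how a pseudocount enters a PWM, the following minimal sketch builds log likelihood ratios from a handful of aligned sites. The four example sites and the uniform background are hypothetical; only the idea of distributing the pseudocount over the bases according to the background composition, and the suggested value of 0.8, come from the text.

```python
import math


def pwm_from_sites(sites, background, pseudocount=0.8):
    """Build a log2 likelihood-ratio PWM from aligned binding sites.

    The pseudocount is split across the four bases in proportion to the
    background composition, so each column still sums to a probability of 1.
    """
    width = len(sites[0])
    n_sites = len(sites)
    pwm = []
    for i in range(width):
        column = {}
        for base in "ACGT":
            count = sum(1 for site in sites if site[i] == base)
            freq = (count + pseudocount * background[base]) / (n_sites + pseudocount)
            column[base] = math.log2(freq / background[base])
        pwm.append(column)
    return pwm


# Hypothetical aligned binding sites and an assumed uniform background.
sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

pwm = pwm_from_sites(sites, background)
for position, column in enumerate(pwm):
    print(position, {base: round(value, 2) for base, value in column.items()})
```

Without a pseudocount, any base that was never observed at a position would have frequency zero and an undefined log ratio, which is the small-sample bias the pseudocount is meant to soften.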

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki1, Hiroyuki Wakaguri1, Sumio Sugano1, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, the DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we analyzed alternative promoters with these data. From these data, we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former (genic) promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the 2nd largest number of tags as the '2nd promoter'. Between different cell types, the average percentage of the discrepancy for 1st and 2nd promoters was 283. On the other hand, we observed 96 of difference for promoters expressed in the same cell types with different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
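The ranking of promoters by tag count can be sketched as follows. The gene names, promoter identifiers, and tag counts are hypothetical, and the `discrepancy` function is only one plausible reading of the comparison between samples (the fraction of multi-promoter genes whose 1st and 2nd promoters are not the same in both samples); the exact measure used in the study is not specified in this report.

```python
def rank_promoters(tag_counts):
    """Return (1st, 2nd) promoter IDs by descending tag count.

    Ties are broken by promoter ID; the 2nd promoter is None when a gene
    has only one promoter.
    """
    ranked = sorted(tag_counts, key=lambda promoter: (-tag_counts[promoter], promoter))
    first = ranked[0]
    second = ranked[1] if len(ranked) > 1 else None
    return first, second


def discrepancy(sample_a, sample_b):
    """Fraction of shared multi-promoter genes whose (1st, 2nd) pair differs."""
    shared = [
        gene for gene in sample_a
        if gene in sample_b and len(sample_a[gene]) > 1 and len(sample_b[gene]) > 1
    ]
    if not shared:
        return 0.0
    differing = sum(
        rank_promoters(sample_a[gene]) != rank_promoters(sample_b[gene])
        for gene in shared
    )
    return differing / len(shared)


# Hypothetical tag counts per promoter, per gene, in two samples.
cell_a = {"geneX": {"p1": 120, "p2": 30}, "geneY": {"p1": 10, "p2": 55}, "geneZ": {"p1": 7}}
cell_b = {"geneX": {"p1": 90, "p2": 40}, "geneY": {"p1": 60, "p2": 20}, "geneZ": {"p1": 12}}

print(rank_promoters(cell_a["geneY"]))
print(discrepancy(cell_a, cell_b))
```

As a side check on the reported counts, 36,251 + 39,667 = 75,918 promoters and 5,428 + 8,879 = 14,307 genes, so the totals in the text are internally consistent.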

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita, and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analysis of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common among 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element, and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
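To make the idea of Fourier analysis of dinucleotide periodicity concrete, here is a minimal sketch on a synthetic sequence. The sequence, the choice of the "AA" dinucleotide, and the range of candidate periods are assumptions made for illustration; the actual analysis pipeline and genome fragments used in the study are not described in this report.

```python
import cmath


def dinucleotide_indicator(sequence, dinucleotide):
    """Return 1.0 at each position where the dinucleotide starts, else 0.0."""
    return [
        1.0 if sequence[i:i + 2] == dinucleotide else 0.0
        for i in range(len(sequence) - 1)
    ]


def power_at_period(signal, period):
    """Power of the mean-centred signal at the frequency 1/period."""
    mean = sum(signal) / len(signal)
    total = sum(
        (value - mean) * cmath.exp(-2j * cmath.pi * position / period)
        for position, value in enumerate(signal)
    )
    return abs(total) ** 2 / len(signal)


# Synthetic sequence in which "AA" recurs exactly every 10 bp.
sequence = "AACGTCGTCG" * 30
signal = dinucleotide_indicator(sequence, "AA")

# Candidate periods start at 7 bp because a strictly periodic spike train
# also carries power at its harmonics (here, 5 bp).
powers = {period: power_at_period(signal, period) for period in range(7, 21)}
best_period = max(powers, key=powers.get)
print(best_period)
```

On real genomic fragments the peaks are far less sharp than in this toy case, so the spectrum is usually averaged over many fragments before periodicities are compared between species.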

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell, containing only necessary and sufficient components, has been estimated mostly by reducing the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular functions instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes among all pathway maps, and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells are. The organism-specific pathway maps are then detected as those that can synthesize, from nutrients, the compounds required for the conserved pathway maps. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor

M. Maeda5 and K. Kinoshita: 5National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, namely assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indices enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indices could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi6, M. Saeki6, H. Ohta6, K. Kinoshita: 6Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida, and K. Kinoshita

The vast accumulation of protein structural data has now made it possible to observe many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level, considering multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura7, and K. Kinoshita: 7Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of the complementarity of hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida, and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affects the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased along with the reduction in their occurrence. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately, as protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important both for obtaining high-resolution structural information and for understanding the functional aspects of these proteins. We thus developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines applied to the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura8, Kengo Kinoshita, Nobutoshi Ito8: 8Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configuration of the PPIase active site against known protein structures, and found that alpha amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that catalyzes ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi6, M. Shibaoka6, M. Saeki6, H. Ohta6, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing the human and mouse coexpression patterns to estimate the conservation between the two species.
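A list of "highly coexpressed genes for every gene" can be illustrated with a minimal sketch. Pearson correlation is used here only as a common, assumed measure of coexpression; this report does not state which measure COXPRESdb actually uses, and the gene names and expression profiles below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (norm_x * norm_y)


def top_coexpressed(expression, query, k=2):
    """Return the k genes most positively correlated with the query gene."""
    scores = [
        (pearson(expression[query], profile), gene)
        for gene, profile in expression.items()
        if gene != query
    ]
    scores.sort(reverse=True)
    return [(gene, r) for r, gene in scores[:k]]


# Hypothetical expression profiles across five samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 5.0, 4.0],
}

top = top_coexpressed(expression, "geneA")
for gene, r in top:
    print(gene, round(r, 3))
```

Repeating this for every gene as the query yields one ranked list per gene, which can then be drawn as a network by linking each gene to its top-ranked partners.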

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida, and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, Edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉, 木下賢吾, 伊藤暢聡. ペプチジルプロリルイソメラーゼの構造機能相関. 蛋白質核酸酵素, 54: 167―172, 2009.

木下賢吾. 立体構造からのタンパク質機能予測 現状と展望. 遺伝子医学MOOK, 14号, in press.

中井謙太, ポール ホートン. 第3章 3 アミノ酸配列に基づくタンパク質の細胞内局在予測. 実験医学増刊, vol. 26: 1106―1112, 2008.

中井謙太. タンパク質のシステム生物学. 猪飼, 伏見, 卜部, 上野川, 中村, 浜窪 編, タンパク質の事典, 朝倉書店, 575―578, 2008.


Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal, and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of the outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine the broader social and cultural agendas surrounding the industrialization of stem cell research and genetic testing. We have interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulatory approach between South Korea and Japan. After Hwang Woo-suk's fabrication of stem cell cloning results and unethical human egg collection, the bioethics law was revised, and the government is seeking stricter regulation of life science and healthcare. We have found some correlations between political options on stem cell research and on genetic testing in terms of regulations across East Asia.

2. Establishment of Office of Research Ethics (ORE)

Following the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK, and South Korea, we identified current problems that tend to stall the research review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review of the research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal, and social support for the "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate MCs' attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We have found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain to, encourage, and thank participants each time, and participants recall their will to contribute.

To show our appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current BioBank newsletters are accessible only to people with good eyesight, we are making efforts toward personalized information security that accommodates participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim at sharing public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent with "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2. 洪賢秀. 韓国社会における子どもの「性保護」と性犯罪防止対策. 比較法研究, 70号, 2009, 印刷中.

3. 神里彩子, 成澤光 編著. 生殖補助医療 生命倫理と法―基本資料集3. 信山社, 21―123, 262―308, 2008.

4. 張瓊方. 諸外国における生殖補助医療の規制状況と実施状況(台湾). 生殖補助医療 生命倫理と法―基本資料集3, 神里彩子, 成澤光 編, 信山社, 323―334, 2008.

5. 大上泰弘, 神里彩子, 城山英明. イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆. 社会技術研究論文集, 5号, 132―142, 2008.

6. 大上泰弘, 成廣孝, 神里彩子, 城山英明, 打越綾子. 日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析. ヒトと動物の関係学会誌, 20号, 66―73, 2008.

7. 大上泰弘, 神里彩子, 城山英明. イギリスにおける動物の実験規制を支えている思考様式. 科学技術社会論研究, 5号, 84―92, 2008.

8. 渡部麻衣子, 上田昌文. 人の必要を充足する科学技術 福祉工学における開発現場の分析. 科学技術社会研究, 138―151, 2008.

9. 武藤香織. 「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―. 日米の医療―制度と倫理, 杉田米行 編, 大阪大学出版会, 203―224, 2008.

10. 武藤香織. DNA親子鑑定は「ふしだらな」女性にとっての救済策か. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 238―264, 2008.

11. 洪賢秀. 研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 196―214, 2008.

12. 張瓊方. 生殖技術と台湾社会. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 215―222, 2008.

13. 三村恭子, 小門穂, 武藤香織, 張瓊方, 洪賢秀, 柘植あづみ. 女性にやさしい機械のつくられ方―内診台を例にして. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 223―240, 2008.

14. 神里彩子. 生殖補助医療をめぐる議論―その回顧と展望―. 家永登 編『生殖技術と家族』, 早稲田大学出版部, 42―71, 2008.

15. 渡部麻衣子, 上田昌文 編訳. エンハンスメント論争 身体精神の増強と先端科学技術. 社会評論社, 2008.


timated from a finite number of known binding sites. To avoid bias due to this small sample size, a certain numeric value, called a pseudocount, is usually allocated for each position, and its fraction according to the background base composition is added to each element. So far, there has been no consensus on the optimal pseudocount value. In this study, we simulated the sampling process by artificially generating binding sites based on observed nucleotide frequencies in a public PWM database, and then the generated matrix with an added pseudocount value was compared to the original frequency matrix using various measures. Although the results were somewhat different between measures, in many cases we could find an optimal pseudocount value for each matrix. These optimal values are independent of the sample size and are clearly anti-correlated with the information content of the original matrices, meaning that larger pseudocount values are preferable for less conserved binding sites. As a simple representative, we suggest the value of 0.8 for practical uses.
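The pseudocount correction described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function name, the uniform default background and the toy counts are assumptions made here, and only the default value of 0.8 is taken from the text.

```python
BASES = "ACGT"

def frequency_matrix(counts, pseudocount=0.8, background=None):
    """Convert per-position base counts into frequencies with a pseudocount.

    counts: list of dicts mapping base -> observed count at that position.
    The pseudocount is split among the bases according to `background`
    (assumed uniform here when none is given).
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES)
        matrix.append({
            b: (column.get(b, 0) + pseudocount * background[b]) / (total + pseudocount)
            for b in BASES
        })
    return matrix

# Toy example: four known binding sites, two positions.
toy_counts = [
    {"A": 4, "C": 0, "G": 0, "T": 0},  # perfectly conserved position
    {"A": 1, "C": 1, "G": 1, "T": 1},  # unconserved position
]
pwm = frequency_matrix(toy_counts)
# pwm[0]["A"] is (4 + 0.8 * 0.25) / (4 + 0.8) = 0.875; no base has zero frequency.
```

With only four sites, an unobserved base would otherwise get a frequency of exactly zero; the pseudocount keeps every element positive while each column still sums to one.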

7. Definition and analysis of alternative promoters using a huge amount of TSS information

Riu Yamashita, Yutaka Suzuki¹, Hiroyuki Wakaguri¹, Sumio Sugano¹, Kenta Nakai

In order to support transcriptional studies, we have constructed a database, DataBase of Transcriptional Start Sites (DBTSS, http://dbtss.hgc.jp/), which includes a number of 5'-end sequences produced by the oligo-capping method. Recently, we have added 2965 million tags from eight kinds of cells (15 kinds of experimental conditions) using a SOLEXA sequencer. Here, we performed an analysis of alternative promoters with these data. From these data, we obtained 75,918 promoters. These promoters could be classified into 36,251 in gene regions and 39,667 in intergenic regions. The former, intragenic promoters corresponded to 14,307 genes, of which 5,428 have one promoter and 8,879 have more than one promoter. For each gene, we defined the promoter with the largest number of tags as the '1st promoter' and the promoter with the second largest number as the '2nd promoter'. Between different cell types, the average percentage of discrepancy for the 1st and 2nd promoters was 28.3%. On the other hand, we observed a difference of 9.6% for promoters expressed in the same cell types under different conditions. These results indicate that the expression ratio of promoters is conserved among cells. We also observed that 2nd promoters preferentially occur in regions downstream of 1st promoters.
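The definition of the 1st and 2nd promoters by tag count can be illustrated with a short Python sketch. The data layout, the function names and the way "discrepancy" is counted here (a gene counts as discrepant when its 1st/2nd assignment differs between two samples) are assumptions for illustration; the text does not specify how the percentage was computed.

```python
def rank_promoters(tag_counts):
    """Return (1st promoter, 2nd promoter) for one gene in one sample.

    tag_counts: dict mapping promoter id -> number of TSS tags.
    Ties are broken by promoter id so the result is deterministic.
    """
    ranked = sorted(tag_counts, key=lambda p: (-tag_counts[p], p))
    first = ranked[0] if ranked else None
    second = ranked[1] if len(ranked) > 1 else None
    return first, second

def discrepancy_rate(sample_a, sample_b):
    """Fraction of shared genes whose (1st, 2nd) assignment differs."""
    shared = [gene for gene in sample_a if gene in sample_b]
    if not shared:
        return 0.0
    differing = sum(
        1 for gene in shared
        if rank_promoters(sample_a[gene]) != rank_promoters(sample_b[gene])
    )
    return differing / len(shared)

# Hypothetical tag counts for two genes in two cell types.
cell_a = {"geneX": {"p1": 120, "p2": 30}, "geneY": {"p1": 10, "p2": 50}}
cell_b = {"geneX": {"p1": 80, "p2": 95}, "geneY": {"p1": 5, "p2": 40}}
rate = discrepancy_rate(cell_a, cell_b)  # geneX switches, geneY does not
```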

8. Effects of Alu elements on global nucleosome positioning in the human genome

Yoshiaki Tanaka, Riu Yamashita and Kenta Nakai

Because chromatin can limit the accessibility of regulatory sites, understanding the genome sequence-specific positioning of nucleosomes is important for the analyses of transcription and replication. It has been previously reported that 10-bp dinucleotide periodicities are strongly associated with nucleosome positioning, but it is unknown whether these features can affect in vivo nucleosome locations throughout the whole genomes of all eukaryotes. Fourier analysis of genome fragments indicates that these periodicities are not common in 16 eukaryotes, but two primate-specific periodicities (84-bp and 167-bp) are observed. The 167-bp period is similar to the sum of the lengths of a nucleosome unit and its linker region. After masking Alu elements, these periodicities were greatly diminished. Therefore, we next analyzed the distribution of nucleosomes in the vicinity of Alu elements. Using two independent large-scale sets of recently published nucleosome mapping data, we found that (1) there are one or two fixed slot(s) for nucleosome positioning within the Alu element and (2) the positioning of neighboring nucleosomes seems to be more or less in phase with the presence of Alu elements. Our study provides an important clue to understanding the whole chromatin composition of the primate genomes.
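As a rough illustration of how a Fourier analysis can reveal a dinucleotide periodicity, the sketch below converts a sequence into a 0/1 indicator signal for one dinucleotide and evaluates the spectral power at a chosen period. The function names, the choice of "AA" and the synthetic test sequence are assumptions made here; the study's actual windowing and normalization are not described in the text.

```python
import cmath
import math

def dinucleotide_signal(sequence, dinucleotide="AA"):
    """1.0 where the dinucleotide starts at position i, else 0.0."""
    seq = sequence.upper()
    k = len(dinucleotide)
    return [1.0 if seq[i:i + k] == dinucleotide else 0.0
            for i in range(len(seq) - k + 1)]

def power_at_period(signal, period):
    """Power of the mean-centered signal at the given period (in bp)."""
    n = len(signal)
    if n == 0:
        return 0.0
    mean = sum(signal) / n
    total = sum((x - mean) * cmath.exp(-2j * math.pi * i / period)
                for i, x in enumerate(signal))
    return abs(total) ** 2 / n

# Synthetic fragment in which "AA" recurs exactly every 10 bp.
fragment = "AACGTCGTCG" * 30
signal = dinucleotide_signal(fragment)
p10 = power_at_period(signal, 10)   # strong peak at the true period
p7 = power_at_period(signal, 7)     # close to zero at an unrelated period
```

Scanning `power_at_period` over a range of periods and comparing the profile before and after masking a repeat family is one simple way to test whether that family is responsible for a peak.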

9. Estimation and comparison of minimal cellular function sets for bacteria and eukaryotes

Yusuke Azuma and Kenta Nakai

A minimal cell containing only necessary and sufficient components has been estimated mostly by the reduction of the genome of a living cell. But the "minimal gene set" obtained by this approach may be inaccurate due to the effect of evolution. Thus, we tried to detect the minimal cellular function instead. As cellular functions, we used KEGG pathway maps. The minimal pathway maps were detected as a combination of the conserved pathway maps and the organism-specific pathway maps. The conserved pathway maps are those containing more orthologous genes in all pathway maps and are estimated by homology searches. They should be close to the minimal pathways, but it is not certain whether they are organized to sustain life from only external nutrients, as living cells do. Then, the organism-specific pathway maps are detected as those that can synthesize compounds required for the conserved pathway maps from nutrients. The minimal pathway maps detected for bacteria agree well with the experimentally determined essential genes. Most of the catabolic pathways were selected as organism-specific pathways rather than conserved ones, suggesting that they are adapted to each environment. The minimal pathway maps of eukaryotes contain more pathway maps for DNA repair than those of bacteria. In addition, there are more links in the pathways of eukaryotes. Thus, it is likely that eukaryotes need to be more stable genetically.

10. Development of new indices to evaluate protein-protein interfaces: assembling space volume, assembling space distance, and global shape descriptor

M. Maeda⁵ and K. Kinoshita; ⁵National Institute of Agrobiological Sciences

Protein-protein interaction is an initial step in realizing complex biological functions; therefore, understanding protein-protein interfaces will give us a clue for predicting protein complex structures. For this purpose, efficient descriptors of the interface and database analyses are important. In this study, we developed three new descriptors of protein-protein interfaces, that is, assembling space volume, assembling space distance, and global shape descriptor, by using the Delaunay tessellation technique. The first two indexes enable us to evaluate how well the protein interfaces are built up, and the third descriptor quantifies the complexity of the protein-protein interfaces. In a systematic comparison with some existing descriptors, our indexes could elucidate different aspects of the protein interfaces.

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita; ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks, with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify a protein's interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita; ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting crystal structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to be accelerated compared with the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately where protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important to obtain high-resolution structural information and to understand the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines applied to the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸; ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, an enzyme that has ribonuclease activity but no PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.
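The first network type, a list of highly coexpressed genes for a query gene, can be sketched in Python as below. Pearson correlation is used here only as a familiar stand-in; the text does not state which coexpression measure COXPRESdb applies, and the gene names and expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no coexpression signal
    return cov / math.sqrt(var_x * var_y)

def top_coexpressed(expression, query, k=3):
    """Return the k genes most strongly correlated with the query gene."""
    scores = [(gene, pearson(expression[query], profile))
              for gene, profile in expression.items() if gene != query]
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores[:k]

# Hypothetical expression values across four arrays.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
neighbours = top_coexpressed(expression, "geneA", k=2)
```

Connecting each gene to its top-ranked neighbours in this way yields a coexpression network that can then be drawn or compared between species.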

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity in the membrane, such as lipid rafts and transportsome regions. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. On the other hand, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics, 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res., 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet., 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics, 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res., 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel., 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics, 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod., 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res., 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics, 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res., 37: 939-944, 2009; published online on December 23, 2008.

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res., 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res., 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol., 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res., 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene, 430: 44-49, 2009; available online on October 21, 2008.

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci., 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem., 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res., 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res., 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res., 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

Ikura, T., Kinoshita, K., and Ito, N. Structure-function relationship of peptidyl-prolyl isomerases. Tanpakushitsu Kakusan Koso (Protein, Nucleic Acid and Enzyme), 54: 167-172, 2009. (in Japanese)

Kinoshita, K. Protein function prediction from three-dimensional structures: current status and prospects. Idenshi Igaku MOOK, No. 14, in press. (in Japanese)

Nakai, K. and Horton, P. Chapter 3-3: Prediction of protein subcellular localization based on amino acid sequences. Jikken Igaku (Experimental Medicine), extra issue, vol. 26: 1106-1112, 2008. (in Japanese)

Nakai, K. Systems biology of proteins. In: Tanpakushitsu no Jiten (Encyclopedia of Proteins), edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura and Hamakubo, Asakura Shoten, pp. 575-578, 2008. (in Japanese)

Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D.
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato

准教授 保健学博士 武藤香織
特任助教 学術博士 洪賢秀
特任助教 法学修士 神里彩子

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulation between South Korea and Japan. After the fabrication of Hwang Woo-suk's stem cell cloning results and the unethical collection of human eggs, the bioethics law has been revised and the government seeks stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing, in terms of regulations, across East Asia.

2. Establishment of Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey on past ethical reviews and a comparative study on research ethics review systems in the US, the UK and South Korea, we identified current problems that tend to stall a smooth review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal and social support for the "BioBank Japan" project

To support the "BioBank Japan" project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we have conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine", we should think further about the construction of a "personalized consent process", and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We have found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are just satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain, encourage and appreciate participants each time, and participants recall their will to contribute.

To appreciate participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to sighted people with good eyesight, we are making efforts toward personalized information security to accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim for the sharing of public needs through interactive communication between researchers and the public. As one of such outreach activities, we held our original science café series, called "SciArt Café", twice in 2008. Our original intent for "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network", was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN) and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art", was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

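1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics, 146A (13): 696-706, 2008.

2. Hong, H. "Sexual protection" of children and measures to prevent sex crimes in Korean society. Hikakuho Kenkyu (Comparative Law Journal), No. 70, 2009, in press. (in Japanese)

3. Kamisato, A. and Narisawa, H. (eds. and authors). Assisted Reproductive Technology: Bioethics and Law, Basic Materials 3. Shinzansha, pp. 21-123, 262-308, 2008. (in Japanese)

4. Chang, C. Regulation and practice of assisted reproductive technology in other countries (Taiwan). In: Assisted Reproductive Technology: Bioethics and Law, Basic Materials 3, edited by Kamisato, A. and Narisawa, H., Shinzansha, pp. 323-334, 2008. (in Japanese)

5. Oue, Y., Kamisato, A., Shiroyama, H. Comparative analysis of animal experiment regulations in the UK and the US: implications for the Japanese regulatory system. Shakai Gijutsu Kenkyu Ronbunshu (Sociotechnica), No. 5: 132-142, 2008. (in Japanese)

6. Oue, Y., Narihiro, T., Kamisato, A., Shiroyama, H., Uchikoshi, A. Attitudes of life science researchers in Japan toward animal experiments: analysis of a questionnaire survey on life science experiments and memorial services for animals. Hito to Dobutsu no Kankei Gakkaishi (Journal of the Society for the Study of Human Animal Relations), No. 20: 66-73, 2008. (in Japanese)

7. Oue, Y., Kamisato, A., Shiroyama, H. The way of thinking that underlies the regulation of animal experiments in the UK. Kagaku Gijutsu Shakairon Kenkyu (Journal of Science and Technology Studies), No. 5: 84-92, 2008. (in Japanese)

8. Watanabe, M. and Ueda, A. Science and technology that meet people's needs: an analysis of development sites in welfare engineering. Kagaku Gijutsu Shakai Kenkyu: 138-151, 2008. (in Japanese)

9. Muto, K. Responses in Japan and the US to "demedicalized" predictive genetic testing: from genetic diseases to nutrigenetic testing. In: Health Care in Japan and the US: Institutions and Ethics, edited by Sugita, Y., Osaka University Press, pp. 203-224, 2008. (in Japanese)

10. Muto, K. Is DNA parentage testing a remedy for "promiscuous" women? In: Frontiers of Gender Studies, Vol. 4, Techno/Bio-Politics: Science, Medicine and Technology Today, edited by Tachi, K., Sakuhinsha, pp. 238-264, 2008. (in Japanese)

11. Hong, H. What is the problem with egg donation for research? Focusing on the Hwang Woo-suk paper fabrication case in South Korea. In: Frontiers of Gender Studies, Vol. 4, Techno/Bio-Politics: Science, Medicine and Technology Today, edited by Tachi, K., Sakuhinsha, pp. 196-214, 2008. (in Japanese)

12. Chang, C. Reproductive technology and Taiwanese society. In: Frontiers of Gender Studies, Vol. 4, Techno/Bio-Politics: Science, Medicine and Technology Today, edited by Tachi, K., Sakuhinsha, pp. 215-222, 2008. (in Japanese)

13. Mimura, K., Kokado, M., Muto, K., Chang, C., Hong, H., Tsuge, A. How women-friendly machines are made: the case of the gynecological examination table. In: Frontiers of Gender Studies, Vol. 4, Techno/Bio-Politics: Science, Medicine and Technology Today, edited by Tachi, K., Sakuhinsha, pp. 223-240, 2008. (in Japanese)

14. Kamisato, A. Debates on assisted reproductive technology: retrospect and prospect. In: Reproductive Technology and the Family, edited by Ienaga, N., Waseda University Press, pp. 42-71, 2008. (in Japanese)

15. Watanabe, M. and Ueda, A. (eds. and trans.). The Enhancement Debate: Augmentation of Body and Mind and Advanced Science and Technology. Shakai Hyoronsha, 2008. (in Japanese)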

Page 33: Human Genome Center Laboratory of Genome Database … · 2020-06-02 · Cluster) database. We built a system that per-forms automatic update of the ortholog cluster, which can be

tain life from only external nutrients like livingcells Then the organism-specific pathway mapsare detected as those that can synthesize com-pounds required for the conserved pathwaymaps from nutrients The minimal pathwaymaps detected for bacteria agree well with theexperimental essential genes Most of the catabo-lization pathways were selected as organism-specific pathways rather than conserved onessuggesting that they are adapted to each envi-ronment The minimal pathway maps of eukary-otes contain more pathway maps for DNA re-pair than those of bacteria In addition there aremore links in the pathways of eukaryotes Thusit is likely that eukaryotes need to be more sta-ble genetically

10 Development of new indices to evaluateprotein-protein interfaces Assemblingspace volume assembling space dis-tance and global shape descriptor

M Maeda5 and K Kinoshita 5National Insti-tute of Agrobiological Sciences

Protein-protein interaction is an initial step torealize complex biological functions thereforeunderstanding of the protein-protein interfaceswill give us a clue to predict the protein com-plex structures For the purpose efficient de-scriptors of the interface and database analysesare important In this study we developed threenew descriptors of protein-protein interfacesthat is assembling space volume assemblingspace distance and global shape descriptor byusing Delaunay tessellation technique The firsttwo indexes enable us to evaluate how well theprotein interfaces are build up and the third de-scriptor quantifies the complexity of the protein-protein interfaces Systematic comparison withsome existing descriptors our indexes could elu-cidate the different aspects of the protein inter-faces

11. ATTED-II: a coexpression database for Arabidopsis

T. Obayashi, S. Hayashi⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita: ⁶Tokyo Institute of Technology

ATTED-II (http://atted.jp) is a database of gene coexpression in Arabidopsis that can be used to design a wide variety of experiments, including the prioritization of genes for functional identification or for studies of regulatory relationships. Here, we report updates of ATTED-II that focus especially on functionalities for constructing gene networks with regard to the following points: (i) introducing a new measure of gene coexpression to retrieve functionally related genes more accurately, (ii) implementing clickable maps for all gene networks for step-by-step navigation, (iii) applying the Google Maps API to create a single map for a large network, (iv) including information about protein-protein interactions, (v) identifying conserved patterns of coexpression, and (vi) showing and connecting KEGG pathway information to identify functional modules. With these enhanced functions for gene network representation, ATTED-II can help researchers to clarify the functional and regulatory networks of genes in Arabidopsis.
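Point (i) can be illustrated with a rank-based coexpression score. The sketch below computes Pearson correlations over a tiny made-up expression table and then a mutual rank, i.e. the geometric mean of the two directional correlation ranks, which is one measure used in the coexpression literature. Treating it as the measure meant in this report is an assumption, and the gene names, values, and function names are hypothetical.

```python
import math

# Hypothetical expression profiles (one value per sample); not real data.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.1, 2.9, 4.2, 4.8],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
    "geneD": [2.0, 2.5, 2.0, 3.5, 3.0],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_of(query, target, expr):
    """Rank (1 = most correlated) of target among all other genes, seen from query."""
    others = [g for g in expr if g != query]
    ordered = sorted(others, key=lambda g: pearson(expr[query], expr[g]), reverse=True)
    return ordered.index(target) + 1

def mutual_rank(a, b, expr):
    """Geometric mean of the rank of b seen from a and the rank of a seen from b."""
    return math.sqrt(rank_of(a, b, expr) * rank_of(b, a, expr))

for partner in ("geneB", "geneC", "geneD"):
    print(partner, mutual_rank("geneA", partner, expression))
```

A lower mutual rank means the two genes are near the top of each other's correlation lists, which makes the score symmetric and less sensitive to the absolute correlation values than a fixed threshold would be.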

12. PiSite: a database of protein interaction sites using multiple binding states in the PDB

M. Higurashi, T. Ishida and K. Kinoshita

The vast accumulation of protein structural data has now facilitated the observation of many different complexes in the PDB for the same protein. Therefore, a single protein complex is not sufficient to identify its interaction sites, especially for proteins with multiple binding states or different partners, such as hub proteins. Thus, we developed a database that provides protein-protein interaction sites at the residue level with consideration of multiple complexes at the same time, by mapping the binding sites of all complexes containing the same protein in the PDB. We also implemented easy-to-use web interfaces with an interactive viewer that works with typical web browsers, so that the different binding modes can be checked visually.
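The idea of mapping binding sites from several complexes onto one protein can be sketched as follows. This is a minimal illustration, not the PiSite implementation: the record layout, identifiers, and residue numbers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical interface annotations: (complex ID, protein, partner, interface residues).
observations = [
    ("cplx1", "P1", "P2", {10, 11, 45}),
    ("cplx2", "P1", "P3", {45, 46, 90}),
    ("cplx3", "P1", "P2", {11, 12}),
]

def map_binding_sites(obs):
    """For each protein, collect residue number -> set of partners seen at that residue."""
    sites = defaultdict(lambda: defaultdict(set))
    for _complex_id, protein, partner, residues in obs:
        for res in residues:
            sites[protein][res].add(partner)
    return sites

sites = map_binding_sites(observations)

# Residues of P1 that contact more than one distinct partner across all complexes.
multi_partner = sorted(r for r, partners in sites["P1"].items() if len(partners) > 1)
print("all interface residues:", sorted(sites["P1"]))
print("residues shared by several partners:", multi_partner)
```

Pooling the residues over every complex gives the full set of interaction sites, while the per-residue partner sets separate the binding modes that a single complex structure would miss.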

13. Discrimination between biological interfaces and crystal-packing contacts

Y. Tsuchiya, H. Nakamura⁷ and K. Kinoshita: ⁷Osaka University

The quaternary structures of proteins are the bases of their physiological functions, and thus it is indispensable to know the biologically relevant complexes of proteins to understand their functions at the molecular level. The structures of proteins are usually determined by X-ray crystallography, and the resulting crystal structures can contain non-biological interactions due to the nature of crystals. Therefore, discrimination between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures is required. We developed a method for discriminating between biological and non-biological interfaces, which evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential, and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. Our discrimination method achieved a good success rate, comparable to that of the contact area-dependent discrimination. Subsequent detailed review of the discrimination results raised the success rate to 91.4%.

14. Effect of surface-to-volume ratio of proteins on hydrophilic residues

M. Shirota, T. Ishida and K. Kinoshita

The size of a protein has been shown to affect both the amino acid composition and the residue burial in the protein. To demonstrate that these effects result from the reduction of surface regions relative to the volume in larger proteins, we examined the effect of the surface-to-volume ratio (SVR), which is the ratio between the accessible surface area and the volume of a protein, on amino acid composition. The reduction of several hydrophilic residues was more strongly correlated with SVR than with protein size (i.e., the number of amino acids), which indicates that SVR directly affected the amino acid composition. Furthermore, the buried fraction of these hydrophilic residues increased as their occurrence decreased. The increase in burial was found to accelerate relative to the decrease in occurrence as SVR decreased below 0.3 Å⁻¹ (approximately when protein size exceeded 132 residues), except for lysine, which was the most difficult residue to bury.
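Why the surface shrinks relative to the volume in larger proteins can be seen with an idealized geometric sketch. The code below treats a protein as a perfect sphere, which is an assumption made only for illustration (real accessible surface areas and volumes are computed from atomic coordinates); the function name is hypothetical.

```python
import math

def sphere_svr(radius):
    """Surface-to-volume ratio (in 1/Å) of a sphere with the given radius (in Å)."""
    area = 4.0 * math.pi * radius ** 2
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return area / volume  # algebraically equal to 3 / radius

for r in (10.0, 15.0, 20.0):
    print(f"radius {r:4.1f} Å -> SVR {sphere_svr(r):.3f} Å^-1")
```

Because area grows with the square of the radius and volume with its cube, the ratio falls as 3/r, so a larger globule necessarily exposes a smaller fraction of its residues.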

15. Prediction of disordered regions in proteins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules; thus, these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important to obtain high-resolution structural information and to understand the functional aspects of these proteins. Thus, we developed a new prediction method for disordered regions in proteins based on the meta approach, and implemented a web server for this prediction method. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of seven independent predictors. In our evaluation, the meta approach achieved higher prediction accuracy than previously developed methods.
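The meta approach can be pictured as a second-level classifier whose input, for each residue, is the vector of scores from the seven underlying predictors. The sketch below uses a linear decision function of the form w·x + b, which is what a trained linear support vector machine evaluates; the weights, bias, and scores are hypothetical stand-ins, and no training is shown, so this is not the authors' model.

```python
# Hypothetical weights and bias standing in for a trained linear SVM.
WEIGHTS = [0.9, 0.7, 0.8, 0.5, 0.6, 0.4, 0.7]
BIAS = -2.3

def decision_value(scores):
    """Linear decision function w.x + b over the seven per-residue predictor scores."""
    if len(scores) != len(WEIGHTS):
        raise ValueError("expected one score per predictor")
    return sum(w * s for w, s in zip(WEIGHTS, scores)) + BIAS

def predict_disorder(per_residue_scores):
    """Label each residue 'D' (disordered) or 'O' (ordered) by the sign of the decision value."""
    return "".join("D" if decision_value(s) > 0 else "O" for s in per_residue_scores)

# Hypothetical outputs of seven predictors for three residues.
protein = [
    [0.9, 0.8, 0.9, 0.7, 0.8, 0.6, 0.9],  # predictors agree: disordered
    [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.2],  # predictors agree: ordered
    [0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5],  # predictors disagree
]
print(predict_disorder(protein))
```

The third residue shows the point of combining predictors: no single score decides the label, and the learned weights determine how much each predictor's opinion counts.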

16. A cavity with an appropriate size is the basis of the PPIase activity

Teikichi Ikura⁸, Kengo Kinoshita, Nobutoshi Ito⁸: ⁸Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIases) are important enzymes in biological systems, but their catalytic mechanisms are not well understood. To elucidate the amino acids essential for the enzymatic activity, we carried out a similarity search of the atomic configurations of the PPIase active site against known protein structures, and found that alpha-amylase and prolyl endopeptidase have spatial arrangements of atoms similar to those of PPIase active sites. Furthermore, we proved experimentally that these proteins actually have PPIase activity, which had not been considered at all. In addition, we created a similar hole in barnase, which is an enzyme that catalyzes ribonuclease activity and does not have PPIase activity, and found that the mutated barnase exhibits PPIase activity. These results indicate that PPIase activity can be realized by a hole of appropriate size on the surface of a protein.

17. COXPRESdb: co-expressed gene database for mouse and human

T. Obayashi, S. Hayashi⁶, M. Shibaoka⁶, M. Saeki⁶, H. Ohta⁶, K. Kinoshita

A database of coexpressed gene sets can provide valuable information for a wide variety of experimental designs, such as targeting of genes for functional identification, gene regulation, and/or protein-protein interactions. Coexpressed gene databases derived from publicly available GeneChip data are widely used in Arabidopsis research, but platforms that examine coexpression for higher mammals are rather limited. Therefore, we have constructed a new database, COXPRESdb (coexpressed gene database) (http://coxpresdb.hgc.jp), for coexpressed gene lists and networks in human and mouse. Coexpression data could be calculated for 19,777 and 21,036 genes in human and mouse, respectively, by using the GeneChip data in NCBI GEO. COXPRESdb enables analysis of four types of coexpression networks: (i) highly coexpressed genes for every gene, (ii) genes with the same GO annotation, (iii) genes expressed in the same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view to compare the human and mouse coexpression patterns to estimate the conservation between the two species.
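One simple way to think about comparing coexpression between the two species is to map one species' coexpressed gene list onto the other through orthologs and measure the overlap. The sketch below uses a Jaccard index for this; the choice of index, the gene identifiers, and the ortholog table are all assumptions for illustration and do not describe how COXPRESdb scores conservation.

```python
# Hypothetical top coexpressed partners of one query gene in each species.
human_partners = ["GENE1", "GENE2", "GENE3", "GENE4"]
mouse_partners = ["Gene1", "Gene2", "Gene5", "Gene4"]

# Hypothetical mouse -> human ortholog table.
mouse_to_human = {"Gene1": "GENE1", "Gene2": "GENE2", "Gene5": "GENE5", "Gene4": "GENE4"}

def conservation(human_list, mouse_list, orthologs):
    """Jaccard overlap of two coexpressed-gene lists after mapping mouse genes to human."""
    human_set = set(human_list)
    mapped = {orthologs[g] for g in mouse_list if g in orthologs}
    union = human_set | mapped
    return len(human_set & mapped) / len(union) if union else 0.0

score = conservation(human_partners, mouse_partners, mouse_to_human)
print(f"conservation score: {score:.2f}")
```

A score near 1 would indicate that the query gene keeps the same coexpression partners in both species, while a score near 0 would suggest that its coexpression neighbourhood has diverged.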

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida and Kengo Kinoshita

Protein-membrane interactions are fundamental for both protein functions and membrane properties. By means of these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol, and compared the trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the cholesterol availability. Further investigations with aquaporin are also being performed.

Publications

Chiba, H., Yamashita, R., Kinoshita, K., and Nakai, K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl. Acids Res. 36: D793-D799, 2008.

Hatada, I., Morita, S., Kimura, M., Horii, T., Yamashita, R., and Nakai, K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J. Human Genet. 53 (2): 185-191, 2008.

Hatanaka, Y., Nagasaki, M., Yamaguchi, R., Obayashi, T., Numata, K., Imoto, S., Shimamura, T., Kinoshita, K., Nakai, K., and Miyano, S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi, M., Ishida, T., and Kinoshita, K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res. 37: D360-364, 2009.

Ikura, T., Kinoshita, K., and Ito, N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng. Des. Sel. 21: 83-89, 2008.

Ishida, T. and Kinoshita, K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda, M. and Kinoshita, K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J. Mol. Graph. Mod. 27: 706-711, 2009.

Miura, K., Toh, H., Hirakawa, H., Sugii, M., Murata, M., Nakai, K., Tashiro, K., Kuhara, S., Azuma, Y., and Shirai, M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res. 15 (2): 83-91, 2008.

Murakami, K., Imanishi, T., Gojobori, T., and Nakai, K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida, K., Frith, M., and Nakai, K. Pseudocounts for transcription factor binding sites. Nucl. Acids Res. 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi, T., Hayashi, S., Shibaoka, M., Saeki, M., Ohta, H., and Kinoshita, K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res. 36: D77-82, 2008.

Obayashi, T., Hayashi, S., Saeki, M., Ohta, H., and Kinoshita, K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res. 37: D987-991, 2009.

Okamura, K. and Nakai, K. Retrotransposition as a source of new promoters. Mol. Biol. Evol. 25 (6): 1231-1238, 2008.

Sierro, N., Makita, Y., de Hoon, M., and Nakai, K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl. Acids Res. 36: D93-D96, 2008.

Sierro, N., Li, S., Suzuki, Y., Yamashita, R., and Nakai, K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota, M., Ishida, T., and Kinoshita, K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci. 17: 1596-1602, 2008.

Tsuchihara, K., Suzuki, Y., Wakaguri, H., Irie, T., Tanimoto, K., Hashimoto, S., Matsushima, K., Mizushima-Sugano, J., Yamashita, R., Nakai, K., Bentley, D., Esumi, H., and Sugano, S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl. Acids Res., in press.

Tsuchiya, Y., Nakamura, H., and Kinoshita, K. Discrimination between biological interfaces and crystal-packing contacts. Compt. Biol. Chem. 1: 99-113, 2008.

Vandenbon, A., Miyamoto, Y., Takimoto, N., Kusakabe, T., and Nakai, K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res. 15 (1): 3-11, 2008.

Vandenbon, A. and Nakai, K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur, J. and Ng, S.-K. (Imperial College Press, London), vol. 21, pp. 188-199, 2008.

Wakaguri, H., Yamashita, R., Suzuki, Y., Sugano, S., and Nakai, K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl. Acids Res. 36: D97-D101, 2008.

Yamashita, R., Suzuki, Y., Takeuchi, N., Wakaguri, H., Ueda, T., Sugano, S., and Nakai, K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) gene and analysis of their characteristics. Nucl. Acids Res. 36 (11): 3707-3715, 2008.

Kinoshita, K., Kono, H., and Yura, K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki, J. (Wiley and Sons, USA), in printing, 2009.

伊倉貞吉, 木下賢吾, 伊藤暢聡. ペプチジルプロリルイソメラーゼの構造機能相関. 蛋白質核酸酵素 54: 167―172, 2009.

木下賢吾. 立体構造からのタンパク質機能予測 現状と展望. 遺伝子医学MOOK 14号, in press.

中井謙太, ポール・ホートン. 第3章 3. アミノ酸配列に基づくタンパク質の細胞内局在予測. 実験医学増刊 vol. 26: 1106―1112, 2008.

中井謙太. タンパク質のシステム生物学. 猪飼, 伏見, 卜部, 上野川, 中村, 浜窪 編, タンパク質の事典, 朝倉書店, 575―578, 2008.


Human Genome Center

Department of Public Policy

Associate Professor Kaori Muto, Ph.D. (Health Sciences)
Project Assistant Professor Hyongoo Hong, Ph.D.
Project Assistant Professor Ayako Kamisato, LL.M.

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and “minority-centered” scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the “BioBank Japan” project from ethical, legal, and social standpoints, and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of the outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study on research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After the fabrication in Hwang Woo-suk's stem cell cloning research and the unethical collection of human eggs, the bioethics law was revised, and the government seeks stricter regulation of life science and healthcare. We have found some correlations in political options on stem cell research and genetic testing in terms of regulations in East Asia.

2. Establishment of the Office of Research Ethics (ORE)

Under the Dean's courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting our survey on past ethical reviews and a comparative study on research ethics review systems in the US, the UK, and South Korea, we identified current problems that tend to stall a smooth research review process, so as to secure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for “bench consulting” regarding consent, research protocols, and pre-review on research ethics of all research involving human subjects. We will start communication with other relevant divisions on research ethics review founded by research institutes, and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal, and social support for the “BioBank Japan” project

To support the “BioBank Japan” project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted our questionnaire survey of participants in the BioBank Japan project. Our data show that the younger participants thought that their personal analyzed data should be disclosed. The consent process was well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research, and may be unethical rather than ethical. If we long for “personalized medicine”, we should think further about the construction of a “personalized consent process”, and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and highly evaluate MCs' attitudes towards them. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their willingness to contribute.

To show our appreciation for participants' and MCs' contributions to the project, we issued “BioBank newsletters” three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are accessible only to those with good eyesight, we are making efforts toward personalized information security to accommodate participants' disabilities.

4. SciArt Café

According to the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities that aim at sharing public needs through interactive communication between researchers and the public are promoted. As one such outreach activity, we held our original science café series, called “SciArt Café”, twice in 2008. Our original intent for “SciArt Café” is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called “Rhythm generated by network”, was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called “Doing science, doing art”, was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama, I., Nagai, A., Muto, K., Tamakoshi, A., Kokado, M., Mimura, K., Tanzawa, T., and Yamagata, Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2. 洪賢秀. 韓国社会における子どもの「性保護」と性犯罪防止対策. 比較法研究 70号, 2009, 印刷中.

3. 神里彩子, 成澤光 編著. 生殖補助医療 生命倫理と法―基本資料集3. 信山社, 21―123, 262―308, 2008.

4. 張瓊方. 諸外国における生殖補助医療の規制状況と実施状況(台湾). 生殖補助医療 生命倫理と法―基本資料集3, 神里彩子, 成澤光 編, 信山社, 323―334, 2008.

5. 大上泰弘, 神里彩子, 城山英明. イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆. 社会技術研究論文集 5号: 132―142, 2008.

6. 大上泰弘, 成廣孝, 神里彩子, 城山英明, 打越綾子. 日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析. ヒトと動物の関係学会誌 20号: 66―73, 2008.

7. 大上泰弘, 神里彩子, 城山英明. イギリスにおける動物の実験規制を支えている思考様式. 科学技術社会論研究 5号: 84―92, 2008.

8. 渡部麻衣子, 上田昌文. 人の必要を充足する科学技術 福祉工学における開発現場の分析. 科学技術社会研究, 138―151, 2008.

9. 武藤香織. 「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―. 日米の医療―制度と倫理, 杉田米行 編, 大阪大学出版会, 203―224, 2008.

10. 武藤香織. DNA親子鑑定は「ふしだらな」女性にとっての救済策か. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 238―264, 2008.

11. 洪賢秀. 研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 196―214, 2008.

12. 張瓊方. 生殖技術と台湾社会. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 215―222, 2008.

13. 三村恭子, 小門穂, 武藤香織, 張瓊方, 洪賢秀, 柘植あづみ. 女性にやさしい機械のつくられ方―内診台を例にして. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編, 作品社, 223―240, 2008.

14. 神里彩子. 生殖補助医療をめぐる議論―その回顧と展望―. 家永登 編『生殖技術と家族』, 早稲田大学出版部, 42―71, 2008.

15. 渡部麻衣子, 上田昌文 編訳. エンハンスメント論争 身体精神の増強と先端科学技術. 社会評論社, 2008.


electrostatic potential and shape on the proteinsurfaces and chooses the most probable biologi-cal interfaces among all possible contacts in thecrystal Our discrimination method achieved agood success rate comparable to that of the con-tact area-dependent discrimination Subsequentdetailed review of the discrimination resultsraised the success rate to 914

14 Effect of surface-to-volume ratio of pro-teins on hydrophilic residues

M Shirota T Ishida and K Kinoshita

The size of a protein has been shown to affectboth the amino acid composition and the resi-due burial in the protein To demonstrate thatthese effects are the results from the reductionof surface regions relative to the volume inlarger proteins we examined the effect ofsurface-to-volume ratio (SVR) which is the ratiobetween the accessible surface area and volumeof a protein to amino acid composition The re-duction of several hydrophilic residues wasmore strongly correlated with SVR than withprotein size (ie the number of amino acids)which indicats that SVR directly affected theamino acid composition Furthermore these hy-drophilic residues also increased in buried frac-tion at the same time of the reduction The in-crease in burial was found to be acceleratedcompared with the decrease in occurrence asSVR decreased below SVR=03Å-1 (approxi-mately protein size exceeded 132 residues) ex-cept for lysine which was the most difficult forbeing buried

15 Prediction of disordered regions in pro-teins based on the meta approach

Takashi Ishida and Kengo Kinoshita

Intrinsically disordered regions in proteinshave no unique stable structures without theirpartner molecules thus these regions sometimesprevent high-quality structure determinationFurthermore proteins with disordered regionsare often involved in important biological proc-esses and the disordered regions are consideredto play important roles in molecular interac-tions Therefore identifying disordered regionsis important to obtain high-resolution structuralinformation and to understand the functionalaspects of these proteins Thus we developed anew prediction method for disordered regionsin proteins based on the meta approach and im-plemented a web-server for this predictionmethod The method predicts the disorder ten-dency of each residue using support vector ma-

chines from the prediction results of the sevenindependent predictors As a result of ourevaluation the meta approach achieved higherprediction accuracy than previously developedmethods

16 A cavity with an appropriate size is thebasis of the PPIase activity

Teikichi Ikura8 Kengo Kinoshita NobutoshiIto8 8Tokyo Medical and Dental University

Peptidyl-prolyl isomerases (PPIase) are impor-tant enzymes in biological systems but the cata-lytic mechanisms are not well understood Toelucidate the essential amino acids for the enzy-matic activities we have carried out the similar-ity search of atomic configurations of the activesite of PPIase against the known protein struc-tures and found alpha amylase and prolyl en-dopeptidase have the similar spatial arrange-ment of atoms with PPIase active sites Further-more we proved experimentally that these pro-teins actually have the PPIase activities whichhave not been considered at all In addition wecreated the similar hole in the barnase which isa enzyme to catalyze the ribonuclease activityand does not have the PPIase activities andfound that the mutated barnase exhibit the PPI-ase activity These results indicate that the PPI-ase activity can be realized by a hole with ap-propriate size on the surface of protein

17 COXPRESdb co-expressed gene data-base for mouse and human

T Obayashi S Hayashi6 M Shibaoka6 MSaeki6 H Ohta6 K Kinoshita

A database of coexpressed gene sets can pro-vide valuable information for a wide variety ofexperimental designs such as targeting of genesfor functional identification gene regulationandor protein-protein interactions Coexpre-ssed gene databases derived from publicly avail-able GeneChip data are widely used in Arabi-dopsis research but platforms that examine co-expression for higher mammals are rather lim-ited Therefore we have constructed a new da-tabase COXPRESdb (coexpressed gene data-base) (httpcoxpresdbhgcjp) for coexpressedgene lists and networks in human and mouseCoexpression data could be calculated for 19 777and 21 036 genes in human and mouse respec-tively by using the GeneChip data in NCBIGEO COXPRESdb enables analysis of the fourtypes of coexpression networks (i) highly coex-pressed genes for every gene (ii) genes with thesame GO annotation (iii) genes expressed in the

149

same tissue and (iv) user-defined gene setsWhen the networks became too big for the staticpicture on the web in GO networks or in tissuenetworks we used Google Maps API to visual-ize them interactively COXPRESdb also pro-vides a view to compare the human and mousecoexpression patterns to estimate the conserva-tion between the two species

18 Influence of proteins and cholesterol onbiological membranes analyzed by mo-lecular dynamics

Naoya Fujita Takashi Ishida and Kengo Ki-noshita

Protein-membrane interactions are fundamen-tal for both protein functions and membraneproperties By means of these interactions suit-

able configurations of membrane molecules cangenerate heterogeneity such as lipid rafts andtransportsome regions in the membrane To re-veal the bidirectional influences between pro-teins and surrounding lipids we performed mo-lecular dynamics simulations of biological mem-branes with and without proteins and choles-terol and compared those trajectories As a re-sult alamethicin a small transmembrane pep-tide was shown to reduce the whole membraneundulation in addition to decreasing localmembrane thickness according to the size ofalamethicinrsquos hydrophobic region On the con-trary water accessibility of alamethicin and itshydrogen bonds with lipids were different de-pending on the cholesterol availability Furtherinvestigations with aquaporin are also beingperformed

Publications

Chiba H Yamashita R Kinoshita K andNakai K Weak correlation between sequenceconservation in promoter regions and inprotein-coding regions of human-mouseorthologous gene pairs BMC Genomics 9 1522008

Genome Information Integration Project and H-invitational 2 Consortium The H-InvitationalDatabase (H-InvDB) a comprehensive annota-tion resource for human genes and tran-scripts Nucl Acids Res 36 D793-D799 2008

Hatada I Morita S Kimura M Horii TYamashita R and Nakai K Genome-widedemethylation during neural differentiation ofP19 embryonal carcinoma cells J HumanGenet 53 (2) 185-191 2008

Hatanaka Y Nagasaki M Yamaguchi RObayashi T Numata K Imoto S Shima-mura T Kinoshita K Nakai K and Miy-ano S A novel strategy to search concertedtranscription factor activities using gene ex-pression profile and genomic data Genome In-formatics 20 212-221 2008

Higurashi M Ishida T and Kinoshita KPiSite a database of protein interaction sitesusing multiple binding states in the PDB Nu-cleic Acids Res 37 D360-364 2009

Ikura T Kinoshita K and Ito N A cavity withan appropriate size is the basis of the PPIaseactivity Protein Eng Des Sel 21 83-89 2008

Ishida T and Kinoshita K Prediction of disor-dered protein regions based on meta-approach Bioinformatics 24 1344-1348 2008

Maeda M and Kinoshita K Development ofnew indices to evaluate protein-protein inter-faces Assembling space volume assembling

space distance and global shape descriptor JMol Graph Mod 27 706-711 2009

Miura K, Toh H, Hirakawa H, Sugii M, Murata M, Nakai K, Tashiro K, Kuhara S, Azuma Y, and Shirai M. Genome-wide analysis of Chlamydophila pneumoniae gene expression at the late stage of infection. DNA Res 15 (2): 83-91, 2008.

Murakami K, Imanishi T, Gojobori T, and Nakai K. Two different classes of co-occurring motif pairs found by a novel visualization method in human promoter regions. BMC Genomics 9 (1): 112, 2008.

Nishida K, Frith M, and Nakai K. Pseudocounts for transcription factor binding sites. Nucl Acids Res 37: 939-944, 2009 (published online on December 23, 2008).

Obayashi T, Hayashi S, Shibaoka M, Saeki M, Ohta H, and Kinoshita K. COXPRESdb: a database of coexpressed gene networks in mammals. Nucleic Acids Res 36: D77-82, 2008.

Obayashi T, Hayashi S, Saeki M, Ohta H, and Kinoshita K. ATTED-II provides coexpressed gene networks for Arabidopsis. Nucleic Acids Res 37: D987-991, 2009.

Okamura K and Nakai K. Retrotransposition as a source of new promoters. Mol Biol Evol 25 (6): 1231-1238, 2008.

Sierro N, Makita Y, de Hoon M, and Nakai K. DBTBS: a database of transcriptional regulation in Bacillus subtilis containing upstream intergenic conservation information. Nucl Acids Res 36: D93-D96, 2008.

Sierro N, Li S, Suzuki Y, Yamashita R, and Nakai K. Spatial and temporal preferences for trans-splicing in Ciona intestinalis revealed by EST-based gene expression analysis. Gene 430: 44-49, 2009 (available online on October 21, 2008).

Shirota M, Ishida T, and Kinoshita K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci 17: 1596-1602, 2008.

Tsuchihara K, Suzuki Y, Wakaguri H, Irie T, Tanimoto K, Hashimoto S, Matsushima K, Mizushima-Sugano J, Yamashita R, Nakai K, Bentley D, Esumi H, and Sugano S. Massive transcriptional start site analysis of human genes in hypoxia cells. Nucl Acids Res, in press.

Tsuchiya Y, Nakamura H, and Kinoshita K. Discrimination between biological interfaces and crystal-packing contacts. Compt Biol Chem 1: 99-113, 2008.

Vandenbon A, Miyamoto Y, Takimoto N, Kusakabe T, and Nakai K. Markov chain-based promoter structure modeling for tissue-specific expression pattern prediction. DNA Res 15 (1): 3-11, 2008.

Vandenbon A and Nakai K. Using simple rules on presence and positioning of motifs for promoter structure modeling and tissue-specific expression prediction. Genome Informatics, edited by Arthur J and Ng S-K (Imperial College Press, London), vol 21, pp 188-199, 2008.

Wakaguri H, Yamashita R, Suzuki Y, Sugano S, and Nakai K. DBTSS: DataBase of Transcription Start Sites, progress report 2008. Nucl Acids Res 36: D97-D101, 2008.

Yamashita R, Suzuki Y, Takeuchi N, Wakaguri H, Ueda T, Sugano S, and Nakai K. Comprehensive detection of human terminal oligo-pyrimidine (TOP) genes and analysis of their characteristics. Nucl Acids Res 36 (11): 3707-3715, 2008.

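Kinoshita K, Kono H, and Yura K. Prediction of molecular interactions from 3D-structures: from small ligands to large protein complexes. Edited by Bujnicki J (Wiley and Sons, USA), in printing, 2009.

Ikura T, Kinoshita K, and Ito N. Structure-function relationship of peptidyl-prolyl isomerases (in Japanese). Tanpakushitsu Kakusan Koso 54: 167-172, 2009.

Kinoshita K. Protein function prediction from three-dimensional structures: current status and prospects (in Japanese). Idenshi Igaku MOOK No. 14, in press.

Nakai K and Horton P. Chapter 3-3: Prediction of protein subcellular localization based on amino acid sequences (in Japanese). Jikken Igaku, extra issue, vol 26: 1106-1112, 2008.

Nakai K. Systems biology of proteins (in Japanese). In: Encyclopedia of Proteins, edited by Ikai, Fushimi, Urabe, Kaminogawa, Nakamura, and Hamakubo (Asakura Shoten), pp 575-578, 2008.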


Human Genome Center

Department of Public Policy
公共政策研究分野

Associate Professor Kaori Muto, Ph.D. (准教授 保健学博士 武藤香織)
Project Assistant Professor Hyongoo Hong, Ph.D. (特任助教 学術博士 洪賢秀)
Project Assistant Professor Ayako Kamisato, LL.M. (特任助教 法学修士 神里彩子)

The Department of Public Policy works on three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and "minority-centered" scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the "BioBank Japan" project from ethical, legal, and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine broader social and cultural agendas on the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrasting regulatory approaches of South Korea and Japan. After the fabrication of Hwang Woo-suk's stem cell cloning results and the unethical collection of human eggs, the bioethics law in South Korea was revised, and the government now seeks stricter regulation of life science and healthcare. We have found some correlations between political options on stem cell research and on genetic testing in terms of regulations across East Asia.

2. Establishment of Office of Research Ethics (ORE)

Following the Dean's courageous decision, IMSUT established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK, and South Korea, we identified current problems that tend to stall the research review process, so as to assure the quality of ethical discussions. Since February 3, 2009, Ayako Kamisato has assumed the main responsibility for "bench consulting" regarding consent, research protocols, and pre-review of the research ethics of all research involving human subjects. We will start communicating with other relevant divisions on research ethics review founded by research institutes and prepare for a new study on research ethics review and ethical governance for the future.

3. Ethical, legal, and social support for the "BioBank Japan" project

To support the "BioBank Japan" project, led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan project. Our data show that younger participants thought that their personal analyzed data should be disclosed. The consent process had been well worked out in advance and fully complies with the government's ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants' understanding of the overview of the research and may be unethical rather than ethical. If we long for "personalized medicine," we should think further about constructing a "personalized consent process," and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective in maintaining incentives for participation and preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate the MCs' attitudes toward them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project does not have any plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any rewards. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants each time, and participants recall their will to contribute.

To show appreciation for participants' and MCs' contributions to the project, we issued "BioBank newsletters" three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are available only to sighted people with good eyesight, we are making efforts toward personalized information security to accommodate participants' disabilities.

4. SciArt Café

In accordance with the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim to share public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called "SciArt Café," twice in 2008. Our intent with "SciArt Café" is to promote communication between scientists and those who do not have regular contact with science but love art. The 1st session, called "Rhythm generated by network," was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called "Doing science, doing art," was held on October 8th at the Medical Science Museum at IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing the 3rd session for early summer 2009.

Publications

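1. Ishiyama I, Nagai A, Muto K, Tamakoshi A, Kokado M, Mimura K, Tanzawa T, Yamagata Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2. Hong H. "Sexual protection" of children and measures to prevent sexual crimes in Korean society (in Japanese). Hikakuho Kenkyu No. 70, 2009, in press.

3. Kamisato A and Narisawa A (eds.). Assisted Reproductive Technology: Bioethics and Law, Basic Materials Vol. 3 (in Japanese). Shinzansha, pp 21-123, 262-308, 2008.

4. Chang C-F. Regulation and practice of assisted reproductive technology in other countries (Taiwan) (in Japanese). In: Assisted Reproductive Technology: Bioethics and Law, Basic Materials Vol. 3, edited by Kamisato A and Narisawa A (Shinzansha), pp 323-334, 2008.

5. Oue Y, Kamisato A, Shiroyama H. Comparative analysis of animal experimentation regulations in the United Kingdom and the United States: implications for the Japanese regulatory system (in Japanese). Shakai Gijutsu Kenkyu Ronbunshu No. 5: 132-142, 2008.

6. Oue Y, Narihiro T, Kamisato A, Shiroyama H, Uchikoshi A. Attitudes of life science researchers and engineers in Japan toward animal experimentation: analysis of a questionnaire survey on life science experiments and memorial services for animals (in Japanese). Hito to Dobutsu no Kankei Gakkaishi No. 20: 66-73, 2008.

7. Oue Y, Kamisato A, Shiroyama H. Modes of thinking that underpin the regulation of animal experimentation in the United Kingdom (in Japanese). Kagaku Gijutsu Shakairon Kenkyu No. 5: 84-92, 2008.

8. Watanabe M, Ueda A. Science and technology that fulfill human needs: an analysis of development sites in welfare engineering (in Japanese). Kagaku Gijutsu Shakai Kenkyu: 138-151, 2008.

9. Muto K. Responses of Japan and the United States to "demedicalized" predictive genetic testing: from hereditary diseases to nutrigenetic testing (in Japanese). In: Healthcare in Japan and the United States: Systems and Ethics, edited by Sugita Y (Osaka University Press), pp 203-224, 2008.

10. Muto K. Is DNA parentage testing a remedy for "loose" women? (in Japanese). In: Frontiers of Gender Studies, Vol. 4, Techno/Bio-Politics: Science, Medicine, and Technology Today, edited by Tachi K (Sakuhinsha), pp 238-264, 2008.

11. Hong H. What is the problem with egg donation for research? Focusing on the Hwang Woo-suk paper fabrication scandal in South Korea (in Japanese). In: Frontiers of Gender Studies, Vol. 4, Techno/Bio-Politics: Science, Medicine, and Technology Today, edited by Tachi K (Sakuhinsha), pp 196-214, 2008.

12. Chang C-F. Reproductive technology and Taiwanese society (in Japanese). In: Frontiers of Gender Studies, Vol. 4, Techno/Bio-Politics: Science, Medicine, and Technology Today, edited by Tachi K (Sakuhinsha), pp 215-222, 2008.

13. Mimura K, Kokado M, Muto K, Chang C-F, Hong H, Tsuge A. How woman-friendly machines are made: the case of the gynecological examination table (in Japanese). In: Frontiers of Gender Studies, Vol. 4, Techno/Bio-Politics: Science, Medicine, and Technology Today, edited by Tachi K (Sakuhinsha), pp 223-240, 2008.

14. Kamisato A. Debates over assisted reproductive technology: retrospect and prospect (in Japanese). In: Reproductive Technology and the Family, edited by Ienaga N (Waseda University Press), pp 42-71, 2008.

15. Watanabe M, Ueda A (eds. and trans.). The Enhancement Debate: Augmentation of Body and Mind and Advanced Science and Technology (in Japanese). Shakai Hyoronsha, 2008.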


same tissue, and (iv) user-defined gene sets. When the networks became too big for a static picture on the web, as in GO networks or tissue networks, we used the Google Maps API to visualize them interactively. COXPRESdb also provides a view for comparing human and mouse coexpression patterns to estimate the conservation between the two species.

18. Influence of proteins and cholesterol on biological membranes analyzed by molecular dynamics

Naoya Fujita, Takashi Ishida, and Kengo Kinoshita

Protein-membrane interactions are fundamental to both protein functions and membrane properties. Through these interactions, suitable configurations of membrane molecules can generate heterogeneity, such as lipid rafts and transportsome regions, in the membrane. To reveal the bidirectional influences between proteins and surrounding lipids, we performed molecular dynamics simulations of biological membranes with and without proteins and cholesterol and compared the resulting trajectories. As a result, alamethicin, a small transmembrane peptide, was shown to reduce the undulation of the whole membrane, in addition to decreasing the local membrane thickness according to the size of alamethicin's hydrophobic region. In contrast, the water accessibility of alamethicin and its hydrogen bonds with lipids differed depending on the availability of cholesterol. Further investigations with aquaporin are also under way.
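The two membrane quantities compared above, whole-membrane undulation and local thickness near the peptide, can be illustrated with a minimal sketch. This is not the authors' analysis pipeline. The function names, the use of lipid headgroup coordinates, the midplane rule, the spread-of-heights proxy for undulation, and the toy coordinates are all assumptions made for illustration.

```python
from math import hypot
from statistics import mean, pstdev


def split_leaflets(heads):
    """Split (x, y, z) headgroup positions into upper and lower leaflets.

    Assumption: the bilayer is roughly flat and normal to the z axis, so the
    midplane can be taken as the mean z of all headgroups.
    """
    mid = mean([z for _, _, z in heads])
    upper = [h for h in heads if h[2] >= mid]
    lower = [h for h in heads if h[2] < mid]
    return upper, lower


def thickness(heads, center=None, radius=None):
    """Mean headgroup-to-headgroup distance along z.

    If center (x, y) and radius are given, only lipids within that lateral
    distance are used, which gives a local thickness (e.g. around a peptide).
    Raises statistics.StatisticsError if a leaflet has no lipid in range.
    """
    upper, lower = split_leaflets(heads)
    if center is not None and radius is not None:
        cx, cy = center
        upper = [h for h in upper if hypot(h[0] - cx, h[1] - cy) <= radius]
        lower = [h for h in lower if hypot(h[0] - cx, h[1] - cy) <= radius]
    return mean([h[2] for h in upper]) - mean([h[2] for h in lower])


def undulation(heads):
    """Crude undulation proxy: population SD of upper-leaflet heights."""
    upper, _ = split_leaflets(heads)
    return pstdev([h[2] for h in upper])


# Hypothetical single frame: lipids far from the origin sit at z = +/-20,
# lipids next to a peptide placed at the origin sit at z = +/-16.
frame = [
    (10, 10, 20), (-10, -10, 20), (1, 0, 16), (0, 1, 16),
    (10, 10, -20), (-10, -10, -20), (1, 0, -16), (0, 1, -16),
]
print("whole-membrane thickness:", thickness(frame))
print("thickness near peptide:  ", thickness(frame, center=(0, 0), radius=3))
print("undulation proxy:        ", undulation(frame))
```

Running the same functions on frames from simulations with and without the peptide or cholesterol, and averaging over frames, would give the kind of comparison the paragraph describes. A real analysis would also need to handle periodic boundaries and membrane curvature, which this sketch ignores.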

Publications

Chiba H, Yamashita R, Kinoshita K, and Nakai K. Weak correlation between sequence conservation in promoter regions and in protein-coding regions of human-mouse orthologous gene pairs. BMC Genomics 9: 152, 2008.

Genome Information Integration Project and H-Invitational 2 Consortium. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucl Acids Res 36: D793-D799, 2008.

Hatada I, Morita S, Kimura M, Horii T, Yamashita R, and Nakai K. Genome-wide demethylation during neural differentiation of P19 embryonal carcinoma cells. J Human Genet 53 (2): 185-191, 2008.

Hatanaka Y, Nagasaki M, Yamaguchi R, Obayashi T, Numata K, Imoto S, Shimamura T, Kinoshita K, Nakai K, and Miyano S. A novel strategy to search concerted transcription factor activities using gene expression profile and genomic data. Genome Informatics 20: 212-221, 2008.

Higurashi M, Ishida T, and Kinoshita K. PiSite: a database of protein interaction sites using multiple binding states in the PDB. Nucleic Acids Res 37: D360-364, 2009.

Ikura T, Kinoshita K, and Ito N. A cavity with an appropriate size is the basis of the PPIase activity. Protein Eng Des Sel 21: 83-89, 2008.

Ishida T and Kinoshita K. Prediction of disordered protein regions based on meta-approach. Bioinformatics 24: 1344-1348, 2008.

Maeda M and Kinoshita K. Development of new indices to evaluate protein-protein interfaces: Assembling space volume, assembling space distance, and global shape descriptor. J Mol Graph Mod 27: 706-711, 2009.



Publications

1 Ishiyama I Nagai A Muto K Tamakoshi AKokado M Mimura K Tanzawa T Yama-

gata Z Relationship between Public Atti-tudes toward Genomic Studies Related to

153

Medicine and Their Level of Genomic Liter-acy in Japan American Journal of MedicalGenetics 146A (13) 696-706 2008

2 洪賢秀韓国社会における子どもの「性保護」と性犯罪防止対策比較法研究70号2009印刷中

3 神里彩子成澤光編著生殖補助医療 生命倫理と法―基本資料集3信山社21―123262―3082008

4 張瓊方諸外国における生殖補助医療の規制状況と実施状況(台湾)生殖補助医療 生命倫理と法―基本資料集3神里彩子成澤光編信山社323―3342008

5 大上泰弘神里彩子城山英明イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆社会技術研究論文集5号132―1422008

6 大上泰弘成廣孝神里彩子城山英明打越綾子日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析ヒトと動物の関係学会誌20号66―732008

7 大上泰弘神里彩子城山英明イギリスにおける動物の実験規制を支えている思考様式科学技術社会論研究5号84―922008

8渡部麻衣子上田昌文人の必要を充足する科学技術福祉工学における開発現場の分析科学技術社会研究138―1512008

9武藤香織「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝

学的検査まで―日米の医療―制度と倫理杉田米行編大阪大学出版会203―2242008

10武藤香織DNA親子鑑定は「ふしだらな」女性にとっての救済策かジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社238―2642008

11洪賢秀研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社196―2142008

12張瓊方生殖技術と台湾社会ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社215―2222008

13三村恭子小門穂武藤香織張瓊方洪賢秀柘植あづみ女性にやさしい機械のつくられ方―内診台を例にしてジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま舘かおる編作品社223―2402008

14神里彩子生殖補助医療をめぐる議論―その回顧と展望―家永登編『生殖技術と家族』早稲田大学出版部42―712008

15渡部麻衣子上田昌文編訳エンハンスメント論争身体精神の増強と先端科学技術社会評論社2008

154


Human Genome Center

Department of Public Policy (公共政策研究分野)

Associate Professor Kaori Muto, Ph.D. (武藤香織)
Project Assistant Professor Hyongoo Hong, Ph.D. (洪賢秀)
Project Assistant Professor Ayako Kamisato, Master of Laws (神里彩子)

The Department of Public Policy pursues three major missions: public policy studies on translational research, its application to healthcare, and its impact on social security; practical advice and surveys for research projects to build public trust; and “minority-centered” scientific communication. We have conducted a comparative political study on stem cell research regarding homecare services for ALS in East Asia. We also supported the “BioBank Japan” project from ethical, legal, and social standpoints and completed the first questionnaire survey. We held SciArt Café twice at the Medical Science Museum as one of our outreach activities.

1. A comparative political study on stem cell research and genetic testing in East Asia

Supported by the Japan Bioindustry Association, we conducted a comparative study of research policy on stem cells to examine the broader social and cultural agendas surrounding the industrialization of stem cell research and genetic testing. We interviewed the main players in this area: the relevant authorities, bioindustry CEOs, physicians, academics, and patient support groups. We also conducted literature reviews regarding regulations. One of the key preliminary findings is the contrast in regulatory approach between South Korea and Japan. After the fabrication in Hwang Woo-suk’s stem cell cloning research and the unethical collection of human eggs, the bioethics law was revised, and the government is seeking stricter regulation of life science and healthcare. We have found some correlations among the political options on stem cell research and genetic testing, in terms of regulation, across East Asia.

2. Establishment of the Office of Research Ethics (ORE)

Following the Dean’s courageous decision, IMSUT has established the Office of Research Ethics (ORE) to support research activities. Our department has the main responsibility for managing the ORE and our research ethics review system, supported by Professor Hiroshi Kiyono of the Division of Mucosal Immunology, Professor Kensuke Miyake of the Division of Infectious Genetics, Professor Fumitaka Nagamura and Dr. Makiko Tajima of the Department of Clinical Trial Safety Management, Professor Yasushi Kodama of the Graduate School of Public Policy, and Professor Akira Akabayashi of the Graduate School of Medicine. After conducting a survey of past ethical reviews and a comparative study of research ethics review systems in the US, the UK, and South Korea, we identified the current problems that tend to stall the review process, so as to assure the quality of ethical discussions. Since February 3rd, 2009, Ayako Kamisato has assumed the main responsibility for “bench consulting” regarding consent, research protocols, and pre-review on research ethics for all research involving human subjects. We will start communicating with the relevant research ethics review divisions established at other research institutes and prepare a new study on research ethics review and ethical governance for the future.

3. Ethical, legal, and social support for the “BioBank Japan” project

To support the “BioBank Japan” project led by Professor Yusuke Nakamura of the Laboratory of Molecular Medicine of IMSUT, we conducted three types of surveys and issued newsletters for participants. By the end of 2007, the project had obtained 200,000 written consent forms through research coordinators called Medical Coordinators (MCs). The project trained nurses or pharmacists as MCs to obtain free and fully informed consent from participants. We conducted a questionnaire survey of participants in the BioBank Japan Project. Our data show that younger participants thought that their personal analyzed data should be disclosed. The consent process was well worked out in advance and fully complies with the government ethical guidelines for genetic/genomic research. However, recent publications show that a long and tedious consent process may not contribute to participants’ understanding of the overview of the research and may be unethical rather than ethical. If we long for “personalized medicine”, we should think further about constructing a “personalized consent process”, and we have to change the relationship between participants and researchers from one-time informed consent to long-lasting public trust.

Obtaining feedback from participants is also effective for maintaining incentives to participate and for preventing participants from dropping out of the research process. We conducted three kinds of surveys to evaluate and improve the consent process and to explore what the project should do for public involvement: questionnaire surveys of research participants, a web-based questionnaire survey of all MCs, and focus group interviews with chief MCs, in order to triangulate the consent process. The preliminary results show that participants are basically satisfied with the consent process and rate the MCs’ attitudes towards them highly. Most MCs also responded that they have made their own efforts to make their explanations easier to understand, specifically for the elderly. However, a certain number of participants have already forgotten what they donated their DNA and serum for, as well as the experience of watching the DVD or reading the leaflet about the project overview. We found that the participants who responded that they had forgotten the whole consent process are not the elderly population. Furthermore, MCs explain that this project has no plans to disclose personal genotyped data to each participant, but a certain number of participants responded that they now want to see their own genotyped data or tentative research feedback, while others are simply satisfied with their contribution to genomic research without any reward. Even if participants forget the fact that they gave consent for research, MCs explain, encourage, and thank participants on each occasion, and participants recall their will to contribute.

To show our appreciation for participants’ and MCs’ contributions to the project, we issued “BioBank newsletters” three times in 2007 for MCs and participants. We will explore more methods and opportunities to communicate with participants. Because the current forms of the BioBank newsletters are accessible only to sighted people with good eyesight, we are making efforts toward personalized information security to accommodate participants’ disabilities.

4. SciArt Café

In accordance with the 3rd Science and Technology Basic Plan (FY2006-FY2010), outreach activities are promoted that aim to share public needs through interactive communication between researchers and the public. As one such outreach activity, we held our original science café series, called “SciArt Café”, twice in 2008. Our original intent for “SciArt Café” is to promote communication between scientists and those who do not regularly engage with science but love art. The 1st session, called “Rhythm generated by network”, was held in Shibuya during the 3rd World Rhythm Summit, supported by Dr. Atsuko Takamatsu (Waseda Univ.), Dr. Shin-ichi Nakagawa (RIKEN), and Dr. Hideaki Takeuchi (UT). The 2nd session, called “Doing science, doing art”, was held on October 8th at the Medical Science Museum in IMSUT, supported by Dr. Hideo Iwasaki (Waseda Univ.) and Dr. Yoichiro Murakami (JST). We are preparing for the 3rd session in early summer 2009.

Publications

1. Ishiyama I, Nagai A, Muto K, Tamakoshi A, Kokado M, Mimura K, Tanzawa T, Yamagata Z. Relationship between Public Attitudes toward Genomic Studies Related to Medicine and Their Level of Genomic Literacy in Japan. American Journal of Medical Genetics 146A (13): 696-706, 2008.

2. 洪賢秀. 韓国社会における子どもの「性保護」と性犯罪防止対策. 比較法研究, 70号, 2009 (印刷中).

3. 神里彩子, 成澤光 編著. 生殖補助医療 生命倫理と法―基本資料集3. 信山社, 21-123, 262-308, 2008.

4. 張瓊方. 諸外国における生殖補助医療の規制状況と実施状況(台湾). 生殖補助医療 生命倫理と法―基本資料集3, 神里彩子, 成澤光 編. 信山社, 323-334, 2008.

5. 大上泰弘, 神里彩子, 城山英明. イギリス及びアメリカにおける動物実験規制の比較分析―日本の規制体制への示唆. 社会技術研究論文集, 5号, 132-142, 2008.

6. 大上泰弘, 成廣孝, 神里彩子, 城山英明, 打越綾子. 日本における生命科学技術者の動物実験に関する意識―生命科学実験及び動物慰霊祭に関するアンケート調査の分析. ヒトと動物の関係学会誌, 20号, 66-73, 2008.

7. 大上泰弘, 神里彩子, 城山英明. イギリスにおける動物の実験規制を支えている思考様式. 科学技術社会論研究, 5号, 84-92, 2008.

8. 渡部麻衣子, 上田昌文. 人の必要を充足する科学技術福祉工学における開発現場の分析. 科学技術社会研究, 138-151, 2008.

9. 武藤香織. 「脱医療化」する予測的な遺伝学的検査への日米の対応―遺伝病から栄養遺伝学的検査まで―. 日米の医療―制度と倫理, 杉田米行 編. 大阪大学出版会, 203-224, 2008.

10. 武藤香織. DNA親子鑑定は「ふしだらな」女性にとっての救済策か. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編. 作品社, 238-264, 2008.

11. 洪賢秀. 研究用卵子提供の何が問題なのか―韓国黄禹錫論文捏造事件を中心に―. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編. 作品社, 196-214, 2008.

12. 張瓊方. 生殖技術と台湾社会. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編. 作品社, 215-222, 2008.

13. 三村恭子, 小門穂, 武藤香織, 張瓊方, 洪賢秀, 柘植あづみ. 女性にやさしい機械のつくられ方―内診台を例にして. ジェンダー研究のフロンティア第4巻 テクノバイオポリティクス―科学医療技術のいま, 舘かおる 編. 作品社, 223-240, 2008.

14. 神里彩子. 生殖補助医療をめぐる議論―その回顧と展望―. 家永登 編『生殖技術と家族』, 早稲田大学出版部, 42-71, 2008.

15. 渡部麻衣子, 上田昌文 編訳. エンハンスメント論争身体精神の増強と先端科学技術. 社会評論社, 2008.




