Lecture Notes in Artificial Intelligence 4701
Edited by J. G. Carbonell and J. Siekmann
Subseries of Lecture Notes in Computer Science
Joost N. Kok, Jacek Koronacki, Ramon Lopez de Mantaras, Stan Matwin, Dunja Mladenic, Andrzej Skowron (Eds.)
Machine Learning: ECML 2007
18th European Conference on Machine Learning
Warsaw, Poland, September 17-21, 2007
Proceedings
Series Editors
Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA
Jörg Siekmann, University of Saarland, Saarbrücken, Germany
Volume Editors
Joost N. Kok, Leiden University, The Netherlands
Jacek Koronacki, Polish Academy of Sciences, Warsaw, Poland
Ramon Lopez de Mantaras, Spanish National Research Council (CSIC), Bellaterra, Spain
Stan Matwin, University of Ottawa, Canada
Dunja Mladenic, Jožef Stefan Institute, Ljubljana, Slovenia
Andrzej Skowron, Warsaw University, Poland
Library of Congress Control Number: 2007934766
CR Subject Classification (1998): I.2, F.2.2, F.4.1, H.2.8
LNCS Sublibrary: SL 7 – Artificial Intelligence
ISSN 0302-9743
ISBN-10 3-540-74957-8 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-74957-8 Springer Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2007
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 12124169 06/3180 5 4 3 2 1 0
Preface
The two premier annual European conferences in the areas of machine learning and data mining have been collocated ever since the first joint conference in Freiburg, 2001. The European Conference on Machine Learning (ECML) traces its origins to 1986, when the first European Working Session on Learning was held in Orsay, France. The European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD) was first held in 1997 in Trondheim, Norway. Over the years, the ECML/PKDD series has evolved into one of the largest and most selective international conferences in machine learning and data mining. In 2007, the seventh collocated ECML/PKDD took place during September 17–21 on the central campus of Warsaw University and in the nearby Staszic Palace of the Polish Academy of Sciences.
For the third time, the conference used a hierarchical reviewing process. We nominated 30 Area Chairs, each of them responsible for one sub-field or several closely related research topics. Suitable areas were selected on the basis of the submission statistics for ECML/PKDD 2006 and for last year’s International Conference on Machine Learning (ICML 2006) to ensure a proper load balance among the Area Chairs. A joint Program Committee (PC) was nominated for the two conferences, consisting of some 300 renowned researchers, mostly proposed by the Area Chairs. This joint PC, the largest of the series to date, allowed us to exploit synergies and deal competently with topic overlaps between ECML and PKDD.
ECML/PKDD 2007 received 592 abstract submissions. As in previous years, to assist the reviewers and the Area Chairs in their final recommendation, authors had the opportunity to communicate their feedback after the reviewing phase. For a small number of conditionally accepted papers, authors were asked to carry out minor revisions subject to the final acceptance by the Area Chair responsible for their submission. With very few exceptions, every full submission was reviewed by three PC members. Based on these reviews, on feedback from the authors, and on discussions among the reviewers, the Area Chairs provided a recommendation for each paper. The four Program Chairs made the final program decisions following a 2-day meeting in Warsaw in June 2007. Continuing the tradition of previous events in the series, we accepted full papers with an oral presentation and short papers with a poster presentation. We selected 41 full papers and 37 short papers for ECML, and 28 full papers and 35 short papers for PKDD. The acceptance rate for full papers is 11.6% and the overall acceptance rate is 23.8%, in accordance with the high-quality standards of the conference series. Besides the paper and poster sessions, ECML/PKDD 2007 also featured 12 workshops, seven tutorials, the ECML/PKDD Discovery Challenge, and the Industrial Track.
An excellent slate of Invited Speakers is another strong point of the conference program. We are grateful to Ricardo Baeza-Yates (Yahoo! Research Barcelona), Peter Flach (University of Bristol), Tom Mitchell (Carnegie Mellon University), and Barry Smyth (University College Dublin) for their participation in ECML/PKDD 2007. The abstracts of their presentations are included in this volume.
We distinguished four outstanding contributions; the awards were generously sponsored by the Machine Learning Journal and the KD-Ubiq network.
ECML Best Paper: Angelika Kimmig, Luc De Raedt and Hannu Toivonen: “Probabilistic Explanation-Based Learning”
PKDD Best Paper: Toon Calders and Szymon Jaroszewicz: “Efficient AUC-Optimization for Classification”
ECML Best Student Paper: Daria Sorokina, Rich Caruana, and Mirek Riedewald: “Additive Groves of Regression Trees”
PKDD Best Student Paper: Dikan Xing, Wenyuan Dai, Gui-Rong Xue, and Yong Yu: “Bridged Refinement for Transfer Learning”
This year we introduced the Industrial Track chaired by Florence d’Alche-Buc (Universite d’Evry-Val d’Essonne) and Marko Grobelnik (Jozef Stefan Institute, Slovenia) consisting of selected talks with a strong industrial component presenting research from the area covered by the ECML/PKDD conference.
For the first time in the history of ECML/PKDD, the conference proceedings were available on-line to conference participants during the conference. We are grateful to Springer for accommodating this new access channel for the proceedings. Inspired by some related conferences (ICML, KDD, ISWC) we introduced videorecording, as we would like to save at least the invited talks and presentations of award papers for the community and make them accessible at http://videolectures.net/.
This year’s Discovery Challenge was devoted to three problems: user behavior prediction from Web traffic logs, HTTP traffic classification, and Sumerian literature understanding. The Challenge was co-organized by Piotr Ejdys (Gemius SA), Hung Son Nguyen (Warsaw University), Pascal Poncelet (EMA-LGI2P) and Jerzy Tyszkiewicz (Warsaw University); 122 teams participated. For the first task, the three finalists were:
Malik Tahir Hassan, Khurum Nazir Junejo and Asim Karim from Lahore University, Pakistan
Krzysztof Dembczynski and Wojciech Kotłowski from Poznan University of Technology, Poland and Marcin Sydow from Polish-Japanese Institute of Information Technology, Poland
Tung-Ying Lee from National Tsing Hua University, Taiwan
Results for the other Discovery Challenge tasks were not available at the time the proceedings were finalized, but were announced at the conference.
We are all indebted to the Area Chairs, Program Committee members and external reviewers for their commitment and hard work that resulted in a rich but selective scientific program for ECML/PKDD. We are particularly grateful to those reviewers who helped with additional reviews at very short notice to assist us in a small number of difficult decisions. We further thank our Workshop and Tutorial Chairs Marzena Kryszkiewicz (Warsaw University of Technology) and Jan Rauch (University of Economics, Prague) for selecting and coordinating the 12 workshops and seven tutorial events that accompanied the conference; the workshop organizers, tutorial presenters, and the organizers of the Discovery Challenge and the Industrial Track; Richard van de Stadt and CyberChairPRO for competent and flexible support; and Warsaw University and the Polish Academy of Sciences (Institute of Computer Science) for their local and organizational support. Special thanks are due to the Local Chair, Marcin Szczuka, Warsaw University (assisted by Michał Ciesiołka from the Polish Academy of Sciences) for the many hours spent making sure that all the details came together to ensure the success of the conference. Finally, we are grateful to the Steering Committee and the ECML/PKDD community that entrusted us with the organization of ECML/PKDD 2007.
Most of all, however, we would like to thank all the authors who trusted us with their submissions, thereby contributing to one of the main yearly events in the life of our vibrant research community.
September 2007
Joost Kok (PKDD Program co-Chair)
Jacek Koronacki (General Chair)
Ramon Lopez de Mantaras (General Chair)
Stan Matwin (ECML Program co-Chair)
Dunja Mladenic (ECML Program co-Chair)
Andrzej Skowron (PKDD Program co-Chair)
Organization
General Chairs
Ramon Lopez de Mantaras (Spanish Council for Scientific Research)
Jacek Koronacki (Polish Academy of Sciences)
Program Chairs
Joost N. Kok (Leiden University)
Stan Matwin (University of Ottawa and Polish Academy of Sciences)
Dunja Mladenic (Jozef Stefan Institute)
Andrzej Skowron (Warsaw University)
Local Chairs
Michał Ciesiołka (Polish Academy of Sciences)
Marcin Szczuka (Warsaw University)
Tutorial Chair
Jan Rauch (University of Economics, Prague)
Workshop Chair
Marzena Kryszkiewicz (Warsaw University of Technology)
Discovery Challenge Chair
Hung Son Nguyen (Warsaw University)
Industrial Track Chairs
Florence d’Alche-Buc (Universite d’Evry-Val d’Essonne)
Marko Grobelnik (Jozef Stefan Institute)
Steering Committee
Jean-Francois Boulicaut
Pavel Brazdil
Rui Camacho
Floriana Esposito
Johannes Furnkranz
Joao Gama
Fosca Gianotti
Alípio Jorge
Dino Pedreschi
Tobias Scheffer
Myra Spiliopoulou
Luís Torgo
Area Chairs
Michael R. Berthold
Hendrik Blockeel
Olivier Chapelle
James Cussens
Kurt Driessens
Peter Flach
Eibe Frank
Johannes Furnkranz
Thomas Gartner
Joao Gama
Rayid Ghani
Jerzy Grzymala-Busse
Eamonn Keogh
Kristian Kersting
Mieczysław A. Kłopotek
Stefan Kramer
Pedro Larranaga
Claire Nedellec
Andreas Nurnberger
George Paliouras
Bernhard Pfahringer
Enric Plaza
Luc De Raedt
Tobias Scheffer
Giovanni Semeraro
Władysław Skarbek
Myra Spiliopoulou
Hannu Toivonen
Luís Torgo
Paul Utgoff
Program Committee
Charu C. Aggarwal
Jesus Aguilar-Ruiz
David W. Aha
Nahla Ben Amor
Sarabjot Singh Anand
Annalisa Appice
Josep-Lluis Arcos
Walid G. Aref
Eva Armengol
Anthony J. Bagnall
Antonio Bahamonde
Sugato Basu
Bettina Berendt
Francesco Bergadano
Ralph Bergmann
Steffen Bickel
Concha Bielza
Mikhail Bilenko
Francesco Bonchi
Gianluca Bontempi
Christian Borgelt
Karsten M. Borgwardt
Daniel Borrajo
Antal van den Bosch
Henrik Bostrom
Marco Botta
Jean-Francois Boulicaut
Janez Brank
Thorsten Brants
Ulf Brefeld
Carla E. Brodley
Paul Buitelaar
Toon Calders
Luis M. de Campos
Nicola Cancedda
Claudio Carpineto
Jesus Cerquides
Kaushik Chakrabarti
Chien-Chung Chan
Amanda Clare
Ira Cohen
Fabrizio Costa
Susan Craw
Bruno Cremilleux
Tom Croonenborghs
Juan Carlos Cubero
Padraig Cunningham
Andrzej Czyzewski
Walter Daelemans
Ian Davidson
Marco Degemmis
Olivier Delalleau
Jitender S. Deogun
Marcin Detyniecki
Belen Diaz-Agudo
Chris H.Q. Ding
Carlotta Domeniconi
Marek J. Druzdzel
Saso Dzeroski
Tina Eliassi-Rad
Tapio Elomaa
Abolfazl Fazel Famili
Wei Fan
Ad Feelders
Alan Fern
George Forman
Linda C. van der Gaag
Patrick Gallinari
Jose A. Gamez
Alex Gammerman
Minos N. Garofalakis
Gemma C. Garriga
Eric Gaussier
Pierre Geurts
Fosca Gianotti
Attilio Giordana
Robert L. Givan
Bart Goethals
Elisabet Golobardes
Pedro A. Gonzalez-Calero
Marko Grobelnik
Dimitrios Gunopulos
Maria Halkidi
Mark Hall
Matthias Hein
Jose Hernandez-Orallo
Colin de la Higuera
Melanie Hilario
Shoji Hirano
Tu-Bao Ho
Jaakko Hollmen
Geoffrey Holmes
Frank Hoppner
Tamas Horvath
Andreas Hotho
Jiayuan Huang
Eyke Hullermeier
Masahiro Inuiguchi
Inaki Inza
Manfred Jaeger
Szymon Jaroszewicz
Rosie Jones
Edwin D. de Jong
Alípio Mario Jorge
Tamer Kahveci
Alexandros Kalousis
Hillol Kargupta
Andreas Karwath
George Karypis
Samuel Kaski
Dimitar Kazakov
Ross D. King
Frank Klawonn
Ralf Klinkenberg
George Kollios
Igor Kononenko
Bozena Kostek
Walter A. Kosters
Miroslav Kubat
Halina Kwasnicka
James T. Kwok
Nicolas Lachiche
Michail G. Lagoudakis
Niels Landwehr
Pedro Larranaga
Pavel Laskov
Mark Last
Dominique Laurent
Nada Lavrac
Quoc V. Le
Guy Lebanon
Ulf Leser
Jure Leskovec
Jessica Lin
Francesca A. Lisi
Pasquale Lops
Jose A. Lozano
Peter Lucas
Richard Maclin
Donato Malerba
Nikos Mamoulis
Suresh Manandhar
Stephane Marchand-Maillet
Elena Marchiori
Lluis Marquez
Yuji Matsumoto
Michael May
Mike Mayo
Thorsten Meinl
Prem Melville
Rosa Meo
Taneli Mielikainen
Bamshad Mobasher
Serafín Moral
Katharina Morik
Hiroshi Motoda
Toshinori Munakata
Ion Muslea
Olfa Nasraoui
Jennifer Neville
Siegfried Nijssen
Joakim Nivre
Ann Nowe
Arlindo L. Oliveira
Santi Ontanon
Miles Osborne
Martijn van Otterlo
David Page
Spiros Papadimitriou
Srinivasan Parthasarathy
Andrea Passerini
Jose M. Pena
Lourdes Pena Castillo
Jose M. Pena Sanchez
James F. Peters
Johann Petrak
Lech Polkowski
Han La Poutre
Philippe Preux
Katharina Probst
Tapani Raiko
Ashwin Ram
Sheela Ramanna
Jan Ramon
Zbigniew W. Ras
Chotirat Ann Ratanamahatana
Francesco Ricci
John Riedl
Christophe Rigotti
Celine Robardet
Victor Robles
Marko Robnik-Sikonja
Juho Rousu
Celine Rouveirol
Ulrich Ruckert (TU Munchen)
Ulrich Ruckert (Univ. Paderborn)
Stefan Ruping
Henryk Rybinski
Lorenza Saitta
Hiroshi Sakai
Roberto Santana
Martin Scholz
Matthias Schubert
Michele Sebag
Sandip Sen
Jouni K. Seppanen
Galit Shmueli
Arno Siebes
Alejandro Sierra
Vikas Sindhwani
Arul Siromoney
Dominik Slezak
Carlos Soares
Maarten van Someren
Alvaro Soto
Alessandro Sperduti
Jaideep Srivastava
Jerzy Stefanowski
David J. Stracuzzi
Jan Struyf
Gerd Stumme
Zbigniew Suraj
Einoshin Suzuki
Roman Swiniarski
Marcin Sydow
Piotr Synak
Marcin Szczuka
Luis Talavera
Matthew E. Taylor
Yannis Theodoridis
Kai Ming Ting
Ljupco Todorovski
Volker Tresp
Shusaku Tsumoto
Karl Tuyls
Michalis Vazirgiannis
Katja Verbeeck
Jean-Philippe Vert
Michail Vlachos
Haixun Wang
Jason Tsong-Li Wang
Takashi Washio
Gary M. Weiss
Sholom M. Weiss
Shimon Whiteson
Marco Wiering
Slawomir T. Wierzchon
Graham J. Williams
Stefan Wrobel
Ying Yang
JingTao Yao
Yiyu Yao
Francois Yvon
Bianca Zadrozny
Mohammed J. Zaki
Gerson Zaverucha
Filip Zelezny
ChengXiang Zhai
Yi Zhang
Zhi-Hua Zhou
Jerry Zhu
Wojciech Ziarko
Albrecht Zimmermann
Additional Reviewers
Rezwan Ahmed
Fabio Aiolli
Dima Alberg
Vassilis Athitsos
Maurizio Atzori
Anne Auger
Paulo Azevedo
Pierpaolo Basile
Margherita Berardi
Andre Bergholz
Michele Berlingerio
Kanishka Bhaduri
Konstantin Biatov
Jerzy Błaszczynski
Gianluca Bontempi
Yann-ael Le Borgne
Zoran Bosnic
Remco Bouckaert
Agnes Braud
Bjoern Bringmann
Emma Byrne
Olivier Caelen
Rossella Cancelliere
Giovanna Castellano
Michelangelo Ceci
Hyuk Cho
Kamalika Das
Souptik Datta
Uwe Dick
Laura Dietz
Marcos Domingues
Haimonti Dutta
Marc Dymetman
Stefan Eickeler
Timm Euler
Tanja Falkowski
Fernando Fernandez
Francisco J. Ferrer-Troyano
Cesar Ferri
Daan Fierens
Blaz Fortuna
Alexandre Francisco
Mingyan Gao
Fabian Guiza
Anna Lisa Gentile
Amol N. Ghoting
Arnaud Giacometti
Valentin Gjorgjioski
Robby Goetschalckx
Derek Greene
Perry Groot
Philip Groth
Daniele Gunetti
Bernd Gutmann
Sattar Hashemi
Yann-Michael De Hauwere
Vera Hollink
Yi Huang
Leo Iaquinta
Alexander Ilin
Tasadduq Imam
Tao-Yuan Jen
Felix Jungermann
Andrzej Kaczmarek
Benjamin Haibe Kains
Juha Karkkainen
Rohit Kate
Chris Kauffman
Arto Klami
Jiri Klema
Dragi Kocev
Christine Koerner
Kevin Kontos
Petra Kralj
Anita Krishnakumar
Matjaz Kukar
Brian Kulis
Arnd Christian Konig
Christine Korner
Fei Tony Liu
Antonio LaTorre
Anne Laurent
Baoli Li
Zi Lin
Bin Liu
Yan Liu
Corrado Loglisci
Rachel Lomasky
Carina Lopes
Chuan Lu
Pierre Mahe
Markus Maier
Giuseppe Manco
Irina Matveeva
Nicola Di Mauro
Dimitrios Mavroeidis
Stijn Meganck
Ingo Mierswa
Mirjam Minor
Abhilash Alexander Miranda
Joao Moreira
Sourav Mukherjee
Canh Hao Nguyen
Duc Dung Nguyen
Tuan Trung Nguyen
Janne Nikkila
Xia Ning
Blaz Novak
Irene Ntoutsi
Riccardo Ortale
Stanisław Osinski
Kivanc Ozonat
Aline Paes
Pance Panov
Thomas Brochmann Pedersen
Maarten Peeters
Ruggero Pensa
Xuan-Hieu Phan
Benjarath Phoophakdee
Aloisio Carlos de Pina
Christian Plagemann
Jose M. Puerta
Aritz Perez
Chedy Raissi
M. Jose Ramirez-Quintana
Umaa Rebbapragada
Stefan Reckow
Chiara Renso
Matthias Renz
Francois Rioult
Domingo Rodriguez-Baena
Sten Sagaert
Luka Sajn
Esin Saka
Saeed Salem
Antonio Salmeron
Eerika Savia
Anton Schaefer
Leander Schietgat
Gaetano Scioscia
Howard Scordio
Sven Van Segbroeck
Ivica Slavkov
Larisa Soldatova
Arnaud Soulet
Eduardo Spynosa
Volkmar Sterzing
Christof Stoermann
Jiang Su
Piotr Szczuko
Alexander Tartakovski
Olivier Teytaud
Marisa Thoma
Eufemia Tinelli
Ivan Titov
Roberto Trasarti
George Tsatsaronis
Katharina Tschumitschew
Duygu Ucar
Antonio Varlaro
Shankar Vembu
Celine Vens
Marcos Vieira
Peter Vrancx
Nikil Wale
Chao Wang
Dongrong Wen
Arkadiusz Wojna
Yuk Wah Wong
Adam Woznica
Michael Wurst
Wei Xu
Xintian Yang
Monika Zakova
Luke Zettlemoyer
Xueyuan Zhou
Albrecht Zimmermann
Sponsors
We wish to express our gratitude to the sponsors of ECML/PKDD 2007 for their essential contribution to the conference. We wish to thank Warsaw University, Faculty of Mathematics, Informatics and Mechanics, and Institute of Computer Science, Polish Academy of Sciences for providing financial and organizational means for the conference; the European Office of Aerospace Research and Development, Air Force Office of Scientific Research, United States Air Force Research Laboratory, for their generous financial support;¹ KDUbiq European Coordination Action for supporting Poster Reception, Student Travel Awards, and the Best Paper Awards; Pascal European Network of Excellence for sponsoring the Invited Speaker Program, the Industrial Track and the videorecording of the invited talks and presentations of the four Award Papers; Jozef Stefan Institute, Slovenia, SEKT European Integrated project and Unilever R&D for their financial support; the Machine Learning Journal for supporting the Student Best Paper Awards; and Gemius S.A. for sponsoring and supporting the Discovery Challenge. We also wish to express our gratitude to the following companies and institutions that provided us with data and expertise which were essential components of the Discovery Challenge: Bee Ware, l’Ecole des Mines d’Ales, LIRMM - The Montpellier Laboratory of Computer Science, Robotics, and Microelectronics, and Warsaw University, Faculty of Mathematics, Informatics and Mechanics. We also acknowledge the support of LOT Polish Airlines.
¹ AFOSR/EOARD support is not intended to express or imply endorsement by the U.S. Federal Government.
Table of Contents
Invited Talks
Learning, Information Extraction and the Web
  Tom M. Mitchell
Putting Things in Order: On the Fundamental Role of Ranking in Classification and Probability Estimation
  Peter A. Flach
Mining Queries
  Ricardo Baeza-Yates
Adventures in Personalized Information Access
  Barry Smyth
Long Papers
Statistical Debugging Using Latent Topic Models
  David Andrzejewski, Anne Mulhern, Ben Liblit, and Xiaojin Zhu
Learning Balls of Strings with Correction Queries
  Leonor Becerra Bonache, Colin de la Higuera, Jean-Christophe Janodet, and Frederic Tantini
Neighborhood-Based Local Sensitivity
  Paul N. Bennett
Approximating Gaussian Processes with H2-Matrices
  Steffen Borm and Jochen Garcke
Learning Metrics Between Tree Structured Data: Application to Image Recognition
  Laurent Boyer, Amaury Habrard, and Marc Sebban
Shrinkage Estimator for Bayesian Network Parameters
  John Burge and Terran Lane
Level Learning Set: A Novel Classifier Based on Active Contour Models
  Xiongcai Cai and Arcot Sowmya
Learning Partially Observable Markov Models from First Passage Times
  Jerome Callut and Pierre Dupont
Context Sensitive Paraphrasing with a Global Unsupervised Classifier
  Michael Connor and Dan Roth
Dual Strategy Active Learning
  Pinar Donmez, Jaime G. Carbonell, and Paul N. Bennett
Decision Tree Instability and Active Learning
  Kenneth Dwyer and Robert Holte
Constraint Selection by Committee: An Ensemble Approach to Identifying Informative Constraints for Semi-supervised Clustering
  Derek Greene and Padraig Cunningham
The Cost of Learning Directed Cuts
  Thomas Gartner and Gemma C. Garriga
Spectral Clustering and Embedding with Hidden Markov Models
  Tony Jebara, Yingbo Song, and Kapil Thadani
Probabilistic Explanation Based Learning
  Angelika Kimmig, Luc De Raedt, and Hannu Toivonen
Graph-Based Domain Mapping for Transfer Learning in General Games
  Gregory Kuhlmann and Peter Stone
Learning to Classify Documents with Only a Small Positive Training Set
  Xiao-Li Li, Bing Liu, and See-Kiong Ng
Structure Learning of Probabilistic Relational Models from Incomplete Relational Data
  Xiao-Lin Li and Zhi-Hua Zhou
Stability Based Sparse LSI/PCA: Incorporating Feature Selection in LSI and PCA
  Dimitrios Mavroeidis and Michalis Vazirgiannis
Bayesian Substructure Learning - Approximate Learning of Very Large Network Structures
  Andreas Nagele, Mathaus Dejori, and Martin Stetter
Efficient Continuous-Time Reinforcement Learning with Adaptive State Graphs
  Gerhard Neumann, Michael Pfeiffer, and Wolfgang Maass
Source Separation with Gaussian Process Models
  Sunho Park and Seungjin Choi
Discriminative Sequence Labeling by Z-Score Optimization
  Elisa Ricci, Tijl de Bie, and Nello Cristianini
Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches
  Mark Schmidt, Glenn Fung, and Romer Rosales
Bayesian Inference for Sparse Generalized Linear Models
  Matthias Seeger, Sebastian Gerwinn, and Matthias Bethge
Classifier Loss Under Metric Uncertainty
  David B. Skalak, Alexandru Niculescu-Mizil, and Rich Caruana
Additive Groves of Regression Trees
  Daria Sorokina, Rich Caruana, and Mirek Riedewald
Efficient Computation of Recursive Principal Component Analysis for Structured Input
  Alessandro Sperduti
Hinge Rank Loss and the Area Under the ROC Curve
  Harald Steck
Clustering Trees with Instance Level Constraints
  Jan Struyf and Saso Dzeroski
On Pairwise Naive Bayes Classifiers
  Jan-Nikolas Sulzmann, Johannes Furnkranz, and Eyke Hullermeier
Separating Precision and Mean in Dirichlet-Enhanced High-Order Markov Models
  Rikiya Takahashi
Safe Q-Learning on Complete History Spaces
  Stephan Timmer and Martin Riedmiller
Random k-Labelsets: An Ensemble Method for Multilabel Classification
  Grigorios Tsoumakas and Ioannis Vlahavas
Seeing the Forest Through the Trees: Learning a Comprehensible Model from an Ensemble
  Anneleen Van Assche and Hendrik Blockeel
Avoiding Boosting Overfitting by Removing Confusing Samples
  Alexander Vezhnevets and Olga Barinova
Planning and Learning in Environments with Delayed Feedback
  Thomas J. Walsh, Ali Nouri, Lihong Li, and Michael L. Littman
Analyzing Co-training Style Algorithms
  Wei Wang and Zhi-Hua Zhou
Policy Gradient Critics
  Daan Wierstra and Jurgen Schmidhuber
An Improved Model Selection Heuristic for AUC
  Shaomin Wu, Peter Flach, and Cesar Ferri
Finding the Right Family: Parent and Child Selection for Averaged One-Dependence Estimators
  Fei Zheng and Geoffrey I. Webb
Short Papers
Stepwise Induction of Multi-target Model Trees
  Annalisa Appice and Saso Dzeroski
Comparing Rule Measures for Predictive Association Rules
  Paulo J. Azevedo and Alípio M. Jorge
User Oriented Hierarchical Information Organization and Retrieval
  Korinna Bade, Marcel Hermkes, and Andreas Nurnberger
Learning a Classifier with Very Few Examples: Analogy Based and Knowledge Based Generation of New Examples for Character Recognition
  S. Bayoudh, H. Mouchere, L. Miclet, and E. Anquetil
Weighted Kernel Regression for Predicting Changing Dependencies
  Steven Busuttil and Yuri Kalnishkan
Counter-Example Generation-Based One-Class Classification
  Andras Banhalmi, Andras Kocsor, and Robert Busa-Fekete
Test-Cost Sensitive Classification Based on Conditioned Loss Functions
  Mumin Cebe and Cigdem Gunduz-Demir
Probabilistic Models for Action-Based Chinese Dependency Parsing
  Xiangyu Duan, Jun Zhao, and Bo Xu
Learning Directed Probabilistic Logical Models: Ordering-Search Versus Structure-Search
  Daan Fierens, Jan Ramon, Maurice Bruynooghe, and Hendrik Blockeel
A Simple Lexicographic Ranker and Probability Estimator
  Peter Flach and Edson Takashi Matsubara
On Minimizing the Position Error in Label Ranking
  Eyke Hullermeier and Johannes Furnkranz
On Phase Transitions in Learning Sparse Networks
  Goele Hollanders, Geert Jan Bex, Marc Gyssens, Ronald L. Westra, and Karl Tuyls
Semi-supervised Collaborative Text Classification
  Rong Jin, Ming Wu, and Rahul Sukthankar
Learning from Relevant Tasks Only
  Samuel Kaski and Jaakko Peltonen
An Unsupervised Learning Algorithm for Rank Aggregation
  Alexandre Klementiev, Dan Roth, and Kevin Small
Ensembles of Multi-Objective Decision Trees
  Dragi Kocev, Celine Vens, Jan Struyf, and Saso Dzeroski
Kernel-Based Grouping of Histogram Data
  Tilman Lange and Joachim M. Buhmann
Active Class Selection
  R. Lomasky, C.E. Brodley, M. Aernecke, D. Walt, and M. Friedl
Sequence Labeling with Reinforcement Learning and Ranking Algorithms
  Francis Maes, Ludovic Denoyer, and Patrick Gallinari
Efficient Pairwise Classification
  Sang-Hyeun Park and Johannes Furnkranz
Scale-Space Based Weak Regressors for Boosting
  Jin-Hyeong Park and Chandan K. Reddy
K-Means with Large and Noisy Constraint Sets
  Dan Pelleg and Dorit Baras
Towards ‘Interactive’ Active Learning in Multi-view Feature Sets for Information Extraction
  Katharina Probst and Rayid Ghani
Principal Component Analysis for Large Scale Problems with Lots of Missing Values
  Tapani Raiko, Alexander Ilin, and Juha Karhunen
Transfer Learning in Reinforcement Learning Problems Through Partial Policy Recycling
  Jan Ramon, Kurt Driessens, and Tom Croonenborghs
Class Noise Mitigation Through Instance Weighting
  Umaa Rebbapragada and Carla E. Brodley
Optimizing Feature Sets for Structured Data
  Ulrich Ruckert and Stefan Kramer
Roulette Sampling for Cost-Sensitive Learning
  Victor S. Sheng and Charles X. Ling
Modeling Highway Traffic Volumes
  Tomas Singliar and Milos Hauskrecht
Undercomplete Blind Subspace Deconvolution Via Linear Prediction
  Zoltan Szabo, Barnabas Poczos, and Andras Lorincz
Learning an Outlier-Robust Kalman Filter
  Jo-Anne Ting, Evangelos Theodorou, and Stefan Schaal
Imitation Learning Using Graphical Models
  Deepak Verma and Rajesh P.N. Rao
Nondeterministic Discretization of Weights Improves Accuracy of Neural Networks
  Marcin Wojnarski
Semi-definite Manifold Alignment
  Liang Xiong, Fei Wang, and Changshui Zhang
General Solution for Supervised Graph Embedding
  Qubo You, Nanning Zheng, Shaoyi Du, and Yang Wu
Multi-objective Genetic Programming for Multiple Instance Learning
  Amelia Zafra and Sebastian Ventura
Exploiting Term, Predicate, and Feature Taxonomies in Propositionalization and Propositional Rule Learning
  Monika Zakova and Filip Zelezny
Author Index