    Proceedings of the 2003 Conference

    edited by

    Sebastian Thrun, Lawrence K. Saul, and Bernhard Schölkopf

    A Bradford Book
    The MIT Press

    Cambridge, Massachusetts
    London, England

  • Contents

    Preface xvii

    NIPS Committees xxi

    Reviewers xxiii

    Part I Algorithms and Architectures

    Efficient Multiscale Sampling from Products of Gaussian Mixtures

    Alexander T. Ihler, Erik B. Sudderth, William T. Freeman, Alan S. Willsky 1

    Simplicial Mixtures of Markov Chains: Distributed Modelling of Dynamic User Profiles

    Mark Girolami, Ata Kaban 9

    Hierarchical Topic Models and the Nested Chinese Restaurant Process

    David Blei, Thomas L. Griffiths, Michael I. Jordan, Joshua B. Tenenbaum 17

    Max-Margin Markov Networks

    Ben Taskar, Carlos Guestrin, Daphne Koller 25

    Invariant Pattern Recognition by Semi-Definite Programming Machines

    Thore Graepel, Ralf Herbrich 33

    Learning a Distance Metric from Relative Comparisons

    Matthew Schultz, Thorsten Joachims 41

    1-norm Support Vector Machines

    Ji Zhu, Saharon Rosset, Trevor Hastie, Rob Tibshirani 49

    Image Reconstruction by Linear Programming

    Koji Tsuda, Gunnar Rätsch 57

    Multiple-Instance Learning via Disjunctive Programming Boosting

    Stuart Andrews, Thomas Hofmann 65

    Convex Methods for Transduction

    Tijl De Bie, Nello Cristianini 73

    Kernel Dimensionality Reduction for Supervised Learning

    Kenji Fukumizu, Francis R. Bach, Michael I. Jordan 81

    Clustering with the Connectivity Kernel

    Bernd Fischer, Volker Roth, Joachim M. Buhmann 89

    Efficient and Robust Feature Extraction by Maximum Margin Criterion

    Haifeng Li, Tao Jiang, Keshu Zhang 97

    Sparse Greedy Minimax Probability Machine Classification

    Thomas Strohmann, Andrei Belitski, Greg Grudic, Dennis DeCoste 105

    Sequential Bayesian Kernel Regression

    Jaco Vermaak, Simon J. Godsill, Arnaud Doucet 113

    Fast Feature Selection from Microarray Expression Data via Multiplicative Large Margin Algorithms

    Claudio Gentile 121

    Dynamical Modeling with Kernels for Nonlinear Time Series Prediction

    Liva Ralaivola, Florence d'Alché-Buc 129

    Extreme Components Analysis

    Max Welling, Felix Agakov, Christopher K. I. Williams 137

    Linear Dependent Dimensionality Reduction

    Nathan Srebro, Tommi S. Jaakkola 145

    Locality Preserving Projections

    Xiaofei He, Partha Niyogi 153

    Optimal Manifold Representation of Data: An Information Theoretic Approach

    Denis V. Chigirev, William Bialek 161

    Ranking on Data Manifolds

    Dengyong Zhou, Jason Weston, Arthur Gretton, Olivier Bousquet, Bernhard Schölkopf 169

    Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering

    Yoshua Bengio, Jean-François Paiement, Pascal Vincent, Olivier Delalleau, Nicolas Le Roux, Marie Ouimet 177

    Pairwise Clustering and Graphical Models

    Noam Shental, Assaf Zomet, Tomer Hertz, Yair Weiss 185

    Tree-structured Approximations by Expectation Propagation

    Thomas Minka, Yuan Qi 193

    The IM Algorithm: A Variational Approach to Information Maximization

    David Barber, Felix Agakov 201

    Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian-Vector Multiply

    Eiji Mizutani, James W. Demmel 209

    Large Scale Online Learning

    Léon Bottou, Yann Le Cun 217

    Online Classification on a Budget

    Koby Crammer, Jaz Kandola, Yoram Singer 225

    Online Learning via Global Feedback for Phrase Recognition

    Xavier Carreras, Lluis Marquez 233

    Sparse Representation and Its Applications in Blind Source Separation

    Yuanqing Li, Andrzej Cichocki, Shun-ichi Amari, Sergei Shishkin, Jianting Cao, Fanji Gu 241

    Perspectives on Sparse Bayesian Learning

    David Wipf, Jason Palmer, Bhaskar Rao 249

    Semi-Supervised Learning with Trees

    Charles Kemp, Thomas L. Griffiths, Sean Stromsten, Joshua B. Tenenbaum 257

    Efficient Exact k-NN and Nonparametric Classification in High Dimensions

    Ting Liu, Andrew W. Moore, Alexander Gray 265

    Nonstationary Covariance Functions for Gaussian Process Regression

    Christopher J. Paciorek, Mark J. Schervish 273

    Learning the k in k-means

    Greg Hamerly, Charles Elkan 281

    Finding the M Most Probable Configurations in Arbitrary Graphical Models

    Chen Yanover, Yair Weiss 289

    Non-linear CCA and PCA by Alignment of Local Models

    Jakob J. Verbeek, Sam T. Roweis, Nikos Vlassis 297

    Learning Spectral Clustering

    Francis R. Bach, Michael I. Jordan 305

    AUC Optimization vs. Error Rate Minimization

    Corinna Cortes, Mehryar Mohri 313

    Learning with Local and Global Consistency

    Dengyong Zhou, Olivier Bousquet, Thomas Navin Lal, Jason Weston, Bernhard Schölkopf 321

    Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data

    Neil D. Lawrence 329

    Warped Gaussian Processes

    Edward Snelson, Carl Edward Rasmussen, Zoubin Ghahramani 337

    Can We Learn to Beat the Best Stock

    Allan Borodin, Ran El-Yaniv, Vincent Gogan 345

    Approximate Expectation Maximization

    Tom Heskes, Onno Zoeter, Wim Wiegerinck 353

    Linear Response for Approximate Inference

    Max Welling, Yee Whye Teh 361

    Semidefinite Relaxations for Approximate Inference on Graphs with Cycles

    Martin Wainwright, Michael I. Jordan 369

    Approximability of Probability Distributions

    Alina Beygelzimer, Irina Rish 377

    Denoising and Untangling Graphs Using Degree Priors

    Quaid D. Morris, Brendan J. Frey 385

    On the Concentration of Expectation and Approximate Inference in Layered Networks

    XuanLong Nguyen, Michael I. Jordan 393

    Inferring State Sequences for Non-linear Systems with Embedded Hidden Markov Models

    Radford M. Neal, Matthew J. Beal, Sam T. Roweis 401

    Fast Algorithms for Large-State-Space HMMs with Applications to Web Usage Analysis

    Pedro F. Felzenszwalb, Daniel P. Huttenlocher, Jon M. Kleinberg 409

    Wormholes Improve Contrastive Divergence

    Geoffrey Hinton, Max Welling, Andriy Mnih 417

    Sample Propagation

    Mark A. Paskin 425

    Generalised Propagation for Fast Fourier Transforms with Partial or Missing Data

    Amos J Storkey 433

    Laplace Propagation

    Alexander Smola, Vishy Vishwanathan, Eleazar Eskin 441

    Learning to Find Pre-Images

    Goekhan H. Bakir, Jason Weston, Bernhard Schölkopf 449

    Semi-Definite Programming by Perceptron Learning

    Thore Graepel, Ralf Herbrich, Andriy Kharechko, John Shawe-Taylor 457

    Computing Gaussian Mixture Models with EM Using Equivalence Constraints

    Noam Shental, Aharon Bar-Hillel, Tomer Hertz, Daphna Weinshall 465

    Feature Selection in Clustering Problems

    Volker Roth, Tilman Lange 473

    An Iterative Improvement Procedure for Hierarchical Clustering

    David Kauchak, Sanjoy Dasgupta 481

    Identifying Structure across Pre-partitioned Data

    Zvika Marx, Ido Dagan, Eli Shamir 489

    Log-Linear Models for Label Ranking

    Ofer Dekel, Christopher Manning, Yoram Singer 497

    Minimax Embeddings

    Matthew Brand 505

    No Unbiased Estimator of the Variance of K-Fold Cross-Validation

    Yoshua Bengio, Yves Grandvalet 513

    Bias-Corrected Bootstrap and Model Uncertainty

    Harald Steck, Tommi S. Jaakkola 521

    Probability Estimates for Multi-Class Classification by Pairwise Coupling

    Ting-Fan Wu, Chih-Jen Lin, Ruby C. Weng 529

    Necessary Intransitive Likelihood-Ratio Classifiers

    Gang Ji, Jeff Bilmes 537

    Classification with Hybrid Generative/Discriminative Models

    Rajat Raina, Yirong Shen, Andrew Y. Ng, Andrew McCallum 545

    A Model for Learning the Semantics of Pictures

    Victor Lavrenko, R. Manmatha, Jiwoon Jeon 553

    Algorithms for Interdependent Security Games

    Michael Kearns, Luis Ortiz 561

    Part II Applications

    Fast Embedding of Sparse Similarity Graphs

    John C. Platt 571

    GPPS: A Gaussian Process Positioning System for Cellular Networks

    Anton Schwaighofer, Marian Grigoras, Volker Tresp, Clemens Hoffmann 579

    An Autonomous Robotic System for Mapping Abandoned Mines

    David Ferguson, Aaron Morris, Dirk Hähnel, Christopher Baker, Zachary Omohundro, Carlos Reverte, Scott Thayer, Charles Whittaker, William Whittaker, Wolfram Burgard, Sebastian Thrun 587

    Semi-supervised Protein Classification Using Cluster Kernels

    Jason Weston, Christina Leslie, Dengyong Zhou, André Elisseeff, William S. Noble 595

    Statistical Debugging of Sampled Programs

    Alice X. Zheng, Michael I. Jordan, Ben Liblit, Alex Aiken 603

    Markov Models for Automated ECG Interval Analysis

    Nicholas P. Hughes, Lionel Tarassenko, Stephen J. Roberts 611

    Parameterized Novelty Detectors for Environmental Sensor Monitoring

    Cynthia Archer, Todd K. Leen, Antonio Baptista 619

    Modeling User Rating Profiles For Collaborative Filtering

    Benjamin Marlin 627

    Application of SVMs for Colour Classification and Collision Detection with AIBO Robots

    Michael J. Quinlan, Stephan K. Chalup, Richard H. Middleton 635

    Kernels for Structured Natural Language Data

    Jun Suzuki, Yutaka Sasaki, Eisaku Maeda 643

    A Fast Multi-Resolution Method for Detection of Significant Spatial Disease Clusters

    Daniel B. Neill, Andrew W. Moore 651

    Link Prediction in Relational Data

    Ben Taskar, Ming-Fai Wong, Pieter Abbeel, Daphne Koller 659

    Unsupervised Color Decomposition Of Histologically Stained Tissue Samples

    Andrew Rabinovich, Sameer Agarwal, Casey Laris, Jeffrey H. Price, Serge J. Belongie 667

    ICA-based Clustering of Genes from Microarray Expression Data

    Su-In Lee, Serafim Batzoglou 675

    Gene Expression Clustering with Functional Mixture M
