Spring 2018 Graduate Seminar Series

Biclustering Sparse Data
Mr. Hieu Pham
PhD Student, Department of Industrial and Manufacturing Systems Engineering
Iowa State University

Wednesday, April 11, 2018, 4:10 pm, 1140 Howe Hall

Abstract
Biclustering is a statistical learning methodology that simultaneously partitions the rows and columns of a table of data values into homogeneous subsets. Biclustering is known to be an NP-hard problem, and therefore various heuristic approaches have been proposed in the literature. These strategies break down when dealing with any degree of sparsity in a two-way table of data values. To remedy this, we propose a new prototype-based biclustering method, based on the work of Li (2014). Numerical results show that the prototype-based approach performs well on moderate-sized test cases with a large missing-value percentage (95%+). A large agricultural case study (where rows represent plant varieties, columns represent planting locations, data are yield values, and genetics by environment (GxE) interactions are of interest) is used to illustrate the practical usefulness of the method. This work is supported in part by Syngenta Seeds.

About the Speaker
Hieu Pham is a Ph.D. student in the Industrial and Manufacturing Systems Engineering department at Iowa State University. He received his master’s and bachelor’s degrees in pure mathematics from Kansas State University and Tennessee Technological University, respectively. His research interests include machine learning, data mining, and sports analytics. He is a proud member of the Vietnamese Student Association and the Asian Student Union.

3004 Black Engineering Bldg.
Iowa State University
Ames, IA 50011

Web: www.imse.iastate.edu

