

Author Biographies

Dr. Tommy L. Binford Jr. holds a B.S. in physics and mathematics (1998) and an M.S. in physics (2000) from Sam Houston State University (Huntsville, TX), where he performed research on low-temperature superconducting materials. He completed a Ph.D. in computational and applied mathematics at Rice University (Houston, TX) in 2011. In 2000, he joined the oil industry to work in research and development. He is Senior Staff Scientist at RD&E of Weatherford International studying, supporting, and developing logging-while-drilling technology. His current interests include resistivity modeling, sensor development, computational electromagnetics, modeling and inversion, numerical optimization, and high-performance computing.

© Springer International Publishing AG 2018
S. Srinivasan (ed.), Guide to Big Data Applications, Studies in Big Data 26, DOI 10.1007/978-3-319-53817-4


Dr. Ann Cavoukian is recognized as one of the world’s leading privacy experts. She is presently the Executive Director of Ryerson University’s Privacy and Big Data Institute. Dr. Cavoukian served an unprecedented three terms as the Information & Privacy Commissioner of Ontario, Canada. There she created Privacy by Design, a framework that seeks to proactively embed privacy into design, thereby achieving the strongest protection possible. In 2010, International Privacy Regulators unanimously passed a Resolution recognizing Privacy by Design as an international standard. Since then, PbD has been translated into 39 languages.

Dr. Cavoukian has received numerous awards recognizing her leadership in privacy, most recently as a Founder of Canada’s Digital Economy.

In her leadership of the Privacy and Big Data Institute at Ryerson University, Dr. Cavoukian is dedicated to demonstrating that privacy can and must be included, side by side, with other functionalities such as security and business interests. Her mantra of “banish zero-sum” enables multiple interests to be served simultaneously, not one to the exclusion of another.

Dr. Jiefu Chen received the B.S. degree in engineering mechanics and the M.S. degree in dynamics and control from Dalian University of Technology, Dalian, China, in 2003 and 2006, respectively, and the Ph.D. degree in electrical engineering from Duke University, Durham, NC, in 2010.

He was with the Department of Electrical and Computer Engineering, Duke University, as a Research Assistant from September 2007 to December 2010. He was a Staff Scientist with Advantage R&D Center, Weatherford International, Houston, TX, from March 2011 to August 2015. Since September 2015, he has been with the University of Houston, Houston, TX, where he is currently an Assistant Professor of Electrical and Computer Engineering. His research interests include computational and applied electromagnetics, multiphysics modeling and inversion, electronic packaging, subsurface wireless communication, and well logging.

Dr. Jonathan H. Chen, M.D., Ph.D., is an Instructor in the Stanford Department of Medicine. After a Ph.D. in Computer Science, he completed training in Internal Medicine and a VA Research Fellowship in Medical Informatics. He continues to practice medicine with research interests in data-mining electronic medical records for insights into medical decision making. With the support of an NIH Big Data 2 Knowledge Career Development Award, he is systematically extracting the collective wisdom of practicing clinicians from electronic health records. This will translate endpoint clinical data into an executable form of expertise in a closed loop learning health system.

Dr. Hongmei Chi is an Associate Professor of Computer & Information Sciences at Florida A&M University. She currently teaches graduate and undergraduate courses in data mining and cyber security and conducts research in the areas of big data and applied security. Dr. Chi has published many articles related to data science, parallel computing, and cyber security research and education. Her web page is www.cis.famu.edu/~hchi.


Michelle Chibba is a Strategic Privacy/Policy Advisor at the Privacy and Big Data Institute, Ryerson University. She was Director, Policy Department and Special Projects at the Office of the Information and Privacy Commissioner of Ontario, Canada (IPC). During her 10-year tenure at the IPC, she was responsible for conducting research and analysis as well as liaising with a wide range of stakeholders to support proactively addressing privacy and technology issues affecting the public, otherwise known as Privacy by Design. Michelle received a master’s degree from Georgetown University (Washington, D.C.), with a focus on ethics and international business. She is a frequent speaker on Privacy by Design and emerging data privacy/technology issues and has written a number of publications on privacy and technology.

Dr. Wenrui Dai received B.S., M.S., and Ph.D. degrees in Electronic Engineering from Shanghai Jiao Tong University (SJTU), Shanghai, China, in 2005, 2008, and 2014, respectively. He is currently a postdoctoral researcher at the Department of Biomedical Informatics, University of California San Diego. His research interests include learning-based image/video coding, image/signal processing and predictive modeling.


Rishi Divate, Co-founder and Vice President of Engineering, MityLytics Inc. Rishi co-founded MityLytics in 2015 and is responsible for the development of the MityLytics product. Prior to MityLytics, Rishi gained over 16 years of product development and professional services experience at companies such as Sumo Logic, HP, ArcSight, Oracle and Spinway. Rishi has led enterprise software and SaaS deployments in the high performance, security, analytics, middleware and database areas for a variety of customers including Fortune 500 and mid-sized companies, startups, universities and government agencies in North America, Europe and Asia. He has been a session speaker at industry conferences such as Oracle Open World and ArcSight Protect. Rishi received an M.S. degree in computer science from the University of Houston and a B.Engg degree in computer engineering from the University of Pune.

Bogdan Gadidov is a Ph.D. candidate at Kennesaw State University. He also completed his Master’s degree in Applied Statistics at Kennesaw State University prior to enrolling in the Ph.D. program. His undergraduate degree is in Industrial Engineering from Georgia Tech. Prior to enrolling in graduate school, he worked for a year as an implementation engineer at Noble Systems, which specializes in telecommunication solutions for call centers. While at Kennesaw State University, he has enjoyed teaching undergraduate courses in algebra and elementary statistics as a graduate teaching assistant. In 2014, he was named a SAS Analytics Student Poster Winner by the SAS Institute at their yearly Analytics conference for presenting on risk model validation following an internship at SunTrust Bank.


Dr. Cuilan Gao is an Assistant Professor of Statistics at the University of Tennessee at Chattanooga (UTC) in the USA. She received her M.S. and Ph.D. in statistics from the University of Mississippi in the USA in 2010. After graduation, she worked as a Postdoctoral Research Associate in the Department of Biostatistics at St. Jude Children’s Research Hospital in Memphis, USA, from 2010 to 2012, where she developed statistical methods and conducted the design and analyses of laboratory-based experiments, including genetics and genomic studies for pediatric brain tumors. Since joining UTC in 2012, she has continued her research in statistical methods for computational biology and the analysis of high dimensional data and large scale data sets. Dr. Gao is also an experienced collaborator and award-winning teacher. She has collaborated widely with researchers in cancer research, public health and computer science. She won the 2016 Alumni Outstanding Teacher award across the University of Tennessee system.

Dr. Gintare Giriuniene, Ph.D. is a lecturer of Business Management Systems at Vilnius University in Lithuania. She was educated at Kaunas University of Technology where she took her first degree in Management followed by a Ph.D. in Economics which she completed in 2014. Her research interests lie in accounting, audit and information systems, and she is an author of five handbooks and more than thirty scientific papers.


Dr. Yueqin Huang received her B.S. degree in electrical and computer engineering from Jimei University, China in 2005, and her M.S. and Ph.D. degrees in electrical engineering from Xiamen University, China in 2007 and 2011, respectively. During her Ph.D. studies, she spent two years as a visiting scholar in the Department of Electrical and Computer Engineering at Duke University, Durham, NC, USA.

Following her graduation, Dr. Huang worked as an Assistant Professor for one year in the Department of Electronic Science at Xiamen University and continued her research in forward and inverse modeling in seismic and electromagnetic wave applications. Since 2015, Dr. Huang has been the owner and a Research Scientist of Cyentech Consulting LLC, Cypress, TX. Her research interests include ground penetrating radar, modeling and inversion of resistivity well logging, and signal processing.

Dr. Dryver Huston has been an engineering faculty member at the University of Vermont since 1987. He has over thirty years of experience in developing and implementing systems for assessing the performance and health of structural systems, along with electromechanical and precision instrument design. Current research projects include developing methods for monitoring and mapping underground utilities, ground penetrating radar methods for detecting buried landmines, developing intrinsic shape sensing networks for inflatable structures, monitoring bridges during accelerated construction, flood scour effect measurements, self-healing wiring systems, soft robotic systems for patient handling and avian lung-based extracorporeal oxygenators. Dr. Huston has a PhD (1986) and MA (1982) from Princeton University in Civil Engineering and a BS (1980) from the University of Pennsylvania in Mechanical Engineering.

Dr. Sowmya S. Iyer, M.D., MPH is a Geriatric Medicine physician at the Palo Alto Veterans Affairs Medical Center and a Clinical Assistant Professor (Affiliated) of Medicine at Stanford University. She earned her undergraduate degree in Music and MD at the University of Louisville. She then completed her Internal Medicine residency at Kaiser Permanente Oakland Medical Center. She completed her Masters in Public Health at the University of California, Berkeley and her clinical fellowship in Geriatric Medicine at Stanford University. Her professional interests include improving dementia care throughout health systems, quality improvement, medical education, and medical journalism.

Dr. Shankar Iyer is a Staff Data Scientist at Quora. He works closely with the company’s Core Product and Quality Team to conduct data analyses that directly inform product decisions. He also leads the Quora Data Science Team’s research efforts. Prior to joining Quora in 2013, Shankar completed a Ph.D. in theoretical condensed matter physics at the California Institute of Technology, where he studied phase transitions in quantum materials.


Pierre Jean has twenty years of oil and gas experience, in roles ranging from research scientist to project manager and Asia business manager. Pierre started his career with two M.Sc. degrees, the first one in theoretical physics and the second in microelectronics. He has worked at developing new oil and gas measurement tools (optics, sonic, mass spectrometer, and high pressure), integrating software and real-time measurement at Daniel Industries, Commissariat à l’Énergie Atomique, Weatherford and Schlumberger. In the last 5 years before creating Antaeus Technologies, Pierre Jean was software business manager at Schlumberger in Asia, where he grew the business from $500K to $8M over 2.5 years with a very limited team, and in 2014 moved to further develop the software business in North America for Schlumberger. This worldwide technical and business experience gave Pierre Jean insight into the market needs in various regions of the world.

Dr. Xiaoqian Jiang is an assistant professor in the Department of Biomedical Informatics, UCSD. He received his PhD in computer science from Carnegie Mellon University. He is an associate editor of BMC Medical Informatics and Decision Making and serves as an editorial board member of the Journal of the American Medical Informatics Association. He works primarily in health data privacy and predictive models in biomedicine. Dr. Jiang is a recipient of an NIH K99/R00 award and he won the distinguished paper award from the AMIA Clinical Research Informatics (CRI) Summit in 2012 and 2013.


Pankush Kalgotra is a doctoral candidate majoring in Management Science and Information Systems at Oklahoma State University (OSU). His research interests include healthcare analytics, network science, the dark side of IT and neuroimaging in Information Systems. He has more than five years of experience with Data Mining, Text Mining, Sentiment Analysis and Big Data Analytics. He is proficient in using and teaching Teradata Aster, a Big Data Platform. He is a SAS® certified Predictive Modeler and received the SAS Student Scholar Award in 2013 and the SAS Student Ambassador Award in 2014. His team won the SAS Shootout competition in 2014. For his teaching effectiveness, he received the Spears School of Business Outstanding Graduate Teaching Associate Award in 2015 and was selected as a teaching mentor by the Institute for Teaching and Learning in 2016. He was also awarded the Distinguished Graduate Fellowship in 2015 and 2016.

Dr. Igor Katin, Ph.D. is a lecturer with the Department of Economic Informatics, Faculty of Economics at Vilnius University, Lithuania. He gained his Ph.D. from the Informatics Engineering Department of Vilnius University, Institute of Mathematics and Informatics. His research and teaching interests include big data analytics, data mining, software systems and modeling, IT technologies, game theory, and local and global optimization.


Samsheel Kumar Kathuri is a Graduate Student in Management Information Systems majoring in Data Analytics at Oklahoma State University. He is an enthusiastic, results-oriented professional, well-versed in analyzing data and implementing high-impact strategies to target new business opportunities. Over the past 5 years he has been deeply involved in Data Analytics, Business Analysis, Business Intelligence and Reporting. He has worked for 4 years at Tata Consultancy Services and over 1 year as a Graduate Research Assistant at Oklahoma State University. Samsheel was one of the overall winners at the 2016 Teradata Analytics Challenge for the work on Health Analytics. He has been offered a position as a Data Science, Senior Consultant at CVS Health. He is looking forward to a learning-oriented career in the field of Health Analytics.

Dr. Michail Kazimianec, Ph.D. is a lecturer with the Department of Economic Informatics, Faculty of Economics at Vilnius University, Lithuania. He received his Ph.D. degree in Computer Science from the Free University of Bozen-Bolzano, Italy. His current research and teaching interests include business intelligence automation technologies and application of predictive analytics as well as of big data analytics in business intelligence.


Mark Kerzner is an experienced/hands-on Big Data architect. He has been developing software for over 20 years in a variety of technologies (enterprise, web, HPC) and for a variety of verticals (healthcare, O&G, legal, financial). He currently focuses on Hadoop, Big Data, NoSQL and Amazon Cloud Services. Mark has been doing Hadoop training for individuals and corporations; his classes are hands-on and draw heavily on his industry experience.

Mark stays active in the Hadoop/Startup communities. He runs the Houston Hadoop Meetup and contributes to a number of Hadoop-based projects.

Dr. Elizabeth Le, M.D. is a practicing academic Hospitalist at the Palo Alto Veterans Affairs Medical Center and a Clinical Assistant Professor (Affiliated) of Medicine at Stanford University. She earned her MD at the University of California, Los Angeles and completed an Internal Medicine Residency at the University of California, San Francisco. Current interests include medical education and training at the residency level. Elizabeth currently lives in the Bay Area with her husband and two rambunctious young children.


Dr. Nan Li is an Associate Professor of the Key Laboratory of Environment Change and Resources Use in Beibu Gulf at Guangxi Teachers Education University. He currently teaches graduate and undergraduate courses in Marine Microbial Ecology and Genome Data Mining and conducts research in the areas of microbial ecology and bioinformatics. Dr. Li has published many articles related to mining information from huge genomic data, evolutionary analysis and microbial diversity. His ResearchGate profile is https://www.researchgate.net/profile/Nan_Li12?ev=hdr_xprf.

Dr. Ron C. Li, M.D., is an internal medicine resident at Stanford. He has interests in applied clinical informatics, with a focus on the implementation science of informatics and digital health tools in healthcare systems. He plans on continuing his training as a clinical informatics fellow to better understand how to study and implement innovations that improve the way clinicians make medical decisions and engage with patients.


Yang Li received his B.S. degree in Information Security from Northwestern Polytechnical University in 2014. He is currently a Ph.D. student at the same university.

His research interests include Natural Language Processing (word embedding, sentiment analysis, topic models) and Deep Learning.

Dr. Yaohang Li is an Associate Professor in the Department of Computer Science at Old Dominion University. He is the recipient of an NSF CAREER Award in 2009. Dr. Li’s research interests are in Computational Biology, Monte Carlo Methods, and Scientific Computing. He received the Ph.D. and M.S. degrees in Computer Science from the Florida State University in 2003 and 2000, respectively. After graduation, he worked at Oak Ridge National Laboratory as a research associate for a short period of time. Before joining ODU, he was an associate professor in the Computer Science Department at North Carolina A&T State University.


Dr. Yu Liang is currently working at the Department of Computer Science and Engineering of the University of Tennessee at Chattanooga as an Associate Professor. His funded research projects cover the following areas: modeling and simulation, high-performance scientific and engineering computing, numerical linear algebra, the processing and analytics of large-scale sensory data, and computational mechanics. His research work has appeared in various prestigious journals, books and book chapters, and refereed conference, workshop, and symposium proceedings. Dr. Liang serves as an editorial board member of the International Journal of Security Technology for Smart Device (IJSTSD), Journal of Mathematical Research and Applications (JMRA), and Current Advances in Mathematics (CAM). Dr. Liang has a PhD in Computer Science (1998) from the Chinese Academy of Sciences, a PhD in Applied Mathematics (2005) from the University of Ulster, and a BS (1990) from Tsinghua University.

Dr. Guirong Liu received his Ph.D. from Tohoku University, Japan in 1991. He was a postdoctoral fellow at Northwestern University, USA from 1991 to 1993. He is currently a Professor and Ohio Eminent Scholar (State Endowed Chair) at the University of Cincinnati. He has authored a large number of journal papers and books including two bestsellers: “Mesh Free Methods: Moving Beyond the Finite Element Method” and “Smoothed Particle Hydrodynamics: A Meshfree Particle Method.” He is the Editor-in-Chief of the International Journal of Computational Methods and Associate Editor of IPSE and MANO. He is the recipient of numerous awards, including the Singapore Defence Technology Prize, NUS Outstanding University Researcher Award and Best Teacher Award, APACM Computational Mechanics Awards, JSME Computational Mechanics Awards, ASME Ted Belytschko Applied Mechanics Award, and Zienkiewicz Medal from APACM. He was listed as a world top 1% most influential scientist (Highly Cited Researchers) by Thomson Reuters in 2014, 2015 and 2016.

Dr. Z. John Ma, P.E., F.A.S.C.E., received his Ph.D. degree in civil engineering from the University of Nebraska-Lincoln in 1998. He currently serves at the University of Tennessee at Knoxville (UTK) as a professor in the College of Engineering’s civil and environmental engineering department. Dr. Ma has conducted research in the area of evaluation of ASR-affected structures, as well as reinforced and prestressed concrete structures, including the investigation of shear behavior of thin-web precast bridge I-girders and the development of connection details and durable closure-pour materials for accelerated bridge construction. He has been awarded the NSF CAREER, ASCE Tennessee Section Outstanding Engineering Educator, ASCE Raymond C. Reese Research Prize, and ASCE T.Y. Lin Awards. He is an Associate Editor for ASCE Journal of Structural Engineering and Journal of Bridge Engineering. He has also served as a member on several professional technical committees within ASCE, ACI, PCI, and TRB.

Dr. Ali Miri has been a Full Professor at the School of Computer Science, Ryerson University, Toronto since 2009. He is the Research Director, Privacy and Big Data Institute, Ryerson University, an Affiliated Scientist at Li Ka Shing Knowledge Institute, St. Michael’s Hospital, and a member of Standards Council of Canada, Big Data Working Group. He has also been with the School of Information Technology and Engineering and the Department of Mathematics and Statistics since 2001, and has held visiting positions at the Fields Institute for Research in Mathematical Sciences, Toronto in 2006, Université de Cergy-Pontoise, France in 2007, and Alicante and Albacete Universities in Spain in 2008. His research interests include cloud computing and big data, computer networks, digital communication, and security and privacy technologies and their applications. He has authored and co-authored more than 200 refereed articles, 6 books, and 6 patents in these fields. Dr. Miri has chaired over a dozen international conferences and workshops, and has served on more than 80 technical program committees. He is a senior member of the IEEE, and a member of the Professional Engineers Ontario.

Bhargav Molaka is a Graduate Student in Management Information Systems with a concentration in Business Analytics at Oklahoma State University. He is also a Statistical Analyst at University Assessment and Testing Center, OSU. Bhargav is a SAS Certified Business Analyst, Base and Advanced Programmer with a specialization in data mining and 4 years of professional experience in Business Intelligence, ETL, and Data Analysis. Bhargav is one of the Overall Winners of Teradata University Network’s Student Analytics Challenge at Teradata Partners Conference, 2016. The award recognized his team’s Big Data project on Health Analytics, which was done using Teradata Aster, App Center. Recently, he has been offered the position of Senior Credit Analyst by Bluestem Brands, Inc.


Dr. Teja Suhas Patil, M.D., is a hospitalist at the VA Palo Alto Health Care System and a clinical instructor of medicine at Stanford University. She has a B.S. in Cell Biology and Biochemistry from the University of California, San Diego and a Masters in Public Health from the University of Michigan, Ann Arbor. Her special interest is in medical education.

Dr. Sharmini Pitter received her Ph.D. in Environmental Science from Stanford University in 2014. During her graduate study, she conducted research in the Department of Environmental Earth System Science in collaboration with the Stanford Archaeology Center. Her dissertation research focused on the link between changes in the paleoenvironment, cultural technology, and agricultural decision-making during the Neolithic period of Turkey. Her research focuses on the connections between variables in complex social and environmental systems. Dr. Pitter is currently Project Coordinator for the FAMU Florida IT Career Alliance (FITC), a program that focuses on recruitment, retention, graduation, and career placement of the next generation of Florida’s technology workforce.


Hoi Ting Poon is a graduate student in Computer Science at Ryerson University, Toronto, where he is currently a Ph.D. candidate. He also holds a degree in Electrical Engineering and has authored various works in areas related to information security. His research interests include information security, cryptography, authentication systems, Cloud computing, searchable encryption, security in embedded systems and applications of homomorphic encryption in addressing security and privacy issues. He is a student member of the IEEE.

Dr. Jennifer Lewis Priestley, Ph.D. is a Professor of Applied Statistics and Data Science at Kennesaw State University, where she is the Director of the Center for Statistics and Analytical Services. She oversees the Ph.D. Program in Advanced Analytics and Data Science, and teaches courses in Applied Statistics at the undergraduate, Masters and Ph.D. levels. In 2012, the SAS Institute recognized Dr. Priestley as the 2012 Distinguished Statistics Professor of the Year. She served as the 2012 and 2015 Chair of the National Analytics Conference. Prior to her career in academia, Dr. Priestley worked for Accenture, Visa EU, MasterCard and AT&T. She has authored articles on Binary Classification, Risk Modeling, Applications of Statistical Methodologies for Problem Solving, and Data Science Education.


Dr. Fatema Rashid is currently working as a visiting faculty member in the Computer Science Department of the Ryerson University School of Continuing Education, Toronto, Canada. She is also working as an Information Management Analyst in an IT oriented firm in Toronto. She completed her Ph.D. at Ryerson University in 2015 with a thesis chiefly focused on Big Data Security in Cloud computing and different strategies to save storage space in clouds. She also completed her MS at Ryerson University in 2009 with a major in Inverse Biometrics for user Authentication.

Sankalp Sah, Founding Engineer, MityLytics Inc. Sankalp was the first engineering hire at MityLytics and is responsible for leading the development of product features for Big Data batch, streaming and query processing technologies. He has over eight years of software development experience in the field of large scale system development and networking. He has worked on products such as Ericsson’s Converged Packet Gateway (CPG), which has been deployed in the world’s first LTE offerings from Verizon, TeliaSonera and MetroPCS, to name a few. Whilst at Ericsson (formerly Redback Networks) he worked on an in-house network processing chipset that featured thousands of cores to enable products to scale from 100 Gbps to 1 Tbps. He has also been involved in the mobile handset market with Samsung Electronics, where he helped commercialize phones for the Japanese, Brazilian and US markets. Sankalp has a Bachelor’s degree from the Indian Institute of Technology (IIT) and a Master’s degree in computer engineering from Texas A&M.


Dr. Mohamed Sayeed (Ph.D. CE North Carolina State University, 2003) is a computational scientist and a faculty associate at Arizona State University. His area of expertise is in Advanced Computing/Cyber Infrastructure hardware and software for scientific computing including high performance computing, high throughput computing, and accelerated, grid and cloud based computing. His research and teaching involve interdisciplinary scientific computing using high performance computing. His current research and tool development efforts aim to lower the barriers to the use of parallel computing, including efforts for automatic parallelization using machine learning techniques. His research interests are parallel computing for big data computation and analytics, dynamical systems modeling, parallel numerical linear algebra, numerical methods, machine learning, inverse modeling and optimization.

Dr. Tavpritesh Sethi (M.B.B.S., Ph.D.) is an Assistant Professor of Computational Biology at Indraprastha Institute of Information Technology (IIIT), Delhi and a Wellcome Trust/DBT India Alliance Early Career Fellow at All India Institute of Medical Sciences (AIIMS), New Delhi, India. He is a clinician and a data scientist. With bridging expertise in medicine and computer science, he works on developing actionable models for healthcare, and his research interests include social networks, machine learning and time series analysis for critical-care and community-health settings. He is a recipient of the MIT-India Young Innovator Award for developing an exquisitely sensitive technology for early detection of small airway disease and the Wellcome Trust/DBT India Alliance Early Career award for supporting his ongoing research on developing machine learning and artificial intelligence models for early detection of sepsis in pediatric and neonatal Intensive Care Units at AIIMS, New Delhi, India.

Dr. Ashfaque B. Shafique (Ph.D. EE Arizona State University, 2016) graduated from Arizona State University with specialization in control theory. His research involves the detection and control of epileptic seizures through the use of control theory, chaos theory and signal processing. Additional interests are in control systems and their deployment through embedded controllers, adaptive control, robust control, system identification and chaos theory. He has worked on various projects involving the application of control theory, namely the design of discrete-time PID controllers through frequency loop-shaping. He has also worked on system identification and control of VTOL model aircraft and model heating problems.

Dr. Ramesh Sharda is the Vice Dean for Research and Graduate Programs, Watson/ConocoPhillips Chair and a Regents Professor of Management Science and Information Systems in the Spears School of Business at Oklahoma State University. He has coauthored two textbooks (Business Intelligence and Analytics: Systems for Decision Support, 10th edition, Prentice Hall and Business Intelligence: A Managerial Perspective on Analytics, 3rd Edition, Prentice Hall). His research has been published in major journals in management science and information systems including Management Science, Operations Research, Information Systems Research, Decision Support Systems, Decision Science Journal, EJIS, JMIS, Interfaces, INFORMS Journal on Computing, ACM Data Base and many others. He is a member of the editorial boards of journals such as Decision Support Systems, Decision Sciences, and Information Systems Frontiers. He is currently serving as the Executive Director of Teradata University Network and received the 2013 INFORMS HG Computing Society Lifetime Service Award.

Manish Singh, Co-founder, CEO and CTO, MityLytics Inc. Manish co-founded MityLytics in 2015 and is responsible for business and technology strategy and execution at MityLytics. Manish has 17 years of product development experience at various Silicon Valley based enterprise software product companies, namely GoGrid (acquired by Datapipe), Netscaler (acquired by Citrix), Ascend (acquired by Lucent) and Redback Networks (acquired by Ericsson). He has developed revenue generating features and products used in datacenters at Google, Amazon and several financial companies. He received an M.S. degree in computer science from the University of Houston and a B.S. degree in computer science from Banaras Hindu University.

Dr. Rimvydas Skyrius, Ph.D., is a Professor and head of the Economic Informatics department at the University of Vilnius, Lithuania. He received his Ph.D. in Operations Research and Computer Applications from ASU-Moscow Institute in 1986, and his Master’s degree from the University of Vilnius in 1978. His principal research areas are IT-based decision support in business and management, business intelligence and management information needs. He has published a monograph and a number of articles and conference papers on the subject, and has co-authored several textbooks in the field.

Dr. Erica Sobel, DO, MPH is a hospital medicine physician at Kaiser Permanente Santa Clara Medical Center. She earned her DO at Touro University California and completed her Internal Medicine Residency at Kaiser Permanente Oakland Medical Center. Erica completed her Master of Public Health at the University of California, Berkeley. Her professional interests include medical education and health policy.

Dr. S. Srinivasan is the Associate Dean for Academic Affairs and Research as well as the Distinguished Professor of Information Systems at the Jesse H. Jones (JHJ) School of Business at Texas Southern University (TSU) in Houston, Texas, USA. He is the Director of Graduate Programs at the JHJ School of Business. Prior to coming to TSU, he was Chairman of the Division of International Business and Technology Studies at Texas A & M International University in Laredo. He spent 23 years at the University of Louisville (UofL) in Kentucky where he started the Information Security Program as a collaborative effort of multiple colleges. He was Director of the InfoSec program until 2010 when he left for Texas. The program was designated a National Center of Academic Excellence in Information Assurance Education by the US National Security Agency and the Department of Homeland Security. He successfully wrote several grant proposals in support of the InfoSec Program. His two books on Cloud Computing are “Security, Trust, and Regulatory Aspects of Cloud Computing in Business Environments” and “Cloud Computing Basics”. His area of research is Information Security. He is the Editor-in-Chief for the Southwestern Business Administration Journal. He has taught Management of Information Systems and Computer Science courses. He spent his sabbatical leaves from UofL at Siemens in their R & D facility in Munich, Germany; UPS Air Group in Louisville, KY; and GE Appliance Park in Louisville, KY. Besides these industry experiences, he has done consulting work for the US Army, IBM and a major hospital company in Louisville, KY. He is currently a Cybersecurity Task Force member of the Greater Houston Partnership.

Dr. Haiyan Tian is an Associate Professor of Mathematics at the University of Southern Mississippi. Her research interests include ordinary and partial differential equations, applied analysis, computational mathematics, numerical analysis, and mathematical modeling. She is also actively involved in math education, and for eight consecutive years she has received funding from the US Department of Education, through Mississippi Institutions of Higher Learning, for hosting the USM Summer Math Institute for mathematics teachers. Her webpage is https://www.usm.edu/math/faculty/haiyan-tian

Dr. Konstantinos Tsakalis (Ph.D. EE University of Southern California, 1988) is currently a Professor and Undergraduate Program Chair of the Department of Electrical, Computer and Energy Engineering at Arizona State University. His expertise is in the theory and applications of control systems, adaptive control, system identification and optimization. He has worked on the integrated system identification and controller design and the implementation of high-performance multivariable controllers for semiconductor manufacturing applications. He has also worked on the application of robust control theory, system identification and optimization principles in various industrial problems in collaboration with Honeywell and EPRI. More recently, his activities include power system and biomedical applications, and in particular, prediction and control of epileptic seizures. His educational objectives are to provide students with an operational understanding and hands-on experience with modern system identification and feedback controller design techniques and implementation of embedded control systems.

Dr. Michael Wang M.D. graduated from Harvard College with a BA in Biochemistry. He then obtained his MD from Loyola University of Chicago Stritch School of Medicine before completing his residency in Internal Medicine at Alameda Health System in Oakland, CA. He is currently a clinical informatics fellow at UCSF with an interest in natural language processing, learning health systems, underserved medicine, and genomics.

Dr. Shuang Wang received the B.S. degree in applied physics and the M.S. degree in biomedical engineering from the Dalian University of Technology, China, and the Ph.D. degree in electrical and computer engineering from the University of Oklahoma, OK, USA, in 2012. He worked as a postdoctoral researcher with the Department of Biomedical Informatics (DBMI), University of California, San Diego (UCSD), CA, USA, from 2012 to 2015. Currently, he is an assistant professor at the DBMI, UCSD. His research interests include machine learning and healthcare data privacy/security. He has published more than 60 journal/conference papers, 1 book and 2 book chapters. He was awarded an NHGRI K99/R00 career grant. Dr. Wang is a senior member of IEEE.

Joe Weinman is the author of the seminal Cloudonomics: The Business Value of Cloud Computing (Wiley, 2012), which remains a top-selling book in Cloud Computing over 4 years after publication, and Digital Disciplines: Attaining Market Leadership via the Cloud, Big Data, Social, Mobile, and the Internet of Things (Wiley CIO, 2015), which was the Amazon #1 Hot New Release in Computers & Technology. These books have been translated into 3 Chinese editions. He is also the contributing editor for Cloud Economics for IEEE Cloud Computing magazine, and has been named a “Top 10 Cloud Computing Leader,” among many other accolades.

Dr. Dalei Wu received the B.S. and M.Eng. degrees in Electrical Engineering from Shandong University, Jinan, China, in 2001 and 2004, respectively, and the Ph.D. degree in Computer Engineering from the University of Nebraska-Lincoln, Lincoln, NE, USA, in 2010. From 2011 to 2014, he was a Postdoctoral Research Associate with the Mechatronics Research Laboratory, Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA. Since August 2014, he has been an Assistant Professor with the Department of Computer Science and Engineering at the University of Tennessee at Chattanooga (UTC). His research interests include intelligent systems, networking, and cyber-physical systems.

Dr. Xuqing Wu earned a B.S. degree in Electrical Engineering from University of Science and Technology Beijing in 1995. He also earned a Master of Science in Mechanical Engineering from the University of Alberta in 1999 and a Master of Science in Computer Science from Carleton University in 2001. Dr. Wu received his Ph.D. degree in Computer Science from the University of Houston in 2011. He worked as a software engineer from 2001 to 2007 and completed a 2-year postdoctoral fellowship at the University of Houston. He was a research and data scientist at the Schlumberger-Doll research center before he joined the University of Houston as an Assistant Professor in the Department of Information & Logistics Technology in 2015. Dr. Wu has been actively involved in the research areas of Big Data, Predictive Modeling and Forecasting, High Performance Computing, Mobile and Cloud Computing, Probabilistic Multi-Physics Modeling, and Scientific Visualization.

Dr. Tao Yang received the M.S. degree in Computer Science and the Ph.D. degree in Automation Control Engineering from Northwestern Polytechnical University, Xi’an, China, in 2009 and 2012 respectively. From 2009 to 2010, he was a visiting Ph.D. student in the Department of Computer Science Engineering, Ohio State University. Before his current position, he worked as a postdoctoral researcher at the Department of Computer Science in Xi’an JiaoTong University.

He is currently an Associate Professor at Northwestern Polytechnical University. His current research interests include data mining methodologies, machine learning algorithms and information security.

His research has been supported by NSFC, the National Aerospace Science Foundation of China, and the Chinese Postdoctoral Science Foundation, among others.

Joe Weinman has served as an executive at Bell Labs, AT&T, and HP, and was most recently Senior Vice President at Telx (recently acquired by Digital Realty). He currently serves on the advisory boards of several technology companies. He has a BS and MS in Computer Science from Cornell University and UW-Madison, respectively, and has completed executive education at the International Institute for Management Development in Lausanne. He has been awarded 22 patents in a variety of technologies such as cloud computing, distributed storage, homomorphic encryption, TCP/IP multicasting, mobile telephony, and pseudoternary line coding.

Dr. Hongkai Xiong received the Ph.D. degree in communication and information system from Shanghai Jiao Tong University (SJTU), Shanghai, China, in 2003. Since then, he has been with the Department of Electronic Engineering, SJTU, where he is currently a full Professor. His research interests include source coding/network information theory, signal processing, computer vision and machine learning. He has published over 170 refereed journal/conference papers. He is the recipient of the Best Student Paper Award at the 2014 IEEE Visual Communication and Image Processing (IEEE VCIP’14), the Best Paper Award at the 2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (IEEE BMSB’13), and the Top 10% Paper Award at the 2011 IEEE International Workshop on Multimedia Signal Processing (IEEE MMSP’11). In 2014, he was also awarded the National Science Fund for Distinguished Young Scholars and recognized as a Shanghai Youth Science and Technology Talent. He has served as a TPC member for prestigious conferences such as ACM Multimedia, ICIP, ICME, and ISCAS. He is a senior member of the IEEE (2010).


Dr. Raimundas Žilinskas, Ph.D. is an Associate Professor with the Department of Economic Informatics, Faculty of Economics at Vilnius University, Lithuania. He gained his Ph.D. degree from Vilnius University. His research and teaching interests include Business Intelligence, Information Systems Strategies, and Early Warning Systems.

Index

A
Accelerated innovation, 7–8
    contest economics, 22–23
    contests and challenges, 22
    machine innovation, 23
Adaptable IO system (ADIOS), 352–353
ADMM. see Alternating direction method of multipliers (ADMM)
Agent Based Models (ABMs), 385–386
ALERT system, 110–111
Alternating direction method of multipliers (ADMM)
    distributed optimization, 56–58
    DLM method, 65–66
    DSVM, 64–65
    federated modeling techniques, 62
    PCA framework, 65
    regression, 63–64
    RNNs, 65
Amazon Web Services (AWS), 136
Amyotrophic Lateral Sclerosis (ALS), 431–432
Analogy precision, 97
23andMe, 11
Antaeus platform, 196–198
Antiepileptic drugs (AEDs), 334–335
Anti-Money Laundering (AML), 482–483
Apache Spark, 112
Apple products, 12, 196, 425, 426
Apple’s HealthKit, 425–426
Application programming interface (API), 337–338
Approximate entropy, 393
AROCK, 74
AsySCD algorithm, 75
AsySPCD algorithm, 75
ATP-binding cassette (ABC) systems, 120
Attractors, 341
Azimuthal resistivity LWD tools
    deterministic inversion method, 164–166, 170
    HMC, 168
    inverted formation resistivities, 171
    MapReduce, 169
    measured vs. synthetic data, 171, 173
    measurement curves, 163
    real field data, 171, 172
    statistical inversion method, 166–168, 170
    structure and schematics, 163
    three-layer model, 169–170

B
Banking
    BDA, 463
        audio analytics, 473
        banking supervision, 468–471
        CEP, 472
        customers, 466–468
        data collection, 460–462
        data integration and consolidation issues, 462
        expected benefits, 464
        fraud detection, 478–483
        operations, 468
        quality challenges, 462–463
        risks, 465–466
        robust analytics platform, 475–478
        social media analytics, 473
        text analytics, 472–473
        tools, 471
        tradeoffs, 459–460
        uneven expected business value, 471
        video analytics, 473
        visualization, 474
    implications, 484–485
    information activities
        analytical activities, 458–459
        corporate culture, 457
        definitions, 455
        drivers, 456–457
        factors, 455–456
        transaction processing systems, 454

Bank of Austria, 467
Basic local alignment search tool (BLAST), 119
Basin of attraction, 341
Batch event processing, 129
Bayesian graphical model, 154–157
Bayesian inversion accuracy
    graphical model, 154–157
    measurement errors, 153–154
    mixture model, 157–159
Bayesian Networks (BN), 395
Bayes learning method, 68–69
Behavioral intervention technology (BIT) programs, 436
Big Data analytics (BDA), 463
    audio analytics, 473
    banking supervision
        deposit insurance, 469
        efficiency and stability, 468
        EWS, 469–470
        features, 468–469
        financial crisis, 470–471
        resources, 469
        supervision authorities, 469–470
    CEP, 472
    customers, 466–468
    data collection, 460–462
    data integration and consolidation issues, 462
    expected benefits, 464
    fraud detection
        AML, 482–483
        credit card holders, 480–481
        cross-coverage test, 482
        data resources, 479
        fraud investigation and evaluation process, 478–479
        fraud patterns, 479–480
        fraudulent activities, 480
        high-risk companies, 483
        low-risk companies, 483
        pattern borders, 481
        pre-processing tier, 479
        real time prediction, 482
        track factors, 480
    high performance computing (see High performance computing)
    oil industry
        eventual consistency, 201–202
        fault tolerance, 202–203
        planning storage, 198–200
    operations, 468
    quality challenges, 462–463
    risk management, 465–466
    robust analytics platform
        Apache Hadoop, 475–476
        Apache Mahout, 476
        Apache Spark, 477
        attributes, 475
        enterprise data hub, 476–478
    social media analytics, 473
    text analytics, 472–473
    tools, 471
    tradeoffs, 459–460
    uneven expected business value, 471
    video analytics, 473
    visualization, 474
Big Data as a Service (BDaaS), 132
BLAST. see Basic local alignment search tool (BLAST)
BM. see Boltzmann machine (BM)
Boltzmann machine (BM), 89
Border-setting algorithm, 481
BRAF Val600 mutations, 443
BRCA1 mutations, 443
BRCA2 mutations, 443
Browser, 187–188
BuildTree procedure, 72
Butterfly effect, 342, 343
Byte-level deduplication. see Content aware data deduplication methods

C
Care Coordination/Home Telehealth (CCHT) program, 423
Cassandra database, 130, 138, 140, 142, 192, 202, 204
CIDPCA algorithm. see Covariance-free iterative distributed principal component analysis (CIDPCA) algorithm
Cinematch algorithm, 22
Civil infrastructure serviceability evaluation
    Bayesian network, 322–323
    cloud service platform, 299
    data management civil infrastructure, 298–299
    global structural integrity analysis
        big-data and inverse analysis, 315, 317
        computer analysis, 318–319
        data query, 317
        historical measured response frequency, 317, 318
        integrity level assessment, 319
        theoretical response frequency, 317, 318
    localized critical component reliability analysis
        deep learning technique, 320–321
        infrastructure for, 320
        probe prolongation strategies, 312–322
    mobile computing, 304–305
    MS-SHM-Hadoop (see Multi-scale structural health monitoring system based on Hadoop Ecosystem (MS-SHM-Hadoop))
    nationwide civil infrastructure survey (see Nationwide civil infrastructure survey)
    neural network based techniques, 298
    supervised and unsupervised learning techniques, 298
    WSN, 298, 305

Client side deduplication, 249
Cloud-based hardware deployment, 131–132
Cloud computing, 6, 7, 187, 494, 502
Cloud storage services
    data deduplication
        client side deduplication, 249
        content aware, 248
        hash-based data deduplication methods, 248
        HyperFactor, 248
        inline data deduplication, 249
        level of deduplication, 248–249
        post-processing deduplication, 249
        secure image deduplication scheme (see Secure image deduplication scheme)
        secure video deduplication scheme (see Secure video deduplication scheme)
        server-side deduplication, 249
        single-user vs cross-user deduplication, 250
    data privacy, 247
CNN. see Convolutional neural network (CNN)
Code of Fair Information Practices (FIPs), 38
Collective intimacy, 7
    high-level architecture, 18
    recommendation engine, 19–20
    sentiment analysis, 20
    target segments, 19
    upsell/cross-sell, 19
Comorbidity
    diseases in TUD and non-TUD patients, 407, 409–411
    hospital visits in TUD and non-TUD patients, 407, 410–413
    prevalence of diseases in TUD and non-TUD patients, 407, 410–411
    three hospital visits with non-TUD patients, 408, 410–414
Complex event processing (CEP), 472
Compressed sensing (CS), 380
Connected Cardiac Care Program (CCCP), 423
Content aware data deduplication methods, 248
Content based media search, 290–293
Content marketing, 500–502
Convolutional neural network (CNN), 93
Corporate/business strategies
    customer relationships, 4
    innovation process, 5
    processes, 4
    products and services, 4
Covariance-free iterative distributed principal component analysis (CIDPCA) algorithm, 70
Cox proportional hazard model, 61–62
Curse of dimensionality, 84
Customer intimacy, 5–6
Customer segmentation, 467

D
Data deduplication
    client side deduplication, 249
    content aware, 248
    hash-based data deduplication methods, 248
    HyperFactor, 248
    inline data deduplication, 249
    level of deduplication, 248–249
    post-processing deduplication, 249
    secure image deduplication scheme (see Image deduplication scheme)
    secure video deduplication scheme (see Video deduplication scheme)
    server-side deduplication, 249
    single-user vs cross-user deduplication, 250


Data ingestion cluster, 136
Data-science roadmap
    ABMs, 385–386
    capture reliably
        biomedical data, 376
        data quality and standards, 377
        data sparsity, 377–379
        feature selection, 378–380
        Green Button approach, 382
        mHealth leverages mobile devices, 376–377
        physiological precision, 381
        state-of-the-art approach, 380
        Stratified Medicine, 381–382
    challenges, 374–375
    enable decisions, 394
        Bayesian Networks, 395
        predictive modeling, 396
        reproducibility, 396
        SAFE-ICU, 396–397
    Eric Topol’s vision, 374–376
    information theory, 385
    networks medicine, 383–384
    phenotypic and physiological levels
        cellular population, 388–389
        heart rate variability, 389–393
        pre-disease states, 388
        principal axes of variation, 389
    POSEIDON study, 386–388
    transcriptomics, proteomics and metabolomics, 374–375
DataSpark, 10
Data stores, 135
DCD-Lasso algorithm, 63
Decentralized architectures, 55
Decentralized linearized ADMM (DLM), 65–66
Deep brain stimulation (DBS), 336, 365
Deep learning models
    localized critical component reliability analysis, 320–321
    nationwide civil infrastructure survey, 312–313
Degree-based friendship paradoxes, 208
Delta-differencing deduplication. see Content aware data deduplication methods
Deterministic inversion method, 164–166
Detrended fluctuation analysis (DFA), 393
Deutsche Bank, 456
Dexcom Share2 app, 426
Diabetes mellitus, 411, 413
Digital disciplines
    accelerated innovation, 7–8 (see also Accelerated innovation)
    collective intimacy, 7 (see also Collective intimacy)
    information excellence, 6 (see also Information excellence)
    solution leadership, 6–7 (see also Solution leadership)
Disney MagicBands, 16
Distributed recursive least-squares (D-RLS) algorithm, 64
D-Lasso algorithm, 63
Document embedding, 99–100
Domain specific languages (DSL), 338
Downvoting, strong paradox
    anti-correlation, 226, 227
    complementary cumulative distributions, 226–228
    content-contribution paradox, 230–234
    core questions, 223–224
    definition, 222
    “downvotee → downvoter” questions, 224, 225, 228, 229
    “downvoter → downvotee” questions, 224, 225, 228
    joint distribution, 226
    non-anonymous answers, 228, 229
    undownvoted downvoters, 226, 227
    vs. upvoting, 223
DQP-Lasso algorithm, 63

E
Early warning systems (EWS), 469–470
Earth Science Data and Information System (ESDIS) Project, 122
eBird project, 122
Edge computing (EC), 300, 302
Elastic block storage (EBS), 137
Electroencephalogram (EEG), 336
Electronic health record (EHR)
    data-science roadmap, 376–377, 382
    patient-physician relationship, 422, 424–426
Electronic medical records (EMRs)
    data-science roadmap, 376
    TUDs (see Tobacco use disorder (TUD))
Email filtering system, 282–285
Environmental datasets, 112, 122
Environmental microbiology, 117–118
    big data analysis, 119–120
    genome dataset, 118–119
Epilepsy
    AEDs, 334–335
    big data problem, 355–356
    closed-loop control, 336–337
    control efficacy experiment, 360–362
    DBS, 336
    EEG, 336
    electrical stimulation, 356–357, 365–366
    functional models, 359
    incidence rates, 334
    Kantz algorithm, 347, 357–360
    long-standing clinical practice, 335
    mortality rates, 362–363
    open-loop control, 336
    PTE, 356
    real-time signals, 357
    seizures, 335, 358–359
    spatial synchronization, 359
    Sprague Dawley rats, 366–367
    STLmax algorithm, 336
    VNS, 335
European Banking Authority, 470
Eventual consistency, 201–202
EXpectation Propagation LOgistic REgRession (EXPLORER) model, 60
Experience Economy framework, 16

F
Facebook, 208, 209
Fault tolerance, 202–203
Feed forward neural network, 87–88
FIPs. see Code of Fair Information Practices (FIPs)
Fisher’s exact test, 73
Florida hurricane datasets, 116–117
    data analysis, 114–116
    dataset, 113–114
Ford Fusion’s EcoGuide SmartGauge, 14
Friendship paradox
    degree-based friendship paradoxes, 208
    Facebook, 208, 209
    Feld’s mathematical argument, 212–213
    generalized friendship paradoxes, 209
    immunization strategies design, 208
    marketing approaches, 207
    psychological consequences, 207
    Quora Follow Network (see Quora Follow Network)
    random wiring, 214
    strong paradox (see Strong paradox)
    Twitter, 208, 209
    weak paradox, 209
        generalized paradoxes, 215
        in undirected networks, 214–215
Fuzzy searches, 294

G
Gastro-esophageal reflux disease, 413
Gaussian mixture model, 157–159
GDPR. see General Data Protection Regulation (GDPR)
GE Flight Quest, 5, 22
GE GEnx jet engine, 6–7, 15
General Data Protection Regulation (GDPR), 31, 32
Generalized friendship paradoxes, 209
GeoFit
    architecture, 189–190
    main screen, 194
    properties, 190–192
    workflow engine, 192–193
GeoSphere, 164
Geosteering, 162–164, 166
Global structural integrity analysis
    big-data and inverse analysis, 315, 317
    computer analysis, 318–319
    data query, 317
    historical measured response frequency, 317, 318
    integrity level assessment, 319
    theoretical response frequency, 317, 318

Google, 6, 16
Google DeepMind’s AlphaGo, 21, 23

Grand Rounds Quality Algorithm, 419–420
Grid binary LOgistic REgression (GLORE) framework, 58, 60
GuideWave Azimuthal resistivity tool, 162–163

H
Hadoop, 198–199, 475–478
Hamiltonian Monte Carlo (HMC), 168–169
Hash-based data deduplication methods, 248
Hash stamping, 204
HBase, 204
HDFC Bank, 467
Healthcare Cost and Utilization Project (HCUP), 50
Health Grades model, 420–421
Health Insurance Portability and Accountability Act (HIPAA), 50, 428–429
Heart rate variability (HRV)
    ANS, 389
    ECG, 390–391
    frequency domain analysis, 392–393
    nonlinear analyses, 393
    time domain analysis, 390, 392
Hierarchical Log BiLinear (HLBL) model, 95


Hierarchical neural language model (HNLM), 94–95
High performance computing (HPC), 337
    advanced hardware, 144
    data pipeline
        defining, 128–130
        designing, 145–146
    deployments, 130–133
    hardware considerations, 136–141
    intelligent software, 144
    on-premise hardware configuration, 142
    performance management, 144
    scaling up, 143, 147
    SDI, 142–143
    software considerations, 133–136
HMC. see Hamiltonian Monte Carlo (HMC)
HNLM. see Hierarchical neural language model (HNLM)
Homomorphic encryption, 292–294
Hosmer and Lemeshow (H-L) test, 54
Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS), 420–421
Hurricane Frances, 9
HyperFactor, 248
Hypertension, 409, 411–414

I
IDC Big Data White paper, 485
Ideal ecosystem, for oilfield actors, 182–183
Identity-based encryption (IBE) scheme, 283–285
Idiomaticity analysis, 98–99
Image deduplication scheme
    deduplication analysis, 256–259
    experimental settings, 255–256
    image compression, 252–253
    image hashing, 254–255
    partial image encryption, 253–254
    performance analysis, 259–260
    security analysis, 260–261
Information excellence, 6
    digital-physical substitution and fusion, 10–11
    dynamic, networked and virtual corporations, 12
    exhaust-data monetization, 11
    governmental and societal objectives, 12
    high-level architecture for, 9
    long-term process improvement, 10
    resource optimization, 8–10
Inline data deduplication, 249
Integrated disciplines, 24–25
International Mobile Equipment Identity (IMEI), 33
International Working Group on Data Protection in Telecommunications (IWGDPT 2004), 33
Internet of Things, 25
Inverse problems, 151–152, 166, 172
Inverse theory, 151

J
Jenkins, 204

K
Kafka nodes, 141
Kelly bushing (KB), 196
Keyword based media search, 289–290
K-means clustering, 69

L
Language modeling, 83
Latent Dirichlet allocation (LDA), 86
Latent semantic analysis (LSA), 86
LDA. see Latent Dirichlet allocation (LDA)
Levenberg-Marquardt algorithm (LMA), 165
LMA. see Levenberg-Marquardt algorithm (LMA)
Localized critical component reliability analysis
    deep learning technique, 320–321
    infrastructure for, 320
    probe prolongation strategies, 312–322
Low cost subscription model, 184–185
LSA. see Latent semantic analysis (LSA)
Lyapunov exponents
    epileptic animal EEG, 346–347
    Kantz algorithm, 349–350
    linearized approximation, 345–346
    maximum Lyapunov exponent (Lmax), 346
    parallel computation, 353–355
    Rosenstein algorithm, 349–350
    Wolf algorithm, 347–348

M
Machine translation, 99
MapReduce, 112, 152, 159, 168, 169, 502
Marketing
    audience targeting
        cloud computing, 494
        products/services, 493–494
        social media, 492–493
        Spotify, 494–495
        US population, 493
        visitor experience, 494
    forecasting
        Barnett’s description, 496
        Bureau of Labor Statistics, 498
        demand for, 495–496
        Hadoop, 497
        home appliances, 496–497
        McKinsey Global Reports, 498
        regression analysis and curve smoothing, 495
        Spark, 497
    MTA, 490–492
    predictive analytics, 498–500
    weaving Big Data, 502–503
Marketing analytics, 3
Markov chain Monte Carlo (MCMC) method, 158, 159, 161, 167, 168
Matrix-vector recursive neural network (MV-RNN), 92
Max-miner algorithm, 481
McDonald, 24
MCMC method. see Markov chain Monte Carlo (MCMC) method
Media
    content based media search, 290–293
    keyword based media search, 289–290

Media Access Control (MAC), 33
Melvin program, 23
Message passing interface (MPI), 337
Metropolis-Hastings algorithm, 159
Micro-electromechanical systems (MEMS), 109
Missing completely at random (MCAR), 378, 379
MityLytics, 144, 145, 147
Mobile health (mHealth)
    CCHT program, 423
    jurisdiction and liability, 429–430
    mobile technologies, 422
    Partners Healthcare, 423–424
    from patients, 424–427
    regulation of, 429
    RPM, 422–423
    VHA, 423
Modified Saffir-Simpson wind scale, 114
Monotone pattern, 378–379
Multidimensional and time-variant (MDTV) data, 406
Multi-dimensional scaling (MDS), 378–380, 382

Multi-scale structural health monitoring system based on Hadoop Ecosystem (MS-SHM-Hadoop)
    civil infrastructure performance evaluation, 300
    civil infrastructures construction methods, impact evaluation, 300
    cutting-edge technologies, 300
    data fetching and processing, 300
    features, 299–300
    flowchart of, 303–304
    functions, 298
    infrastructure of, 301–302
    multi-scale structural dynamic modeling and simulation, 300
    performance indicators determination, 300
    pipeline safety information, 298
    research samples screening, 300
    sensory data, 299
    supporting information systems, 299
Multi-touch attribution (MTA), 490–492
Multiview LSA (MVLSA), 86
MV-RNN. see Matrix-vector recursive neural network (MV-RNN)
MyChart app, 426

N
Naive Bayes classifier, 68–69
Named entity recognition, 98
National Bridge Inventory Database, 308
National Center for Biotechnology Information (NCBI) Genome database, 118–119
National Institutes of Health (NIH), 442, 445
Nationwide civil infrastructure survey
    data management, 314
    dimensionality reduction, 315, 316
    features, 306–308
    imputation, 314
    life-expectancy estimation
        champion model selection, 313
        deep learning models, 312–313
        Markov chain models, 310, 312
        neural networks, 312
        statistical analysis, 309–310
        Weibull linear regression model, 310, 311
    National Bridge Inventory Database, 308
    variable transformation techniques, 314
Netflix, 4, 7, 18–20, 22, 24, 129
    Prize dataset, 275
Neural mass model, 336


Neural network language model (NNLM), 84–85
Neural networks, 396
Newman configuration model, 214
Newton-Raphson method
    Cox proportional hazard model, 61–62
    distributed optimization, 56
    EXPLORER, 60
    federated modeling techniques, 58, 59
    generalized linear models, 58
    GLORE framework, 58, 60
    SMAC-GLORE, 61
    VERTIGO, 61
    WebDISCO, 62
    WebGLORE, 60
NGLY-1 deficiency, 432–433
“n-gram” model, 83
Nike+ ecosystem, 15
NNLM. see Neural network language model (NNLM)
Nonlinear systems
    batch processing, 339
    cardiovascular applications, 363
    challenges, 338
    chaos theory
        dense periodic orbits, 343–344
        dynamical system, 340
        logistic map, 343–344
        Lorenz system, 344–345
        phase space, 341, 344
        random/stochastic systems, 345
        real-world systems, 340
        sensitive dependence, 342
        sensitivity to initial conditions, 342–343
        state space, 341
        topological mixing, 343
    Dryad tool, 339
    epilepsy
        AEDs, 334–335
        closed-loop control, 336–337
        control efficacy experiment, 360–362
        DBS, 336
        EEG, 336
        electrical stimulation, 356–357, 365–366
        functional models, 359
        incidence rates, 334
        Kantz algorithm, 347, 357–360
        long-standing clinical practice, 335
        mortality rates, 362–363
        open-loop control, 336
        PTE, 356
        real-time signals, 357
        seizures, 335, 358–359
        spatial synchronization, 359
        Sprague Dawley rats, 366–367
        STLmax algorithm, 336
        VNS, 335
    HPCmatlab
        API, 351
        big data, 351
        DCS, 350
        parallel computing, 352–356
        POSIX threads and MPI, 350
    Lyapunov exponents
        epileptic animal EEG, 346–347
        Kantz algorithm, 349–350
        linearized approximation, 345–346
        maximum Lyapunov exponent (Lmax), 346
        Rosenstein algorithm, 349–350
        Wolf algorithm, 347–348
    Map-Reduce, 339
    parallel computing, 337–338
    stream processing, 339
Non-negative sparse coding (NNSC), 96
Non-negative sparse embedding (NNSE), 96
Null-model analysis, 232

O
Office of the National Coordinator for Health Information Technology (ONC), 442–443
Oilfield Big Data
    azimuthal resistivity LWD tools
        deterministic inversion method, 164–166, 170
        HMC, 168
        inverted formation resistivities, 171
        MapReduce, 169
        measured vs. synthetic data, 171, 173
        measurement curves, 163
        real field data, 171, 172
        statistical inversion method, 166–168, 170
        structure and schematics, 163
        three-layer model, 169–170
    Bayesian inversion accuracy
        graphical model, 154–157
        measurement errors, 153–154
        mixture model, 157–159
    petrophysics
        Antaeus platform, 196–198
        cloud computing, 189–195
        eventual consistency, 201–202
        fault tolerance, 202–203
        implementation planning, 198–200
        PC-based application, 188–189
        project structure, 195
        timestamping, 196, 204
Omni-channel marketing, 11
One-hot embedding, 83
On-premise deployments, 132
On-Road Integrated Optimization and Navigation (ORION), 9
Open Government Initiative, 111
Operational excellence, 5
Opower, 14
Order preserving encryption (OPE), 288–289, 291–292

P

Partitioning around Medoids (PAM), 382
Part-of-speech tagging, 98
PASSCoDe-Atomic, 75
PASSCoDe-Lock, 75
Patient-generated health data (PGHD)
    Apple’s HealthKit, 425–426
    Dexcom Share2 app, 426
    eClinicalWorks, 425
    ecosystem-enabling platforms, 424–425
    EHRs, 425
    health-related data, 424
    Microsoft Health and Google Fit, 427
    MyChart app, 426
Patient safety indicators (PSI), 421
PatientsLikeMe, 431–432
Payment card industry (PCI), 503
PbD. see Privacy by Design (PbD)
pbdR, 112
PCA. see Principal component analysis (PCA)
PCORnet, 382
Perplexity, 97
Personally identifiable information (PII), 31, 503
Petrophysical software platform
    collaboration, 181–185
    components, 179
    cost, 179–181
    knowledge, 185–186
Phrase embedding, 99
Phrase searches, 294
Phylogenetic analysis, 120–121
Physician
    clinical decision support
        challenges, 441–442
        data driven approach, 439–441
        fever, back pain, and nausea, 436–438
        probabilistic systems, 438–439
        rule-based approach, 439
        treatment, 442–445
    patient–physician relationship
        accessibility, 427–428
        Ginger.io, 435
        hospital quality, examination, 420–421
        Iodine.com, 433–434
        logistics, 427
        mHealth (see Mobile health (mHealth))
        Omada health, 435–436
        online communities, 431–433
        patient engagement, 430–431
        patient history, 422
        privacy and security, 428–429
        quality care, 418–419
        “quality verified” physician, identifying, 419–420
Platform as a Service (PaaS), 131
Poincaré-Bendixson theorem, 344
Post-processing deduplication, 249
Post-traumatic epilepsy (PTE), 356
Powell’s algorithm, 68
Precision agriculture, 111
Precision Medicine Initiative (PMI), 442–445
Predictive analytics, 15, 498–500
Prevalence of Symptoms on a Single Indian Healthcare Day on a Nationwide Scale (POSEIDON) study, 386–388
PriceWaterhouseCoopers (PwC), 464
Principal component analysis (PCA), 65, 70, 95, 378–380
Privacy by Design (PbD)
    Big Data challenges
        antithesis of data minimization, 35–36
        correlation versus causation, 36–37
        lack of transparency/accountability, 37–38
        outsourcing, 34
        public health authorities, 33
        security challenges, 34
    customer trust, 39
    FIPs, 38
    7 Foundational principles
        default/data minimization, 40, 42–43
        embedded in design, 40, 43–44
        positive-sum manner, 40, 44–45
        proactive and preventative, 40, 41
        respect and user-centric, 40
        security, 40
        visibility and transparency, 40
    information privacy
        aggregation, 32
        confidential, 32
        contextual integrity, 31
        GDPR, 31, 32
        informational self-determination, 30
        metadata, 32–33
        NIST definition, 31
        PII, 31
        pseudonymization, 31
        safekeeping/security, 30

Privacy-preserving federated data analysis
    ADMM
        distributed optimization, 56–58
        DLM method, 65–66
        DSVM, 64–65
        federated modeling techniques, 62
        PCA framework, 65
        regression, 63–64
        RNNs, 65
    architectures
        decentralized, 55
        server/client, 53–55
    asynchronous optimization
        coordinate gradient descent, 75
        fixed-point algorithms, 74
        spoke-hub architecture, 76
    horizontally and vertically partitioned data, 51, 52
    Newton-Raphson method
        Cox proportional hazard model, 61–62
        distributed optimization, 56
        EXPLORER, 60
        federated modeling techniques, 58, 59
        generalized linear models, 58
        GLORE framework, 58, 60
        SMAC-GLORE, 61
        VERTIGO, 61
        WebDISCO, 62
        WebGLORE, 60
    patient-level data, 51
    secure protocols, 53
    SMC
        CIDPCA algorithm, 70
        ID3 decision tree, 71–72
        K-means clustering, 69
        Naïve Bayes classifier, 68–69
        PCA algorithm, 70
        RDT framework, 72
        regression, 66–68
        S2-MLR and S2-MC, 70
        sorting algorithms, 72–73
        spoke-hub and peer-to-peer architectures, 66, 67
        SVM model, 70–71
Privacy-preserving support vector machine (PP-SVMV), 70–71
Privacy-protected recommender system, 294
Proactive geosteering. see Geosteering
Product as a Product (PaaP), 179
Product leadership, 5–6
Proportional-integral (PI) controller, 336
Protected health information (PHI), 428

Q

Quantified self movement, 107
Quick serve restaurants
    failure rate, 507
    social media data, 505–506
    source of employment, 506–507
    Yelp reviews
        analysis, 516–517
        correlations, 513
        fast food experiences, 507
        non-franchise and franchise locations, 509–512
        numeric ratings, 508, 513
        R programming language (see Yelp API)
        U.S. locations, 513–514
        word clouds, 515–516
Quora Follow Network
    goal, 209
    strong paradox
        core questions, 217
        definition, 209
        in downvoting, 222–234
        strong degree-based paradoxes, 211, 215–221
        strong generalized paradoxes, 211, 215–216
        in undirected networks, 214–215
        in upvoting, 235–242

R

Radial basis function (RBF) kernels, 70, 71
Random decision tree (RDT) framework, 72
Randomized controlled trials (RCTs), 382
Randomized singular value decomposition (R3SVD), 315
Rank search, 294
RapidMiner, 111
RBM. see Restricted Boltzmann machine (RBM)
RDA method. see Regularized dual averaging (RDA) method
Real-time process, 8–10, 135
Recurrent neural network, 90–91
Recursive neural network, 91–92
Recursive neural tensor network (RNTN), 92

Regularized dual averaging (RDA) method, 97
Reinforcement learning, 385–386
Remote patient monitoring (RPM), 422–423
Remote sensing data, 111
Recurrent neural networks (RNNs), 65
Respiratory Sinus Arrhythmia (RSA), 393
Restricted Boltzmann machine (RBM), 89–90
Royal Bank of Canada (RBC), 467
R packages, 112

S

Sample entropy, 393
SDC. see Software defined compute (SDC)
SDI. see Software defined infrastructure (SDI)
SDN. see Software defined networking (SDN)
SDS. see Software defined storage (SDS)
Searchable encryption schemes
    categories, 277
    data owner, 276
    fuzzy searches, 294
    homomorphic encryption, 292–294
    media
        content based media search, 290–293
        keyword based media search, 289–290
    phrase searches, 294
    privacy-protected recommender system, 294
    rank search, 294
    storage provider, 276
    symmetric encryption, 277
    text processing systems (see Text processing systems)
    users, 276
Secure browser platform, 188
Secure multiparty computation (SMC)
    CIDPCA algorithm, 70
    ID3 decision tree, 71–72
    K-means clustering, 69
    Naïve Bayes classifier, 68–69
    PCA algorithm, 70
    RDT framework, 72
    regression, 66–68
    S2-MLR and S2-MC, 70
    sorting algorithms, 72–73
    spoke-hub and peer-to-peer architectures, 66, 67
    SVM model, 70–71
Secure two-party multivariate classification (S2-MC), 70
Secure two-party multivariate linear regression (S2-MLR), 70
Semantic analysis, 98
Sensor networks
    biocomplexity mapping, 110
    flood detection, 110–111
    forest fire detection, 109
    precision agriculture, 110–111

Sentiment analytics, 466
Sentiment classification precision, 97–98
Sepsis Advanced Forecasting Engine for ICUs (SAFE-ICU) Initiative, 396–397
Sequence analysis (SA), 406–408
Server/client architecture, 53–55
Server-side deduplication, 249
Shannon entropy, 393
Short term maximum Lyapunov exponent (STLmax) algorithm, 336
Signal reconstruction, 380
Single instance storage (SIS), 249
Single-user vs. cross-user deduplication, 250
SMC. see Secure Multiparty Computation (SMC)
Software defined compute (SDC), 143
Software defined infrastructure (SDI), 142–143
Software defined networking (SDN), 143
Software defined storage (SDS), 143
Solution leadership
    cable company, 15–16
    connected products and services, 17
    customer-centered data integration, 17
    customers’ financial health, 17
    digital-physical mirroring, 13
    Experience Economy framework, 16
    experiences, 16
    long-term product improvement, 15–16
    predictive analytics and maintenance, 15
    product-service system solutions, 15
    product/service usage optimization, 14
    real-time product/service optimization, 14
    transformations, 16
Spark nodes, 141
Sparse coding approach (SPA), 84–85, 95–97
SPIHT algorithm, 252–253
Statistical inversion method, 166–168
Stratified medicine, 381–382
Streaming event processing, 129
Strong degree-based paradoxes
    anatomy of, 219–221
    in directed networks, 215–216
    typical values of degree, 217–218
    typical values of differences in degree, 218
Strong paradox
    core questions, 217
    definition, 209
    in downvoting, 222–234
    strong degree-based paradoxes, 211, 215–221
    strong generalized paradoxes, 211, 215–216
    in undirected networks, 214–215
    in upvoting, 235–242
Support vector machine (SVM), 70–71, 396
Syntax analysis, 98

T

Text processing systems
    order preserving encryption, 288–289
    private/private search scheme
        Bloom filters, 280–281
        encrypted indexes, 278–279
        flow diagram, 278
    private/public search scheme
        advantage, 282
        Bloom filter based scheme, 281
        email filtering system, 282–285
        flow diagram, 281–282
    public/public search scheme
        asymmetric scheme, 287–288
        flow diagram, 285
        symmetric scheme, 286–287
Textual entailment, 99
Thermodynamic entropy, 385
Time stamping, 204
Tobacco use disorder (TUD)
    data preparation
        analysis flowchart, 404–405
        non-TUD patients, 405–406
        sequence analysis, 406–408
    time-based comorbidities
        diseases in TUD and non-TUD patients, 407, 409–411
        hospital visits in TUD and non-TUD patients, 407, 410–413
        prevalence of diseases in TUD and non-TUD patients, 407, 410–411
        three hospital visits with non-TUD patients, 408, 410–414
Topical word embedding (TWE) model, 89
Transcranial magnetic stimulation (TMS), 335
TRUSTe’s Consumer Privacy Confidence Index 2016, 37
Twitter, 208, 209

U

Unscented Kalman filter, 337
Unsupervised clustering methods, 381
Upvoting, strong paradox
    content dynamics, 235–237
    core questions, 237–239
    definition, 222
    NetworkX Python package, 238
    order-of-magnitude, 239
    potential impacts, 241–242
    practical consequences, 240
    upvoted answers, 239

V

Vagus nerve stimulation (VNS), 335
Value disciplines
    customer intimacy, 5–6
    operational excellence, 5
    product leadership, 5–6
Vector space model (VSM), 85–86
Vertical grid logistic regression (VERTIGO), 61
Veterans Health Administration (VHA), 423
Video deduplication scheme
    experimental results, 267–270
    flow diagram, 262
    H.264 video compression scheme, 262–264
    partial convergent encryption scheme, 265–267
    security analysis, 270–271
    unique signature generation scheme, 264–265
ViziTrak, 164
VSM. see Vector space model (VSM)

W

Weak paradox, 209
    generalized paradoxes, 215
    in undirected networks, 214–215
WebDISCO, 62
Weibull linear regression model, 310, 311
Welch’s test, 73
Well integrity, 152, 160
Wireless sensor network (WSN), 109–110, 298, 305
Withings Smart Body Analyzer, 15
Word embedding
    applications, 98–100
    evaluations, 97–98
    goal, 84
    LDA, 86
    LSA, 86
    models, 100–101
    NNLM, 86–95
    SPA, 95–97
    VSM, 85–86
Word representation, 84
Wrapper-based approach, 380
WSN. see Wireless sensor network (WSN)

Y

Yelp API
    account creation, 517
    business/restaurant of interest, 519–520
    consumer key and secret, 518
    HTML source code, 521–522
    registers, 518
    SAS v9.4, 522
    search string, 518–519
    “snippet_text” parameter, 521
    stopwords, 520–521
    token and token secret, 518
Yelp reviews
    analysis, 516–517
    correlations, 513
    fast food experiences, 507
    non-franchise and franchise locations, 509–512
    numeric ratings, 508, 513
    R programming language
        account creation, 517
        business/restaurant of interest, 519–520
        consumer key and secret, 518
        HTML source code, 521–522
        registers, 518
        SAS v9.4, 522
        search string, 518–519
        “snippet_text” parameter, 521
        stopwords, 520–521
        token and token secret, 518
    U.S. locations, 513–514
    word clouds, 515–516
Yelpurl, 518–519

Z

Ziff-Davis white paper study, 460, 475, 485