Presentation on the new beta release of the Hortonworks Data Platform for Windows
© Hortonworks Inc. 2013
Hadoop for Windows
Terry Padgett
Solution Engineer, Hortonworks
28 February 2013
Why Apache Hadoop on Windows?
• About 70% of the Earth’s surface is covered by water… so we build ships
• About 72% of servers run Microsoft Windows… so we build Hadoop for Windows
What’s in the box?
Initial Hadoop Core Apache JIRA:
https://issues.apache.org/jira/browse/HADOOP-8079
Apache Hadoop branch-1-win github:
https://github.com/apache/hadoop-common/tree/branch-1-win
Apache Hadoop branch-trunk-win github:
https://github.com/apache/hadoop-common/tree/branch-trunk-win
Component   Version   Patches
Hadoop      1.0.3     119
Pig         0.9.3     19
Hive        0.9.0     12
HCatalog    0.4.1     2
WebHCat     0.1.4     None
Sqoop       1.4.2     None
Oozie       3.2.0     None
What’s changed?
- Command-line scripts for the Hadoop surface area
- HDFS permissions model mapped to Windows
- Resolved issues with path semantics between Java and Windows
- Native Task Controller for Windows
- Implementation of a Block Placement Policy to support cloud environments, more specifically Azure
- Implementation of Hadoop native libraries for Windows (compression codecs, native I/O)
- Resolved several reliability issues
- Several new unit test cases written for the above changes
What do you get?
• More deployment choices
• Hadoop for Windows is for on-premise deployment
  – Good fit for organizations with Hadoop operational experience
  – Next step for those who are ready to move from POC to production
• Use HDInsight for public and private cloud deployments
  – HDInsight Service -> Windows Azure
    – Available in Preview today
  – HDInsight Server -> for interoperability across platforms of Hadoop, with Microsoft tools, on premise
    – Developer Preview available today
• Full interoperability across platforms
• Created through partnership between Hortonworks and Microsoft
  – Eighteen months of development time
Installation Tidbits
• Prereqs
  – Microsoft Visual C++ 2010 Redistributable Package (64 bit)
  – Microsoft .NET Framework 4.0
  – JDK 6u31 or higher
    – For the love of God do not install in a directory path containing a space
  – Python 2.7.3
    – Add the installation directory to path
  – Hive metastore
    – Embedded Derby is provided
    – Alternately, using SQL Server requires table and user set up and SQL Server JDBC Driver
  – Server time must be in sync
  – Enable remote scripting
  – Ports
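Two of the prerequisites above (no spaces in the JDK path, Python's installation directory on the path) are easy to check before running the installer. The following is a minimal sketch, not part of the HDP installer; the function names are hypothetical, and it assumes the JDK location is whatever path you pass in and that the path variable uses the Windows ";" separator.

```python
def path_has_space(path):
    """Return True if the given install path contains a space character."""
    return " " in path


def dir_on_path(directory, path_value, sep=";"):
    """Return True if directory is one of the entries in a PATH-style string.

    Comparison ignores case and trailing slashes, as Windows paths do.
    """
    def norm(p):
        return p.strip().rstrip("\\/").lower()

    target = norm(directory)
    entries = [e for e in path_value.split(sep) if e.strip()]
    return any(norm(entry) == target for entry in entries)
```

For example, path_has_space(r"C:\Program Files\Java\jdk1.6.0_31") is true, so that location would be a poor choice for the JDK, while dir_on_path(r"C:\Python27", os.environ["PATH"]) reports whether the Python directory has been added to the path.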
Installation
• MSI installer executed on each host
• Cluster configuration file assigns node roles

# Sample clusterproperties.txt file

#Log directory
HDP_LOG_DIR=d:\hadoop\logs

#Data directory
HDP_DATA_DIR=d:\hdp\data

#Hosts
NAMENODE_HOST=NAMENODE_MASTER.acme.com
SECONDARY_NAMENODE_HOST=SECONDARY_NAMENODE_MASTER.acme.com
JOBTRACKER_HOST=JOBTRACKER_MASTER.acme.com
HIVE_SERVER_HOST=HIVE_SERVER_MASTER.acme.com
OOZIE_SERVER_HOST=OOZIE_SERVER_MASTER.acme.com
TEMPLETON_HOST=TEMPLETON_MASTER.acme.com
SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com

#Database host
DB_FLAVOR=derby
DB_HOSTNAME=DB_myHostName

#Hive properties
HIVE_DB_NAME=hive
HIVE_DB_USERNAME=hive
HIVE_DB_PASSWORD=hive

#Oozie properties
OOZIE_DB_NAME=oozie
OOZIE_DB_USERNAME=oozie
OOZIE_DB_PASSWORD=oozie
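Because the same configuration file is used on every host, it can be worth sanity-checking it before running the MSI. The sketch below reads a file in the format of the sample above and lists the slave hosts. It is an illustration only: the function names are hypothetical, and it assumes the file is plain KEY=VALUE lines with "#" comments, which is how the sample looks but is not a documented description of how the installer itself parses it.

```python
# Excerpt of the sample file above, used here only as test input.
SAMPLE = r"""
#Log directory
HDP_LOG_DIR=d:\hadoop\logs

#Hosts
NAMENODE_HOST=NAMENODE_MASTER.acme.com
SLAVE_HOSTS=slave1.acme.com, slave2.acme.com, slave3.acme.com

#Database host
DB_FLAVOR=derby
"""


def parse_cluster_properties(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected KEY=VALUE, got: %r" % line)
        props[key.strip()] = value.strip()
    return props


def slave_hosts(props):
    """Split the comma-separated SLAVE_HOSTS value into a list of host names."""
    raw = props.get("SLAVE_HOSTS", "")
    return [host.strip() for host in raw.split(",") if host.strip()]


if __name__ == "__main__":
    props = parse_cluster_properties(SAMPLE)
    print(props["NAMENODE_HOST"])
    print(slave_hosts(props))
```

The value is split on the first "=" only, so a value that itself contains "=" is kept intact.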
Cluster Management
Start/Stop via Services Administration Tool
CLI Consistency
Our Old Friends, Still Here
What’s next
• HDP for Windows will be GA in Q2
• Eventual alignment with other Hortonworks distributions
• Contact Microsoft for HDInsight information
Questions?