Hadoop has proven to be an invaluable tool for many companies over the past few years. Yet it has its ways, and knowing them up front can save valuable time. This session is a rundown of the ever-recurring lessons learned from running various Hadoop clusters in production since version 0.15. What to expect from Hadoop, and what not? How to integrate Hadoop into existing infrastructure? Which data formats to use? What compression? Small files vs big files? Append or not? Essential configuration and operations tips. What about querying all the data? The project, the community, and pointers to interesting projects that complement the Hadoop experience.
Hadoop
lessons learned
@tcurdt
github.com/tcurdt
yourdailygeekery.com
Data
hiring
Agenda
· hadoop? really? cloud?
· integration
· mapreduce
· operations
· community and outlook
Why Hadoop?
“It is a new and improved version of enterprise tape drive”
20 machines
20 files, 1.5 GB each
grep “needle” file
hadoop job grep.jar
[bar chart: wall-clock seconds for grep vs. the hadoop job, axis 0 to 70]
unfair
Map Reduce
Run your own?
http://bit.ly/elastic-mr-pig
Integration
black box

Engineers

· hadoop-cat
· hadoop-grep
· hadoop-range --prefix /logs --from 2012-05-15 --until 2012-05-22 --postfix /*play*.seq | xargs hadoop jar
· streaming jobs

Non-Engineering Folks

· mount hdfs
· pig / hive
· data dumps
Map Reduce
[diagram: HDFS files feed the InputFormat, which produces Splits; each Split runs through Map, Combiner, and Sort; the Partitioner assigns keys to reducers, Copy and Merge brings the sorted map outputs together, the Reducer runs, and the OutputFormat writes the result. Combiners may run again while merging.]
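As a sketch (not the talk's code), a driver that wires up each of those stages; PlayMapper and PlayReducer are hypothetical identity stand-ins, and the input is assumed to be a SequenceFile of LongWritable pairs:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;

public class Driver {

  // identity stand-ins; a real job would do its work here
  public static class PlayMapper
      extends Mapper<LongWritable, LongWritable, LongWritable, LongWritable> {}

  public static class PlayReducer
      extends Reducer<LongWritable, LongWritable, LongWritable, LongWritable> {}

  public static void main(String[] args) throws Exception {
    Job job = new Job(new Configuration(), "plays");
    job.setJarByClass(Driver.class);

    job.setInputFormatClass(SequenceFileInputFormat.class);  // InputFormat -> splits
    FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS files in
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS files out

    job.setMapperClass(PlayMapper.class);            // one map task per split
    job.setCombinerClass(PlayReducer.class);         // map-side mini-reduce
    job.setPartitionerClass(HashPartitioner.class);  // key -> reducer (the default)
    job.setReducerClass(PlayReducer.class);          // runs after copy/merge/sort

    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(LongWritable.class);

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}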
MAPREDUCE-346 (since 2009)
12/05/25 01:27:38 INFO mapred.JobClient: Reduce input records=106
..
12/05/25 01:27:38 INFO mapred.JobClient: Combine output records=409
12/05/25 01:27:38 INFO mapred.JobClient: Map input records=112705844
12/05/25 01:27:38 INFO mapred.JobClient: Reduce output records=4
12/05/25 01:27:38 INFO mapred.JobClient: Combine input records=64842079
..
12/05/25 01:27:38 INFO mapred.JobClient: Map output records=64841776
map in      : 112705844 *********************************
map out     :  64841776 *****************
combine in  :  64842079 *****************
combine out :       409 |
reduce in   :       106 |
reduce out  :         4 |
Job Counters
map in      : 20000 **************
map out     : 40000 ******************************
combine in  : 40000 ******************************
combine out : 10001 ********
reduce in   : 10001 ********
reduce out  : 10001 ********
Job Counters
mapred.reduce.tasks = 0
Map-only
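The same effect via the Job API (a sketch, assuming a job object built as in the driver above):

// zero reducers: map output goes straight to the OutputFormat,
// no partitioning, sorting, or shuffling
job.setNumReduceTasks(0);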
public class EofSafeSequenceFileInputFormat<K,V>
    extends SequenceFileInputFormat<K,V> {
  ...
}
public class EofSafeRecordReader<K,V> extends RecordReader<K,V> {
  ...
  public boolean nextKeyValue() throws IOException, InterruptedException {
    try {
      return this.delegate.nextKeyValue();
    } catch (EOFException e) {
      return false;
    }
  }
  ...
}
EOF on append
Serialization

before: ASN.1, custom Java serialization, Thrift
now: protobuf
public static class Play extends CustomWritable {

  public final LongWritable time = new LongWritable();
  public final LongWritable owner_id = new LongWritable();
  public final LongWritable track_id = new LongWritable();

  public Play() {
    fields = new WritableComparable[] { owner_id, track_id, time };
  }
}
Custom Writables
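The CustomWritable base class is not on the slide; a plausible sketch (an assumption, not the speaker's actual code) just walks the fields array for (de)serialization:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;

public abstract class CustomWritable implements Writable {

  protected WritableComparable[] fields;

  public void write(DataOutput out) throws IOException {
    for (WritableComparable field : fields) {
      field.write(out);      // serialize fields in declared order
    }
  }

  public void readFields(DataInput in) throws IOException {
    for (WritableComparable field : fields) {
      field.readFields(in);  // reuse field instances, no allocation per record
    }
  }
}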
BytesWritable bytes = new BytesWritable();
...
// careful: getBytes() returns the reused internal buffer, which is
// usually larger than the valid data; check getLength() before use
byte[] buffer = bytes.getBytes();
Fear the State
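If the bytes need to outlive the current call (for example when buffering values), a sketch of the defensive copy:

import java.util.Arrays;

// copy only the valid prefix; holding on to getBytes() directly keeps
// a reference to a buffer the framework overwrites on the next record
byte[] copy = Arrays.copyOf(bytes.getBytes(), bytes.getLength());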
public void reduce(
    LongTriple key, Iterable<LongWritable> values, Context ctx) {
  for (LongWritable v : values) { }
  // the second pass sees nothing: values can only be iterated once
  for (LongWritable v : values) { }
}
public void reduce(
    LongTriple key, Iterable<LongWritable> values, Context ctx) {
  buffer.clear();
  for (LongWritable v : values) {
    buffer.add(v);
  }
  for (LongWritable v : buffer.values()) {
  }
}
Re-Iterate
HADOOP-5266 (applied to 0.21.0)
long min = 1;
long max = 10000000;

FastBitSet set = new FastBitSet(min, max);

for (long i = min; i < max; i++) {
  set.set(i);
}
BitSets
org.apache.lucene.util.*BitSet
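A minimal sketch with Lucene's OpenBitSet, one of the classes the line above refers to; the numbers mirror the FastBitSet example:

import org.apache.lucene.util.OpenBitSet;

// one bit per id: roughly 1.2 MB for 10M ids instead of a
// HashSet<Long> weighing hundreds of MB
OpenBitSet set = new OpenBitSet(10000000L);
for (long i = 1; i < 10000000L; i++) {
  set.set(i);
}
long distinct = set.cardinality();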
Data Structures
http://bit.ly/data-structures
http://bit.ly/bloom-filters
http://bit.ly/stream-lib
General Tips
· test on small datasets, test on your machine
· many reducers
· always consider a combiner and a partitioner (see the sketch after this list)
· pig / streaming for one-time jobs, java / scala for recurring jobs
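A sketch of what a custom partitioner can look like (a hypothetical OwnerPartitioner, not from the talk), routing all records of one owner to the same reducer:

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Partitioner;

public class OwnerPartitioner
    extends Partitioner<LongWritable, LongWritable> {

  @Override
  public int getPartition(
      LongWritable key, LongWritable value, int numPartitions) {
    // mask the sign bit so a negative key can never yield a
    // negative partition index
    return (int) ((key.get() & Long.MAX_VALUE) % numPartitions);
  }
}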
http://bit.ly/map-reduce-book
Operations
pdsh -w "hdd[001-019]" \
  "sudo sv restart /etc/sv/hadoop-tasktracker"
runit / init.d
pdsh / dsh
use chef / puppet
Hardware
· 2x name nodes raid 1
· 12 cores, 48GB RAM, xfs, 2x1TB
· n x data nodes no raid
· 12 cores, 16GB RAM, xfs, 4x2TB
Monitoring

dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=...

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=...

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=...

rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=...

# ignore
ugi.class=org.apache.hadoop.metrics.spi.NullContext
Monitoring
[graph: total capacity vs. capacity used]
Compression
[chart: # of 64MB blocks, # of bytes needed, # of bytes used, # of bytes reclaimed]
bzip2 / gzip / lzo / snappy

io.seqfile.compression.type = BLOCK
io.seqfile.compression.blocksize = 512000
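For example, writing a block-compressed SequenceFile with snappy might look like this (a sketch; the path and key/value types are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.compress.SnappyCodec;

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);

// BLOCK compression batches many records per compressed chunk,
// which compresses far better than per-record compression
SequenceFile.Writer writer = SequenceFile.createWriter(
    fs, conf, new Path("/logs/plays.seq"),
    LongWritable.class, LongWritable.class,
    SequenceFile.CompressionType.BLOCK, new SnappyCodec());

writer.append(new LongWritable(1L), new LongWritable(2L));
writer.close();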
Janitor
hadoop-expire -url namenode.here -path /tmp -mtime 7d -delete
The last block of an HDFS file only occupies the required space. So a 4k file only consumes 4k on disk.
-- Owen
BUSTED
find /var/log/hadoop \
  \( -wholename "/var/log/hadoop/hadoop-*" \
  -o -wholename "/var/log/hadoop/job_*.xml" \
  -o -wholename "/var/log/hadoop/history/*" \
  -o -wholename "/var/log/hadoop/history/.*.crc" \
  -o -wholename "/var/log/hadoop/history/done/*" \
  -o -wholename "/var/log/hadoop/history/done/.*.crc" \
  -o -wholename "/var/log/hadoop/userlogs/attempt_*" \) \
  -daystart \
  -mtime +7 \
  -delete
Logfiles
Limits
limits.conf

hdfs   hard nofile 128000
hdfs   soft nofile 64000
mapred hard nofile 128000
mapred soft nofile 64000

sysctl.conf

fs.file-max = 128000
Localhost
before

127.0.0.1 localhost localhost.localdomain
127.0.1.1 hdd01

hadoop

127.0.0.1 localhost localhost.localdomain
127.0.1.1 hdd01.some.net hdd01
Rackaware
<property>
  <name>topology.script.file.name</name>
  <value>/path/to/script/location-from-ip</value>
  <final>true</final>
</property>
#!/usr/bin/ruby
location = {
  'hdd001.some.net' => '/ams/1',
  '10.20.2.1'       => '/ams/1',
  'hdd002.some.net' => '/ams/2',
  '10.20.2.2'       => '/ams/2',
}
# map each argument (hostname or ip) to its rack location
puts ARGV.map { |ip| location[ip] || '/default-rack' }.join(' ')
site config
topology script
for f in `hadoop fsck / | grep "Replica placement policy is violated" | awk -F: '{print $1}' | sort | uniq | head -n1000`; do
  hadoop fs -setrep -w 4 $f
  hadoop fs -setrep 3 $f
done
Fix the Policy
hadoop fsck / -openforwrite -files \
  | grep -i "OPENFORWRITE: MISSING 1 blocks of total size" \
  | awk '{print $1}' \
  | xargs -L 1 -i hadoop dfs -mv {} /lost+notfound
Fsck
Community
[chart: traffic on the hadoop mailing lists over time]
* from markmail.org
Community
The Enterprise Effect
“The Community Effect” (in 2011)
Community
[charts: traffic on the mapreduce and core mailing lists over time]
* from markmail.org
The Future
real time
incremental
flexible pipelines
refined API
refined implementation
Real Time Datamining and Aggregation at Scale (Ted Dunning)
Eventually Consistent Data Structures (Sean Cribbs)
Real-time Analytics with HBase (Alex Baranau)
Profiling and performance-tuning your Hadoop pipelines (Aaron Beppu)
From Batch to Realtime with Hadoop (Lars George)
Event-Stream Processing with Kafka (Tim Lossen)
Real-/Neartime analysis with Hadoop & VoltDB (Ralf Neeb)
Takeaways
· use hadoop only if you must
· really understand the pipeline
· unbox the black box
@tcurdt
github.com/tcurdt
yourdailygeekery.com
That’s it folks!