Oozie is a Scheduler for Apache Hadoop jobs.
Oozie Evolution: Gateway to Hadoop Eco-System
Mohammad Islam
Agenda
• What is Oozie?
• What is in the Next Release?
• Challenges
• Future Work
• Q & A
Oozie
Oozie in Hadoop Eco-System
[Diagram: Oozie alongside the Hadoop eco-system stack: HDFS, Map-Reduce, and HCatalog at the base, with Pig, Sqoop, and Hive on top.]
Oozie : The Conductor
A Workflow Engine
• Oozie executes workflows defined as a DAG of jobs
• Supported job types include Map-Reduce, Pig, Hive, any script, custom Java code, etc.
[Diagram: example workflow DAG with start, an M/R job, an M/R streaming job, a decision node (MORE loops back, ENOUGH continues), a fork into a Pig job and an M/R job, a join, a Java FS job, and end.]
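As a rough illustration of the fork/join semantics above, here is a minimal Python sketch; the node names mirror the example DAG, and the branch actions are stand-ins rather than real Hadoop jobs:

```python
from concurrent.futures import ThreadPoolExecutor

def run_workflow():
    """Execute the example DAG: a fork runs two branches in parallel,
    and the join waits for both before the final actions."""
    done = ["start", "M/R job"]          # sequential prefix
    # fork: both branches may run concurrently
    with ThreadPoolExecutor(max_workers=2) as pool:
        branches = [pool.submit(lambda name=name: name)
                    for name in ("M/R streaming job", "Pig job")]
        # join: wait for every branch to finish before continuing
        done.extend(sorted(f.result() for f in branches))
    done += ["join", "Java FS job", "end"]
    return done
```

The key property the sketch shows is that nothing after the join runs until every forked branch has completed.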
A Scheduler
• Oozie executes workflows based on:
  – Time dependency (frequency)
  – Data dependency
[Diagram: an Oozie Client talks to the Oozie Server through the WS API; inside the server, the Oozie Coordinator checks data availability and triggers Oozie Workflows, which run jobs on Hadoop.]
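The coordinator's trigger condition can be sketched as follows; `should_trigger` and `data_available` are illustrative names, with `data_available` standing in for the real HDFS/HCatalog availability check:

```python
def should_trigger(now, nominal_time, data_available):
    """Materialize a workflow run only when the scheduled (nominal)
    time has passed AND the input data for that instance exists."""
    return now >= nominal_time and data_available
```

Both conditions must hold: a job whose time has come still waits if its input dataset has not appeared.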
REST-API for Hadoop Components
• Direct access to Hadoop components
  – Emulates the command line through a REST API
• Supported products:
  – Pig
  – Map-Reduce
Three Questions… Do You Need Oozie?
Q1: Do you have multiple jobs with dependencies?
Q2: Does your job start based on time or data availability?
Q3: Do you need monitoring and operational support for your jobs?
If any one of your answers is YES, then you should consider Oozie!
What Oozie is NOT
• Oozie is not a resource scheduler
• Oozie is not for off-grid scheduling
  o Note: off-grid execution is possible through the SSH action.
• Even if you only submit jobs occasionally, Oozie is an option.
  o Oozie provides REST-API-based submission.
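A sketch of what REST-based submission looks like; the host, port, and property values below are illustrative placeholders, and the request is only built, not sent:

```python
def build_submit_request(oozie_url, app_path, name_node, job_tracker):
    """Build the URL and XML configuration for submitting a workflow
    through Oozie's web-service API (request is built, not sent)."""
    props = {
        "oozie.wf.application.path": app_path,  # where workflow.xml lives
        "nameNode": name_node,
        "jobTracker": job_tracker,
    }
    body = "<configuration>" + "".join(
        f"<property><name>{k}</name><value>{v}</value></property>"
        for k, v in props.items()
    ) + "</configuration>"
    # action=start asks the server to start the job right after creating it
    return f"{oozie_url}/v1/jobs?action=start", body
```

The returned URL and XML body would be POSTed to the Oozie server by any HTTP client; no command-line tools are needed on the submitting machine.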
Oozie in Apache
Main Contributors
Oozie Usages
• Y! internal usage:
  – Total number of users: 375
  – Total number of processed jobs ≈ 750K/month
• External downloads:
  – 2500+ in the last year from GitHub
  – A large number of downloads via 3rd-party packaging
Oozie Usages Contd.
• User community:
  – Membership
    • Y! internal: 286
    • External: 163
  – Messages (approximate)
    • Y! internal: 7/day
    • External: 8/day
Next Release …
• Integration with Hadoop 0.23
• HCatalog integration – Non-polling approach
Usability
• Script action
• DistCp action
• Suspend action
• Mini-Oozie for CI
  – Like mini-cluster
• Support for multiple versions
  – Pig, DistCp, Hive, etc.
Reliability
• Auto-retry at the workflow action level
• High availability
  – Hot-warm, through ZooKeeper
Manageability
• Email action
• Query Pig stats / Hadoop counters
  – Runtime control of a workflow based on stats
  – Application-level control using the stats
Challenges : Queue Starvation
• Which queue?
  – Not a Hadoop queue issue
  – Oozie's internal queue for processing Oozie sub-tasks
  – Oozie's main execution engine
• User problem:
  – Killing or suspending a job takes a very long time
Challenges : Queue Starvation
Technical problem:
• Before execution, every task acquires a lock on its job id.
• Special high-priority tasks (such as Kill or Suspend) cannot acquire the lock and therefore starve.
[Diagram: a J1 task holds the lock while the high-priority J1(H) waits in the queue behind other J1 and J2 tasks: starvation for the high-priority task.]
Challenges : Queue Starvation
Resolution:
• Add the high-priority task to both the interrupt list and the normal queue.
• Before de-queuing, check whether the interrupt list holds a task for the same job id; if it does, execute that task first.
[Diagram: the worker finds J1(H) in the interrupt list and executes it ahead of the normal J1 and J2 tasks in the queue.]
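The interrupt-list resolution can be sketched in Python; the class and method names here are hypothetical illustrations, not Oozie's actual internals:

```python
from collections import deque, defaultdict

class TaskQueue:
    """Sketch of the fix: a high-priority task (kill/suspend) is placed in
    both the normal queue and a per-job interrupt list; the worker drains
    the interrupt list for a job id before running a normal task."""

    def __init__(self):
        self.queue = deque()                   # normal FIFO of (job_id, task)
        self.interrupts = defaultdict(deque)   # job_id -> high-priority tasks
        self._done = set()                     # interrupts already executed

    def enqueue(self, job_id, task, high_priority=False):
        # a high-priority task goes into BOTH structures
        if high_priority:
            self.interrupts[job_id].append(task)
        self.queue.append((job_id, task))

    def dequeue(self):
        while self.queue:
            job_id, task = self.queue.popleft()
            if task in self._done:             # already ran via interrupt list
                self._done.discard(task)
                continue
            if self.interrupts[job_id]:        # pending interrupt for this job
                intr = self.interrupts[job_id].popleft()
                self._done.add(intr)           # skip its queue copy later
                self.queue.appendleft((job_id, task))  # defer the normal task
                return job_id, intr
            return job_id, task
        return None
```

With this structure, a kill or suspend no longer waits behind the job's normal sub-tasks: it runs as soon as any task for that job id reaches the head of the queue.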
Oozie Futures
• Easy adoption
  – Modeling tool
  – IDE integration
  – Modular configurations
• Job notification through JMS
• Event-based data processing
• Prioritization
  – By user and at system level
Take Away ..
• Oozie is
  – In Apache!
  – Reliable and feature-rich
  – Growing fast
Who needs Oozie?
• Multiple jobs with sequential/conditional/parallel dependencies
• Need to run a job/workflow periodically
• Need to launch a job when data is available
• Operational requirements:
  – Easy monitoring
  – Reprocessing
  – Catch-up
Challenges: Queue Starvation
Problem:
• Consider a queue with tasks of type T1 and T2, with max concurrency = 2.
• An over-provisioned task (marked in red), whose type is already running at its limit, is pushed back to the head of the queue.
• At high load, it is penalized in favor of later-arriving tasks of the same type: starvation!
[Diagram: with C(T1) = 2 and C(T2) = 1, the T1 at the head of the queue cannot execute and is repeatedly pushed back to the front.]
Challenges : Queue Starvation
Resolution:
• Before de-queuing any task, check its type's concurrency.
• If the limit would be violated, skip it and take the next task.
[Diagram: T2 is en-queued and de-queued past the waiting T1 tasks; once a T1 slot frees up, T1 executes normally.]
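A minimal sketch of the skip-on-concurrency de-queue described above; `running` and `limits` are illustrative stand-ins for Oozie's internal counters, and tasks are represented by their type name:

```python
from collections import deque

def dequeue_next(queue, running, limits):
    """Return the first task whose type has a free concurrency slot;
    tasks at their limit are skipped, not pushed back to block the head,
    and skipped tasks keep their relative order in the queue."""
    skipped = []
    result = None
    while queue:
        task = queue.popleft()
        if running.get(task, 0) < limits.get(task, 1):
            result = task            # free slot: this task runs now
            break
        skipped.append(task)         # over the limit: skip past it
    for t in reversed(skipped):      # restore skipped tasks to the front
        queue.appendleft(t)
    return result
```

Because blocked tasks are skipped rather than re-queued at the head, a T2 behind two waiting T1 tasks can still make progress, which removes the starvation shown on the previous slide.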