Apache Oozie Workflow Scheduler for Hadoop. Overview: Oozie is a workflow scheduler system to manage Apache Hadoop jobs. Oozie Workflow jobs are Directed Acyclical Graphs (DAGs) of actions.
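The DAG idea above can be illustrated with a minimal sketch. Oozie itself defines workflows in XML, so this Python is only an analogy, and the action names (`ingest`, `clean`, `aggregate`, `report`) and the `execution_order` helper are hypothetical. It shows why the graph has to be acyclic: only then is there an order in which every action runs after the actions it depends on.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical workflow: each action maps to the set of actions it depends on.
workflow = {
    "ingest": set(),
    "clean": {"ingest"},
    "aggregate": {"clean"},
    "report": {"aggregate", "clean"},
}


def execution_order(dag):
    """Return one valid run order, or raise ValueError if the graph has a cycle."""
    try:
        return list(TopologicalSorter(dag).static_order())
    except CycleError as exc:
        # The second element of args holds the nodes that form the cycle.
        raise ValueError(f"not a DAG, cycle found: {exc.args[1]}") from exc


order = execution_order(workflow)
print(order)
```

A graph in which two actions depend on each other has no valid order, so `execution_order` rejects it with a `ValueError` instead of returning a schedule.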
                      …       …2.x    HBase 1.3.x   HBase 2.0.x
Hadoop 2.0.x-alpha    X       X       X             X
Hadoop 2.1.0-beta     X       X       X             X
Hadoop 2.2.0          NT      X       X             X
Hadoop 2.3.x          NT      X       X             X
Hadoop 2.4.x          S       S       S             X
Hadoop 2.5.x          S       S       S             X
Hadoop 2.6.0          X       X       X             X
Hadoop 2.6.1          NT      S       S             S
Hadoop 2.7.0          X       X       X             X
Hadoop 2.7.1          NT      …
An EC2 instance that runs Hadoop map and reduce tasks and stores data using the Hadoop Distributed File System (HDFS). Core nodes are managed by the master node, which assigns Hadoop tasks to nodes and monitors their status.
for the open source Apache Hadoop software package. Originally developed by companies like Yahoo and Facebook, Hadoop is now used widely in finance, telecom, media, and government verticals. Cloudera builds upon Hadoop to create a unique data