1. Oozie

   - A workflow scheduler system for running Apache Hadoop jobs.

   - A workflow job is a DAG (Directed Acyclic Graph) of actions.



   1) workflow : what to do (a DAG of actions)

   2) coordinator : when to do it (starts workflows on a schedule or when input data becomes available)

   3) bundle : which coordinators to manage together as a group
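To make the workflow/coordinator split concrete, here is a minimal sketch that writes a coordinator definition running one workflow once a day. The file name, the application name, the start and end dates, and the schema version are placeholder assumptions; the app-path reuses the /tmp/mapreduce location from the example section.

```shell
# Sketch only: name, dates, and schema version are placeholder assumptions.
# The quoted 'EOF' keeps the shell from expanding Oozie's ${...} expressions.
cat > coordinator.xml <<'EOF'
<coordinator-app name="daily-coord" frequency="${coord:days(1)}"
                 start="2017-01-01T00:00Z" end="2017-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <action>
    <!-- "what to do": the workflow application this coordinator starts -->
    <workflow>
      <app-path>${nameNode}/tmp/mapreduce</app-path>
    </workflow>
  </action>
</coordinator-app>
EOF
```

The frequency, start, and end attributes carry the "when"; the workflow element only points at the "what".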



2. Oozie expression language (EL)

   1) basic EL constants

       - KB : 1024 (1 kilobyte)

       - MB : 1024 * KB (1 megabyte)

       - GB : 1024 * MB (1 gigabyte)

       - TB : 1024 * GB (1 terabyte)

       - PB : 1024 * TB (1 petabyte)

 

   2) basic EL functions

       - String timestamp() : returns the current UTC date and time in ISO 8601 format

       - String trim(String s) : returns the string with leading and trailing whitespace removed


   3) workflow EL functions

       - String wf:id() : returns the workflow job id

       - String wf:name() : returns the workflow application name

       - String wf:lastErrorNode() : returns the name of the last workflow action node that ended in error

       - String wf:errorMessage(String node) : returns the error message of the given action node
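These functions are typically combined in the message of a kill node. The sketch below writes such a fragment to a file; the node name "fail" and the file name are assumptions, and the fragment would have to sit inside a workflow-app element to be usable.

```shell
# Sketch only: node name "fail" and file name are assumptions.
# The quoted 'EOF' keeps the shell from expanding Oozie's ${...} expressions.
cat > kill-node.xml <<'EOF'
<kill name="fail">
  <message>${wf:name()} (${wf:id()}) failed at [${wf:lastErrorNode()}]: ${wf:errorMessage(wf:lastErrorNode())}</message>
</kill>
EOF
```

Passing wf:lastErrorNode() to wf:errorMessage() reports the error of whichever action failed last, so the same kill node can serve every action in the workflow.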


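   4) Hadoop EL constants (counter names, written in upper case)

       - RECORDS : name of the Hadoop record counters group

       - MAP_IN : mapper input records counter

       - MAP_OUT : mapper output records counter

       - REDUCE_IN : reducer input records counter

       - REDUCE_OUT : reducer output records counter

       - GROUPS : record groups counter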


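   5) HDFS EL functions

       Paths are given as full HDFS URIs, e.g. hdfs://test:8020/tmp/t1

       - boolean fs:exists(String path) : true if the path exists

       - boolean fs:isDir(String path) : true if the path exists and is a directory

       - long fs:dirSize(String path) : total size in bytes of the files directly under the directory (not recursive); -1 if the path is not a directory

       - long fs:fileSize(String path) : size of the file in bytes; -1 if the path is not a file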
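The HDFS functions and the size constants are mostly used in decision nodes. The following is a rough sketch of a workflow that continues only when a directory exists and holds more than 1 GB. The workflow name, node names, the inputDir property (expected to come from job.properties as a full HDFS URI), the /tmp/checked marker directory, and the schema version are all assumptions made for illustration.

```shell
# Sketch only: names, the inputDir property, and the marker directory are
# assumptions. The quoted 'EOF' keeps ${...} intact for Oozie to evaluate.
cat > decision-workflow.xml <<'EOF'
<workflow-app name="size-check-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="check-input"/>
  <decision name="check-input">
    <switch>
      <!-- go on only if the directory exists and holds more than 1 GB -->
      <case to="mark-checked">${fs:exists(inputDir) and fs:dirSize(inputDir) gt 1 * GB}</case>
      <default to="end"/>
    </switch>
  </decision>
  <action name="mark-checked">
    <!-- placeholder action: a real workflow would start its processing here -->
    <fs>
      <mkdir path="${nameNode}/tmp/checked"/>
    </fs>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Failed at ${wf:lastErrorNode()}: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
EOF
```

Because fs:dirSize returns -1 for a missing path, the fs:exists check is not strictly required here; it is kept to make the intent explicit.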


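3. Using Oozie

   1) oozie job -run -oozie http://localhost:11000/oozie -config commandline/job.properties

      (submits and starts the job, and prints its job id)

   2) oozie job -info <id> -oozie http://localhost:11000/oozie

      (shows the status of the job; -oozie can be omitted when the OOZIE_URL environment variable is set)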


Workflow states


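PREP : the job has been submitted to the server but has not started yet

RUNNING : the job is executing

SUSPENDED : the job has been paused

SUCCEEDED : the job completed successfully

KILLED : a created (PREP), running, or suspended job was killed by an administrator or the job owner

FAILED : the job ended because of an error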
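When scripting around these states, it helps to tell final states from those that can still change. A small sketch, assuming the state has already been obtained as an upper-case string (the function name is hypothetical, not part of the Oozie client):

```shell
# Hypothetical helper: returns 0 (true) when the given workflow state is
# final, 1 otherwise. PREP, RUNNING and SUSPENDED can still change.
is_final_state() {
  case "$1" in
    SUCCEEDED|KILLED|FAILED) return 0 ;;
    *) return 1 ;;
  esac
}

is_final_state RUNNING   || echo "RUNNING: still in progress"
is_final_state SUCCEEDED && echo "SUCCEEDED: final"
```

A polling script could call this on the status reported by oozie job -info and stop waiting once it returns true.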



4. Examples

1) MapReduce job

job.properties:

jobTracker=test1:8050

nameNode=hdfs://test1:8020

oozie.use.system.libpath=True

oozie.wf.application.path=hdfs://test1:8020/tmp/mapreduce

oozie.libpath=hdfs://test1:8020/user/oozie/share/lib

input=/tmp/input/

output=/tmp/output


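Run as a workflow job (a workflow.xml must exist under oozie.wf.application.path):

oozie job -run -oozie http://localhost:11000/oozie -config job.properties

Or submit a MapReduce job directly, without a workflow.xml (the properties file must then also define the mapper and reducer classes and the input and output directories):

oozie mapreduce -config job.properties -oozie http://localhost:11000/oozie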
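The notes do not show the workflow.xml that oozie job -run expects under oozie.wf.application.path. The sketch below is one way it might look for the properties above. The workflow and node names, the schema version, the old-style mapred.* property names, and the identity mapper and reducer (stand-ins for real job classes) are assumptions, not the original author's file.

```shell
# Sketch only: names, schema version, and the Identity* classes are
# assumptions. The quoted 'EOF' keeps ${...} intact for Oozie to resolve
# from job.properties (jobTracker, nameNode, input, output).
cat > workflow.xml <<'EOF'
<workflow-app name="mapreduce-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="mr-node"/>
  <action name="mr-node">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <prepare>
        <!-- remove old output so the job can be rerun -->
        <delete path="${nameNode}${output}"/>
      </prepare>
      <configuration>
        <property>
          <name>mapred.mapper.class</name>
          <value>org.apache.hadoop.mapred.lib.IdentityMapper</value>
        </property>
        <property>
          <name>mapred.reducer.class</name>
          <value>org.apache.hadoop.mapred.lib.IdentityReducer</value>
        </property>
        <property>
          <name>mapred.input.dir</name>
          <value>${input}</value>
        </property>
        <property>
          <name>mapred.output.dir</name>
          <value>${output}</value>
        </property>
      </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>MapReduce failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
EOF

# Upload step, left commented out because it needs a running cluster:
# hdfs dfs -put -f workflow.xml /tmp/mapreduce/
```

If the Oozie client is installed, oozie validate workflow.xml can check the file against the schema before it is uploaded.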


2) Hive job

insert.sql:

load data inpath '/tmp/test1' into table test;

oozie hive -config job.properties -file insert.sql -oozie http://localhost:11000/oozie
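The same script can also run as a Hive action inside a workflow instead of through oozie hive. A minimal sketch of such an action, written next to the script: the node names, the file name of the fragment, and the action schema version are assumptions, and the fragment expects an enclosing workflow-app that defines nodes named "end" and "fail".

```shell
# Sketch only: node names and schema version are assumptions.
cat > insert.sql <<'EOF'
load data inpath '/tmp/test1' into table test;
EOF

# The quoted 'EOF' keeps ${...} intact for Oozie to resolve.
cat > hive-action.xml <<'EOF'
<action name="hive-node">
  <hive xmlns="uri:oozie:hive-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- path is relative to the workflow application directory -->
    <script>insert.sql</script>
  </hive>
  <ok to="end"/>
  <error to="fail"/>
</action>
EOF
```

In this form insert.sql has to be uploaded to the workflow application directory together with workflow.xml.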


3) Sqoop job

Sqoop arguments (the leading "sqoop" is omitted when the command is passed to Oozie):

import --connect jdbc:mysql://localhost/database --username sqoop --password sqoop --table tablen --hive-import --hive-table tablen
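As a rough illustration, these arguments could be placed in the command element of a Sqoop action. The node names, the file name, and the action schema version below are assumptions, and the fragment expects an enclosing workflow-app that defines nodes named "end" and "fail".

```shell
# Sketch only: node names and schema version are assumptions.
# The quoted 'EOF' keeps ${...} intact for Oozie to resolve.
cat > sqoop-action.xml <<'EOF'
<action name="sqoop-node">
  <sqoop xmlns="uri:oozie:sqoop-action:0.2">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <!-- Oozie splits this string on spaces to build the Sqoop arguments -->
    <command>import --connect jdbc:mysql://localhost/database --username sqoop --password sqoop --table tablen --hive-import --hive-table tablen</command>
  </sqoop>
  <ok to="end"/>
  <error to="fail"/>
</action>
EOF
```

Because the command is split on spaces, an argument that itself contains spaces would need the arg element form instead, one arg per argument.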

