Workflow Portion of CCP: Data Engineer Exam

The Workflow portion of the exam has the following expectations:

The ability to create and execute various jobs and actions that move data towards greater value and use in a system.

This includes the following skills:

  • Create and execute a linear workflow with actions that include Hadoop jobs, Hive jobs, Pig jobs, custom actions, etc.
  • Create and execute a branching workflow with actions that include Hadoop jobs, Hive jobs, Pig jobs, custom actions, etc.
  • Orchestrate a workflow to execute regularly at predefined times, including workflows that have data dependencies.

Would it be acceptable to use a combination of bash scripts and cron jobs for this portion?

1 ACCEPTED SOLUTION

If a question does not specify how to perform the task (and most don't), then any solution that achieves the desired result is acceptable. In some cases, however, a problem may require you to work with specific technologies, such as placing data into a table in the Hive metastore or building an Oozie workflow. You would be well advised to be familiar with both approaches.
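For the script-and-scheduler route, here is a minimal sketch in Python of what a hand-rolled workflow with a data dependency, linear steps, and a success/failure branch might look like. Everything specific in it is an assumption made for illustration: the HDFS marker path, the script names (clean.pig, load.hql), and the two follow-up scripts are hypothetical. It is not an official or recommended exam solution.

```python
import subprocess
import sys


def run_step(name, command):
    """Run one workflow action; return True if it exited with status 0."""
    try:
        result = subprocess.run(command)
    except OSError as err:
        # The executable is missing or not runnable; treat it as a failed step.
        print(f"{name}: could not start ({err})")
        return False
    print(f"{name}: exit status {result.returncode}")
    return result.returncode == 0


def run_workflow(dependency_check, steps, on_success, on_failure):
    """Run steps in order once the data dependency is met, then branch.

    dependency_check: command that exits 0 when the input data is ready.
    steps: list of (name, command) pairs executed one after another.
    Returns "skipped", "succeeded", or "failed".
    """
    if not run_step("dependency_check", dependency_check):
        return "skipped"  # input not ready; the next scheduled run retries
    for name, command in steps:
        if not run_step(name, command):
            run_step("on_failure", on_failure)  # failure branch
            return "failed"
    run_step("on_success", on_success)  # success branch
    return "succeeded"


if __name__ == "__main__":
    # All paths and script names below are hypothetical placeholders.
    status = run_workflow(
        dependency_check=["hdfs", "dfs", "-test", "-e", "/data/incoming/_SUCCESS"],
        steps=[
            ("clean", ["pig", "-f", "clean.pig"]),
            ("load", ["hive", "-f", "load.hql"]),
        ],
        on_success=["bash", "publish.sh"],
        on_failure=["bash", "cleanup.sh"],
    )
    print(f"workflow {status}")
```

Saved under a hypothetical path such as /home/user/workflow.py, a crontab entry like `0 2 * * * /usr/bin/python3 /home/user/workflow.py` would attempt the run every night at 02:00 and skip it whenever the input marker is missing. An Oozie workflow with a coordinator covers the same ground declaratively, which is why it is worth knowing both.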

 

Devon
