Code Repositories

Repo Description

This flow ingests data files that are copied to a landing zone on a gateway server and processes them automatically at a regular interval using Falcon. When the workflow begins, the files are ingested, stored, and transformed, and the transformed data is then exported with Sqoop from the cluster into a MySQL database.

Once the data is processed, the Hive processing lineage is available in Apache Atlas.
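
As a rough illustration of the final export step described above, the sketch below builds the argument list for a `sqoop export` call without running it. It is not taken from the repository: the function name `build_sqoop_export_command` and the host, database, table, HDFS directory, and user shown here are hypothetical placeholders.

```python
from typing import List


def build_sqoop_export_command(
    jdbc_url: str, table: str, export_dir: str, username: str
) -> List[str]:
    """Return the argument list for a `sqoop export` invocation.

    The command is only assembled here, not executed, so this runs
    on a machine without Sqoop installed.
    """
    return [
        "sqoop", "export",
        "--connect", jdbc_url,       # JDBC URL of the target MySQL database
        "--table", table,            # destination table in MySQL
        "--export-dir", export_dir,  # HDFS directory holding the transformed data
        "--username", username,      # database user (password handling omitted)
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    cmd = build_sqoop_export_command(
        jdbc_url="jdbc:mysql://mysql-host.example.com/pipeline_db",
        table="transformed_data",
        export_dir="/data/pipeline/transformed",
        username="pipeline_user",
    )
    print(" ".join(cmd))
```

In a scheduled workflow, a command like this would typically be issued by the workflow engine rather than by hand; the list form can be passed to `subprocess.run` on a host where Sqoop is available.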

Repo Info
GitHub repo URL: https://github.com/sainib/hadoop-data-pipeline
GitHub account name: sainib
Repo name: hadoop-data-pipeline
Comments

Is there an example for a Kerberos-enabled cluster?

Version history
Last update: 12-04-2015 06:34 PM