Datasets and input/output events: what is the correlation between YEAR/MONTH/DAY and the instance?

New Contributor

Maybe it is obvious, but I was wondering:

When we declare a dataset based on the date ($YEAR/$MONTH/$DAY/data, for example), use it in an output-events, and then consume it from an input-events whose "instance" looks at current(0):

Is the dated directory name used directly to check the input event, or is there some kind of database inside Oozie that registers it? In other words, if we don't declare the output-events and simply create the "right" directory ourselves, will it still work?

1 ACCEPTED SOLUTION


Re: Datasets and input/output events: what is the correlation between YEAR/MONTH/DAY and the instance?

Yes, it will.

1. Generally, at the ingestion stage, data is collected at a minute, hourly, or daily level.

2. To keep data together based on timestamp, one follows an "hdfs path" naming convention such as /a/b/b/yyyy/mm/dd.

3. The job that consumes this data for ETL needs to select a range of these paths, such as a week or a month, hence datasets have YYYY/MM/DD as the variable parameters in them.
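
To illustrate the three points above, here is a minimal Python sketch (not Oozie code) of how a dated template could be turned into concrete paths, both for a single instance and for a range of instances. The template string, the function names `resolve_instance` and `resolve_range`, and the assumption of a strictly daily dataset whose instances line up with the nominal time are all hypothetical simplifications; Oozie's real resolution also takes the dataset's frequency, initial-instance and timezone into account.

```python
from datetime import datetime, timedelta

# Hypothetical template mirroring the $YEAR/$MONTH/$DAY/data layout from the question.
URI_TEMPLATE = "/a/b/b/${YEAR}/${MONTH}/${DAY}/data"


def resolve_instance(nominal_time, offset_days=0, template=URI_TEMPLATE):
    """Return the path of the daily instance `offset_days` away from nominal_time.

    offset_days=0 plays the role of current(0), -1 the role of current(-1), etc.
    (assumption: daily frequency, instances aligned with the nominal time).
    """
    instance_time = nominal_time + timedelta(days=offset_days)
    return (
        template.replace("${YEAR}", f"{instance_time.year:04d}")
        .replace("${MONTH}", f"{instance_time.month:02d}")
        .replace("${DAY}", f"{instance_time.day:02d}")
    )


def resolve_range(nominal_time, start_offset, end_offset, template=URI_TEMPLATE):
    """Return the paths of all daily instances from start_offset to end_offset, inclusive."""
    return [
        resolve_instance(nominal_time, n, template)
        for n in range(start_offset, end_offset + 1)
    ]


if __name__ == "__main__":
    nominal = datetime(2016, 3, 2)
    # Single instance, comparable to current(0).
    print(resolve_instance(nominal))
    # One week of instances, comparable to current(-6) through current(0).
    for path in resolve_range(nominal, -6, 0):
        print(path)
```

The point of the sketch is that the path is derived purely from a time and a template, so a directory created by hand with the matching name is indistinguishable from one produced by a job that declared the dataset in its output-events.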

Re: Datasets and input/output events: what is the correlation between YEAR/MONTH/DAY and the instance?

New Contributor

thanks :-)