How to write an Oozie shell action to add a metadata tag to an HDFS file?


I need to tag an arbitrary HDFS file with the attributes below. How can I do it? Please share any ideas.

  • supplier = "ABC"
  • reporting_date = "21/12/2016"
  • consecutive_number = "1234"

Master Mentor (accepted solution)

How do you ingest this data? Can you describe your use case? This task is easy to achieve with Apache NiFi, since you can process the data in flight. I have always found Oozie challenging for this on secure clusters. In your case I'd look at the shell action, but you'd have to proxy your hdfs user. It would be easier to use the Oozie FS action, but a setfattr operation is not supported in the current release. Feel free to file an Apache Jira on that. https://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html#a3.2.4_Fs_HDFS_action

Again, after considering all the options in Oozie, I'd try NiFi first.
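For reference, here is a minimal sketch of the kind of script a shell action could run, assuming the tags are stored as HDFS extended attributes via `hdfs dfs -setfattr`. The function name `tag_file`, the `HDFS_CMD` override, and the choice of the `user.` namespace are my own illustrative assumptions, not something from the question, and the script does nothing about the user-proxying issue mentioned above.

```shell
#!/usr/bin/env bash
# Sketch only: tag an HDFS file with extended attributes (xattrs).
# tag_file and HDFS_CMD are hypothetical names used for illustration.
set -e

# Command used to reach HDFS. Override it (e.g. HDFS_CMD="echo hdfs")
# to print the commands instead of running them.
HDFS_CMD="${HDFS_CMD:-hdfs}"

# tag_file PATH NAME=VALUE [NAME=VALUE ...]
# Sets each pair as an xattr in the "user." namespace (an assumption;
# HDFS requires xattr names to carry a namespace prefix).
tag_file() {
  local target pair name value
  target="$1"
  shift
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first '='
    value="${pair#*=}"   # text after the first '='
    # $HDFS_CMD is left unquoted on purpose so "echo hdfs" splits into words.
    $HDFS_CMD dfs -setfattr -n "user.${name}" -v "${value}" "${target}"
  done
}

# When called with a path argument, apply the three tags from the question.
if [ "$#" -ge 1 ]; then
  tag_file "$1" \
    supplier=ABC \
    reporting_date=21/12/2016 \
    consecutive_number=1234
fi
```

In a workflow, the script would typically be shipped with the shell action's `<file>` element and invoked through `<exec>`, with the HDFS path passed as an `<argument>`. Afterwards, `hdfs dfs -getfattr -d <path>` should list the attributes that were set.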

Thanks for your reply. Our data ingestion process is a little complex: a lot of preprocessing needs to be done before tagging incoming HDFS files from external sources. Apache NiFi is cool stuff, and you suggested almost every possible way to do it. But for this scenario I will go with a Java action instead of a shell action, because I may need to get the metadata from different sources, which is easy to do in Java. Thanks again for sharing your thoughts.

Master Mentor

That also works. I'm going to look at the effort needed to contribute an enhancement that adds this functionality to the FS action. It seems only a few operations were implemented in the FS action, while so many are available in the hdfs shell.