Can NiFi be used for complete analytics project flows?

New Contributor

Hi team,

We have a standard analytics flow, and I am exploring whether I can use NiFi to simplify it. The current layers are as follows:

1. RDBMS -> base layer (HDFS)

2. Base layer (HDFS) -> intermediate layer (run Hive queries and create another table)

3. Intermediate layer -> R model layer (run R scripts on the intermediate tables)

4. R model layer -> reporting layer (more Hive queries are run on this)

I want to know whether I can use NiFi to execute the above flow and also schedule it.

2 REPLIES

Re: Can NiFi be used for complete analytics project flows?

Super Guru

NiFi can do all of that, and you can either schedule it or have it run in real time.

If you search the community articles, you will find examples covering most of those steps.

Are they standalone R scripts? If so, you can run them via ExecuteProcess.
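To make the standalone case concrete: ExecuteProcess runs an operating system command, so in NiFi you would put something like `Rscript` in the Command property and the script path in Command Arguments. The Python sketch below only illustrates that idea outside NiFi. The script path, the argument, and the helper names are hypothetical, and the demo runs the Python interpreter instead of `Rscript` so it works on a machine without R installed.

```python
import subprocess
import sys


def rscript_command(script_path, *script_args):
    """Build the argument list for a standalone R script (Rscript must be on PATH)."""
    return ["Rscript", script_path, *script_args]


def run_command(command, timeout_seconds=600):
    """Run an external command and return (exit_code, stdout, stderr)."""
    completed = subprocess.run(
        command,
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as text
        timeout=timeout_seconds,
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    # Hypothetical script path and table name, shown only to illustrate the command shape.
    print(rscript_command("/opt/models/score.R", "intermediate_table"))

    # Stand-in for the real call: run the Python interpreter rather than Rscript.
    exit_code, stdout, stderr = run_command([sys.executable, "-c", "print('model step done')"])
    print(exit_code, stdout.strip())
```

A non-zero exit code is the signal you would route to a failure path, so the model step does not silently feed bad output into the reporting layer.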

If they are SparkR scripts, you can run them via an execute-Spark processor or via Kafka.

In RStudio you can run Hive queries.

NiFi can read RDBMS tables and write to HDFS:

https://community.hortonworks.com/articles/108718/ingesting-rdbms-data-as-new-tables-arrive-automagi...

https://community.hortonworks.com/articles/64122/incrementally-streaming-rdbms-data-to-your-hadoop.h...
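For anyone wondering what incremental loading looks like in practice: a common approach, and the one NiFi's QueryDatabaseTable processor supports through its Maximum-value Columns property, is to remember the largest value seen in a column and fetch only rows above it on the next run. The sketch below shows that idea in plain Python, using an in-memory SQLite database as a stand-in for the real RDBMS. The `orders` table, its columns, and the function names are assumptions made up for illustration, not taken from the linked articles.

```python
import sqlite3


def fetch_new_rows(conn, table, max_value_column, last_seen):
    """Return rows where max_value_column > last_seen, plus the new high-water mark."""
    # Table and column names are assumed to be trusted identifiers;
    # the last-seen value is bound as a query parameter.
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {max_value_column} > ? "
        f"ORDER BY {max_value_column}"
    )
    cursor = conn.execute(query, (last_seen,))
    columns = [description[0] for description in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    # If nothing new arrived, keep the previous mark.
    new_last_seen = rows[-1][max_value_column] if rows else last_seen
    return rows, new_last_seen


def demo():
    """Load two rows, then pick up only the row added afterwards."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])

    first_batch, mark = fetch_new_rows(conn, "orders", "id", 0)

    conn.execute("INSERT INTO orders VALUES (3, 30.0)")
    second_batch, mark = fetch_new_rows(conn, "orders", "id", mark)
    return first_batch, second_batch, mark
```

This only catches inserts (and updates, if the tracked column is something like a last-modified timestamp). Deleted rows are not detected this way.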

Re: Can NiFi be used for complete analytics project flows?

Contributor

NiFi can absolutely do this, but you may want to look at skipping the HDFS layer and going directly from the RDBMS step to a Hive managed ORC table:

RDBMS --(via data in NiFi flow files)--> ORC table --> R model layer --> model output Hive tables?

The reporting layer should probably be independent and access the Hive layer on its own.
