Created 10-28-2016 09:57 AM
I am evaluating Apache NiFi for moving data into a Hive instance in HDP. While moving the data to Hive, I have a requirement to mask/transform some of the data attributes using a lookup table, similar to what can be done with lookup transformations in traditional ETL tools. How can I achieve this in NiFi?
Created 11-23-2016 09:35 AM
Did you find anything on this?
Created 11-23-2016 09:31 PM
One easy way to do this is to wrap the lookups in a REST API and call it as a step (InvokeHTTP).
Another way is to wrap the lookups in a command-line call and call it as a step (ExecuteStreamCommand).
Another option is a custom processor.
Another option is to create a custom UDF in Hive that converts the data, and then run that.
Another option is to do the ETL lookup transformations in Spark, Storm, or Flink and call them via Site-to-Site or Kafka.
Load the lookup values into the DistributedMapCache and use them for replacements.
Load lookup tables via SQL.
Use ExecuteScript (or a command-line call via ExecuteStreamCommand) to pull mappings from a file created from your tables and look up the data to replace (see the sketch after this list).
Lookup Table Service.
Use HBase for your lookups:
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.hbase.GetHBase/
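As a minimal sketch of the ExecuteScript option (Jython engine), assuming the lookup table has been exported to a file of "key,value" lines at /tmp/dept_lookup.csv and the incoming FlowFile carries an attribute named DEPT_NO; the file path and attribute names are only illustrative, not from this thread:

    # ExecuteScript (Script Engine = python/Jython): replace an attribute using a file-based lookup
    import csv

    flowFile = session.get()
    if flowFile is not None:
        # Build the lookup map from the exported table, one "key,value" pair per line
        lookup = {}
        with open('/tmp/dept_lookup.csv') as f:
            for row in csv.reader(f):
                if len(row) == 2:
                    lookup[row[0]] = row[1]

        # Mask/replace the attribute value; fall back to the original if there is no match
        dept_no = flowFile.getAttribute('DEPT_NO')
        masked = lookup.get(dept_no, dept_no)
        if masked is not None:
            flowFile = session.putAttribute(flowFile, 'DEPT_NO_MASKED', masked)
        session.transfer(flowFile, REL_SUCCESS)

Re-reading the file on every FlowFile is fine for small lookup tables; for larger ones the DistributedMapCache approach above avoids that by loading the mappings once and fetching them per FlowFile.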
Created 12-02-2016 09:40 AM
Hi Timothy,
I am trying the lookup-with-cache method:
Load the lookup values into the DistributedMapCache and use them for replacements
-----------
It doesn't seem to be working for me.
When I try to compare values between the two flows, it doesn't compare them.
I have a RouteOnAttribute that uses an expression like this: ${DEPT_NO:equals(${LKP_DEPT_NO})}
It doesn't send anything out. I checked the upstream queues; they have the correct values.
Can you please suggest how to compare the incoming attributes from the two flows?
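One thing worth noting here: the Expression Language in RouteOnAttribute only sees the attributes of the single FlowFile being routed, so ${DEPT_NO:equals(${LKP_DEPT_NO})} can only match once both DEPT_NO and LKP_DEPT_NO are on the same FlowFile. A sketch of one way to get them there with the cache approach described above (attribute names are illustrative, and the property names are those of the stock FetchDistributedMapCache processor):

    main flow -> FetchDistributedMapCache (Cache Entry Identifier = ${DEPT_NO},
                                           Put Cache Value In Attribute = LKP_DEPT_NO)
              -> RouteOnAttribute         (matched = ${DEPT_NO:equals(${LKP_DEPT_NO})})

The lookup flow would first load the lookup values into the cache (for example with PutDistributedMapCache), so that the main flow can fetch the matching value onto each FlowFile before the comparison.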
Created 12-05-2016 09:44 AM
I tried one more time and it worked for me. Thanks!