Mapping different header names to Hive column names in a NiFi flow

Hi,

I am trying to work out how to map different field names for the same data. The data comes from various sources but is ingested into the same Hive table.

Ex:
Table1 with name "Users", Columns {"firstName", "lastName", "street","city","zip","country"}
Table2 with name "UserDetails", Columns {"fName","lName","street","city","pincode","country","contact"}
Table3 with name "User", Columns {"firstName","lastname","address1","address2", "city","state", "pin", "country"}

Target Hive Table: "Users" with columns {"firstName","lastName", "address1", "address2", "city", "state", "zip", "country", "contact"}

firstName and fName map to firstName
lastName and lName map to lastName
street and address1 map to address1
address2 maps to address2
zip and pincode map to zip
...and so on.

What is the best way to map these different column headers to the same Hive columns in NiFi? Any thoughts are appreciated. It would also be great if a single flow template could handle all of these sources.

I appreciate your help.

Thanks,

Ravi

1 REPLY

As explained elsewhere by Andy:

 

You can accomplish this with a ConvertRecord processor. Register an Avro schema describing the expected format in a Schema Registry (controller service), and create a CSVReader implementation to convert this incoming data to the generic Apache NiFi internal record format. Similarly, use a CSVRecordSetWriter with your output schema to write the data back to CSV in whatever columnar order you like.
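
One possible way to express the mapping from the question in a single schema is to give each target field a list of Avro aliases for the source header names. The Python sketch below only generates such a schema as JSON and shows the same lookup applied to a header row. It is an illustration under assumptions, not a tested NiFi flow: the names TARGET_COLUMNS, build_avro_schema and rename_headers are hypothetical, every column is assumed to be a nullable string, "pin" is assumed to map to zip like "pincode", and whether your NiFi version's readers and writers honor Avro aliases when resolving fields should be verified before relying on it.

```python
import json

# Target Hive columns, each with the source header names that should map to it.
# Alias lists follow the question; "pin" -> "zip" is an assumption.
TARGET_COLUMNS = {
    "firstName": ["fName"],
    "lastName": ["lName", "lastname"],
    "address1": ["street"],
    "address2": [],
    "city": [],
    "state": [],
    "zip": ["pincode", "pin"],
    "country": [],
    "contact": [],
}


def build_avro_schema(name, columns):
    """Build an Avro record schema (as a dict) with one nullable string field per column."""
    fields = []
    for column, aliases in columns.items():
        field = {"name": column, "type": ["null", "string"], "default": None}
        if aliases:
            # Avro aliases list alternative names for the same field.
            field["aliases"] = aliases
        fields.append(field)
    return {"type": "record", "name": name, "fields": fields}


def rename_headers(headers, columns):
    """Map source header names to target column names; unknown headers pass through."""
    lookup = {}
    for column, aliases in columns.items():
        lookup[column] = column
        for alias in aliases:
            lookup[alias] = column
    return [lookup.get(header, header) for header in headers]


if __name__ == "__main__":
    # Print the schema text, e.g. to paste into a schema registry entry.
    print(json.dumps(build_avro_schema("Users", TARGET_COLUMNS), indent=2))
```

If aliases turn out not to be applied in your setup, the same dictionary can serve as the source of truth for explicit renames elsewhere in the flow, for example in a QueryRecord SQL statement.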

 

For more information on the record processing philosophy and some examples, see "Record-oriented data with NiFi" and "Apache NiFi Records and Schema Registries".

 

- Dennis Jaheruddin
