NIFI: SelectHiveQL vs QueryDatabaseTable
Labels: Apache NiFi
Created 02-28-2017 02:30 PM
I want to query Hive for only the changed records, à la QueryDatabaseTable.
Should I use QueryDatabaseTable, or is there a way to make SelectHiveQL maintain its current state?
Created 02-28-2017 02:42 PM
I would go with QueryDatabaseTable. It maintains state for you, and it has another important feature: it can break the returned records up into multiple flow files. For example, if the query is expected to return 1000 records, you can set Max Rows Per Flow File to x and process the data in smaller chunks.
If you use SelectHiveQL instead, build your query with UpdateAttribute and keep the state in a distributed map cache (DMC). Maintain the last state value in the DMC and use it in UpdateAttribute to build the query that SelectHiveQL runs.
Flow:
DMC fetch (state field) --> UpdateAttribute (build query) --> SelectHiveQL
You will have to set the DMC to an initial state value, or handle the initial case in your UpdateAttribute logic.
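To make the state handling in that flow concrete, here is a rough sketch of the logic in plain Python. It is not NiFi code: the dict stands in for the DMC, and the table name `my_table`, the column `updated_at`, the cache key, and the initial value are all made up for illustration. It also assumes the state is written back to the cache after each run, which the flow above implies but does not spell out.

```python
# Hypothetical stand-in for the distributed map cache (DMC).
# In NiFi this would be a cache service, not a Python dict.
STATE_KEY = "my_table.last_updated"      # hypothetical cache key
INITIAL_STATE = "1970-01-01 00:00:00"    # assumed initial state value


def fetch_state(cache):
    """'DMC fetch' step: read the last state, or fall back to the initial value."""
    return cache.get(STATE_KEY, INITIAL_STATE)


def build_query(last_value):
    """'UpdateAttribute' step: build a query that selects only newer rows."""
    return (
        "SELECT * FROM my_table "
        f"WHERE updated_at > '{last_value}' "
        "ORDER BY updated_at"
    )


def advance_state(cache, rows):
    """Store the largest updated_at seen, so the next run starts after it."""
    if rows:
        cache[STATE_KEY] = max(row["updated_at"] for row in rows)


def run_once(cache, execute):
    """One pass of the flow; `execute` stands in for SelectHiveQL."""
    query = build_query(fetch_state(cache))
    rows = execute(query)
    advance_state(cache, rows)
    return query, rows
```

Timestamps are compared as strings here, which only works for a fixed, sortable format such as `YYYY-MM-DD HH:MM:SS`. If no rows come back, the stored state is left unchanged, so the next run repeats the same query.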
Created 10-12-2020 02:11 PM
Hi,
Could you let me know whether you managed to query a Hive table using the QueryDatabaseTableRecord processor? I am having issues doing the same.
Created 10-12-2020 10:43 PM
Hi @Kumar78, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question.
Regards,
Vidya Sargur, Community Manager
