NiFi PutHive3Streaming and Hive's minor compaction

New Contributor

Data inserted into Hive with an INSERT INTO command gets minor-compacted.
But data inserted into Hive using NiFi's PutHive3Streaming does not get minor-compacted, even when there are enough delta files.
Is it possible to make minor compaction work for data written by NiFi?

Accepted Solutions

Re: NiFi PutHive3Streaming and Hive's minor compaction

Super Guru
@Kei Miyauchi

If you want to trigger minor/major compactions from NiFi, route the success relationship of the PutHiveStreaming processor to a ReplaceText processor and configure ReplaceText as follows:

Replacement Strategy: Always Replace

Replacement Value:

alter table <db_name>.<table_name> compact 'minor';

Then execute the minor compaction with a PutHiveQL processor.

Flow:

--other processors
--> PutHiveStreaming
--> ReplaceText
--> PutHiveQL

This way, the minor compaction is initiated from NiFi.

Also take a look at this SupportKB article about minor compactions not working in Hive, and set the recommended global configs so that minor compactions work.
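
The ReplaceText step can be pictured with a small Python sketch. This is not NiFi code, only an illustration of what "Always Replace" does to the flowfile content. The helper names, the identifier check, and the `sales.events` table are hypothetical and chosen for the example.

```python
import re

# Assumption: a conservative pattern for database/table names, used only to
# keep this illustration from emitting an obviously malformed statement.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def compaction_statement(db_name: str, table_name: str, kind: str = "minor") -> str:
    """Build the statement used as the ReplaceText replacement value."""
    if kind not in ("minor", "major"):
        raise ValueError(f"unsupported compaction type: {kind!r}")
    for name in (db_name, table_name):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    return f"alter table {db_name}.{table_name} compact '{kind}';"


def always_replace(flowfile_content: str, replacement: str) -> str:
    """Mimic the 'Always Replace' strategy: the incoming content is ignored
    and the whole flowfile body becomes the replacement value."""
    return replacement


if __name__ == "__main__":
    # Hypothetical database and table names.
    statement = compaction_statement("sales", "events")
    # Whatever PutHiveStreaming passed on, the outgoing content is the statement.
    print(always_replace("original flowfile content", statement))
```

The point is that the content coming out of PutHiveStreaming does not matter here. Each flowfile only acts as a trigger, and PutHiveQL executes whatever statement ReplaceText wrote into it.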

Re: NiFi PutHive3Streaming and Hive's minor compaction

New Contributor

@Shu

Thank you, I'll try that.
The link isn't working. Did you mean this page?
https://community.hortonworks.com/content/supportkb/193756/automatic-minor-compaction-on-hive-is-not...

Also, do you have any idea how to trigger ReplaceText and PutHiveQL only after a certain number of flowfiles have passed through PutHiveStreaming?
I think invoking a minor compaction for every flowfile is too much when many flowfiles arrive.

Re: NiFi PutHive3Streaming and Hive's minor compaction

Super Guru

@Kei Miyauchi

Yes, I meant this page: https://community.hortonworks.com/content/supportkb/193756/automatic-minor-compaction-on-hive-is-not...

Use a MergeContent processor after PutHiveStreaming and configure it (Minimum Number of Entries) to wait for at least 10 flowfiles, or some other number, and merge them into one. Then feed the merged relationship to the ReplaceText processor. With MergeContent in place, the flow waits for at least 10 flowfiles before triggering a minor compaction.

Flow:

--other processors
--> PutHiveStreaming
--> MergeContent
--> ReplaceText
--> PutHiveQL
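
The effect of batching before ReplaceText can be sketched in Python. This is a simplified simulation of the counting only, not NiFi's actual bin-packing: the real MergeContent also takes settings such as Maximum Number of Entries and Max Bin Age into account. The function name and the flowfile labels are hypothetical.

```python
from typing import Iterable, Iterator, List


def compaction_triggers(flowfiles: Iterable[str], min_entries: int = 10) -> Iterator[List[str]]:
    """Yield one batch each time `min_entries` flowfiles have accumulated.

    Each yielded batch stands for one merged flowfile, and therefore for one
    compaction request sent on to ReplaceText and PutHiveQL.
    """
    if min_entries < 1:
        raise ValueError("min_entries must be at least 1")
    batch: List[str] = []
    for flowfile in flowfiles:
        batch.append(flowfile)
        if len(batch) >= min_entries:
            yield batch
            batch = []
    # Leftover flowfiles are not yielded: in this sketch they stay queued
    # until enough new flowfiles arrive.


if __name__ == "__main__":
    incoming = [f"flowfile-{i}" for i in range(25)]
    batches = list(compaction_triggers(incoming, min_entries=10))
    print(len(batches))  # prints 2: 25 flowfiles give 2 triggers, 5 stay queued
```

With a threshold of 10, 25 flowfiles lead to two compaction requests instead of 25. In a real flow, setting Max Bin Age on MergeContent is one way to keep the remaining flowfiles from waiting indefinitely.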

Re: NiFi PutHive3Streaming and Hive's minor compaction

New Contributor

@Shu

That works!

Thank you for all your replies.