What is the fastest way to load data into Apache Hive ACID Tables?
Labels: Apache Hive, Apache NiFi
Created ‎03-14-2018 11:33 PM
1. A special utility?
2. NiFi: load the table to ORC with PutHDFS, then use PutHiveQL to MERGE into the ACID table
3. Sqoop?
4. NiFi: PutHiveStreaming
5. NiFi to Druid, then insert into the Hive ACID table from a table on top of Druid
6. NiFi to HBase, then insert into the Hive ACID table from a Hive table on HBase
7. Some SnappyData in-memory pattern?
8. IBM BigSQL?
9. Attunity?
MERGE reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Merge
Created ‎03-15-2018 05:36 AM
I would go with either Attunity and/or some utility/framework that can be modified depending on the use case. These kinds of frameworks reduce time and effort, and multiple tables can be processed in parallel with less effort.
Created ‎03-15-2018 12:16 PM
That would definitely work, but they are not open source and not free.
Any suggestions for open source?
Created ‎03-16-2018 06:16 AM
If open source is a priority, then I would go with Hive using MERGE. I haven't tried MERGE with huge volumes, but I believe it would work decently.
Created ‎03-16-2018 12:02 PM
Thanks! Merge seems to be recommended by a few sources.
Created ‎03-15-2018 01:25 PM
Hi @Timothy Spann, the recommended approach is Attunity -> Kafka -> NiFi -> Hive -> MERGE. If you want 100% open source, then Sqoop the data to a staging area and run a merge to get the deltas.
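For anyone wondering what the staging-plus-merge step might look like, here is a minimal sketch in Python that only builds the HiveQL MERGE text; it does not run anything. The table names (`customers`, `customers_staging`), the key column `id`, and the helper `build_merge_sql` are all hypothetical and not from this thread. The statement shape follows the MERGE syntax in the Hive DML manual.

```python
def build_merge_sql(target, staging, key, columns):
    """Build a Hive MERGE statement that applies staged deltas to an ACID table.

    `columns` lists every target column in table order, including the key.
    All names passed in are illustrative assumptions.
    """
    if key not in columns:
        raise ValueError("key must be one of the columns")
    non_key = [c for c in columns if c != key]
    if not non_key:
        raise ValueError("need at least one non-key column to update")
    # Update every non-key column; the join key itself is left unchanged.
    set_clause = ", ".join(f"{c} = s.{c}" for c in non_key)
    # MERGE inserts are positional, so the values follow the table's column order.
    values = ", ".join(f"s.{c}" for c in columns)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {staging} AS s\n"
        f"ON t.{key} = s.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT VALUES ({values})"
    )


sql = build_merge_sql("customers", "customers_staging", "id", ["id", "name", "email"])
print(sql)
```

The target has to be a transactional (ACID) table for MERGE to work; the staging table can be a plain table over the files that Sqoop landed.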
Created ‎03-15-2018 01:27 PM
Additionally, the Sqoop/merge process is easily automated using Workflow Manager.
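As a rough illustration of the two steps such a workflow would chain, the Python sketch below only assembles the command lines and prints them. The JDBC URLs, table name, paths, filter, and both helper functions are hypothetical; how the steps are actually wired into a scheduler is not covered here. It assumes a plain `sqoop import` with a `--where` filter into a staging directory, followed by `beeline -f` running a script that contains the MERGE.

```python
import shlex


def sqoop_import_cmd(jdbc_url, table, where, target_dir):
    """Command line for pulling changed rows into a staging directory (assumed layout)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--where", where,              # filter that selects only the delta rows
        "--target-dir", target_dir,
        "--delete-target-dir",         # start each run from an empty staging directory
        "--num-mappers", "1",
    ]


def beeline_cmd(hive_url, script_path):
    """Command line for running a HiveQL script (for example, the MERGE) via beeline."""
    return ["beeline", "-u", hive_url, "-f", script_path]


steps = [
    sqoop_import_cmd(
        "jdbc:mysql://db.example.com/sales",        # hypothetical source database
        "customers",
        "updated_at > '2018-03-15'",                # hypothetical delta filter
        "/staging/customers",
    ),
    beeline_cmd("jdbc:hive2://hive.example.com:10000/default", "merge_customers.hql"),
]

for cmd in steps:
    # Print a shell-safe rendering; a scheduler would run these in order.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```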
Created ‎03-16-2018 12:02 PM
Does Attunity work with CSV, JSON, XML, and other files?
