Member since: 08-20-2018 | Posts: 7 | Kudos Received: 0 | Solutions: 0
10-03-2016 11:38 PM
Bump. Any recommendations for the question above? We are also looking for a solution like this.
10-03-2016 06:54 PM
@Rajeshbabu Chintaguntla Thanks for that detailed post; there seem to be two really good approaches there. Which approach would likely provide better performance? It seems like the CsvBulkLoadTool might be better than ImportTsv, but I wanted to verify.
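For context, a CsvBulkLoadTool run would look roughly like the sketch below; it assumes an HDP-style jar layout, and the table name, ZooKeeper quorum, and input path are placeholders rather than our real values.

# Rough sketch of a Phoenix CsvBulkLoadTool invocation (placeholders, not real values).
# The tool runs a MapReduce job that writes HFiles directly and hands them to the
# region servers, bypassing the normal HBase write path.
HADOOP_CLASSPATH=$(hbase mapredcp):/etc/hbase/conf \
hadoop jar /usr/hdp/current/phoenix-client/phoenix-client.jar \
    org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    --table EXAMPLE_TABLE \
    --input /data/records.csv \
    --zookeeper zk1,zk2,zk3:2181 \
    --delimiter ','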
10-03-2016 06:51 PM
@Constantin Stanca Thanks for the insight. Based on your comment, does Phoenix chunk the data automatically if we ingest it through Phoenix?
09-29-2016 07:25 PM
We have a 250 GB CSV file that contains 60 million records and roughly 600 columns. The file currently lives in HDFS, and we are trying to ingest it into HBase with a Phoenix table on top of it. The approach we have tried so far is to create a Hive table backed by HBase and then execute an overwrite command in Hive, which ingests the data into HBase. The biggest problem is that the job currently takes about 3-4 days to run! This is on a 10-node cluster with medium specs (30 GB of RAM and 2 TB of disk per node). Any advice on how to speed this up, or different methods that would be more efficient?
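For reference, the current approach is essentially the sketch below; table names, columns, and connection details are placeholders standing in for our real 600-column schema, and csv_staging is assumed to be an external Hive table already defined over the CSV.

# Sketch of the Hive-over-HBase load described above (placeholders only).
cat > /tmp/hbase_load.sql <<'SQL'
CREATE TABLE IF NOT EXISTS hbase_records (rowkey STRING, col1 STRING, col2 STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:col1,cf:col2')
TBLPROPERTIES ('hbase.table.name' = 'RECORDS');

INSERT OVERWRITE TABLE hbase_records
SELECT rowkey, col1, col2 FROM csv_staging;
SQL
beeline -u jdbc:hive2://hiveserver:10000 -f /tmp/hbase_load.sql

Because the storage handler pushes every row through the normal HBase write path as individual Puts, we suspect that is where most of the 3-4 days is going.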
Labels:
- Apache HBase
- Apache Hive
- Apache Phoenix