Member since: 07-31-2013
Posts: 1924
Kudos Received: 462
Solutions: 311
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2129 | 07-09-2019 12:53 AM |
| | 12449 | 06-23-2019 08:37 PM |
| | 9561 | 06-18-2019 11:28 PM |
| | 10526 | 05-23-2019 08:46 PM |
| | 4895 | 05-20-2019 01:14 AM |
11-02-2017
04:40 AM
1 Kudo
You can pass an input directory to the ImportTsv tool, and that directory can contain any number of files. It is not limited to a single file unless you pass a single file (instead of a directory) to it.
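For example, a minimal invocation against a directory (the table name, column mapping and input path below are placeholders for your own values):

```
# Every file under the input directory is parsed as TSV input for the job.
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1,cf:col2 \
  mytable /user/etl/tsv_input/
```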
11-02-2017
04:39 AM
1 Kudo
You are right that it's all just byte sequences to HBase, and that it sorts everything lexicographically. You do not need a separator character when composing your key, because HBase would not treat it as a boundary anyway; add one only if you want the extra bytes for readability, or to recover the individual data elements from variable-length keys if that is a use case. HBase 'sharding' (splitting) can be specified manually at table creation time if you know your key pattern and ranges - this is strongly recommended so the table scales from the beginning. Otherwise, HBase computes a key midpoint by analysing the keys in byte form and splits a region on that midpoint whenever its size crosses the split threshold.
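A minimal sketch of pre-splitting at table creation time (the table name, column family and split points are placeholders for your own key ranges):

```
# Creates five regions with explicit boundaries up front, instead of one region
# that only splits later on byte midpoints computed by HBase.
echo "create 'mytable', 'cf', SPLITS => ['b', 'f', 'm', 't']" | hbase shell
```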
10-24-2017
12:12 AM
There is a simple method to remove those:
1. List those directories in a text file, e.g.: hadoop fs -ls /path > test
2. cat -t test will show the positions of the duplicates with the junk characters.
3. Open the file in another shell and comment out (#) entries to identify the exact ones.
4. cat -t the file again to confirm you marked the culprits.
5. Remove the original (good) folder from the list.
6. for i in $(cat test); do hadoop fs -rmr $i; done
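Roughly the same flow as one sketch (the path is a placeholder; note that the captured listing contains full ls lines, so the path is the last field, and hadoop fs -rmr is the older spelling of hadoop fs -rm -r):

```
hadoop fs -ls /path > listing   # capture the directory entries
cat -t listing                  # reveal the non-printing junk characters
# edit 'listing' so only the bad entries remain, then delete them:
for p in $(awk '{print $NF}' listing); do
  hadoop fs -rm -r "$p"
done
```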
10-23-2017
10:47 PM
@Harsh J, Thanks for the quick reply. I thought the output of the fsck command included the replication multiplier and gave the final total block count. Thanks for the clarification. I checked the Datanodes page on the NameNode Web UI and the block count for each DataNode is more than the threshold value. Thanks, Priya
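For reference, the logical count fsck reports can be pulled out like this (run against the root path; the replica copies then multiply this figure across the DataNodes):

```
hdfs fsck / | grep -i 'Total blocks'
```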
10-20-2017
01:15 AM
You can search for it from the console, for example: $ locate '*hive-hcatalog-core*.jar' (quoting the pattern keeps the shell from expanding it).
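If the locate database isn't available or up to date, find works as a fallback (the search roots below are only examples):

```
find /usr /opt -name 'hive-hcatalog-core*.jar' 2>/dev/null
```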
10-19-2017
01:00 PM
Thanks so much for the help. That worked. I was able to get a backup of the fsimage.
10-15-2017
09:18 PM
Currently the MapReduceIndexerTool appears to hardcode the job names, so they do not look configurable: https://github.com/cloudera/search/blob/cdh5.13.0-release/search-mr/src/main/java/org/apache/solr/hadoop/MapReduceIndexerTool.java#L812 (see also the other setJobName calls in the driver).
10-14-2017
11:19 AM
Hi, I am also getting the same error message in the NameNode logs. I have tried the solution above, but in my case the seen_txid file is present only in the folder tmp/hadoop-root/dfs/name/current. Is there any other solution?
09-21-2017
12:24 AM
Reading through this blog helped me a lot: http://blog.cloudera.com/blog/2014/11/guidelines-for-installing-cdh-packages-on-unsupported-operating-systems/ Especially understanding that this can be a package mix/match issue caused by trying to run YARN on an unsupported version (Ubuntu 16.04). I removed the package and reinstalled it like this:
$> sudo apt-get remove zookeeper (this will remove a bunch of stuff along with ZooKeeper)
$> sudo apt-get install hadoop-yarn-resourcemanager (this will reinstall the ResourceManager for you)
Hope this helps!
09-19-2017
12:16 PM
As I stated in my recent comment, the Flume Kafka client was upgraded as part of CDH 5.8 to use the new consumer API, which supports secure communication with Kerberos. Versions prior to CDH 5.8 use the old API, which doesn't support Kerberos or SSL. You will have to upgrade to get this new functionality, or run Flume outside of Cloudera Manager using tarballs or RPMs. -pd
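Once on CDH 5.8 or later, a Kerberos-enabled Kafka source looks roughly like this (agent/source names, broker addresses and topic are placeholders, and the JAAS/keytab setup for the agent JVM still depends on your environment):

```
# flume.conf (sketch, not a drop-in config)
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.kafka.bootstrap.servers = broker1:9092
a1.sources.r1.kafka.topics = my_topic
# kafka.consumer.* properties are passed through to the new Kafka consumer
a1.sources.r1.kafka.consumer.security.protocol = SASL_PLAINTEXT
a1.sources.r1.kafka.consumer.sasl.kerberos.service.name = kafka
```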