Member since: 01-20-2014
Posts: 578
Kudos Received: 102
Solutions: 94
My Accepted Solutions
Title | Views | Posted
--- | --- | ---
 | 5951 | 10-28-2015 10:28 PM
 | 2894 | 10-10-2015 08:30 PM
 | 4941 | 10-10-2015 08:02 PM
 | 3713 | 10-07-2015 02:38 PM
 | 2481 | 10-06-2015 01:24 AM
03-05-2015
01:45 PM
Happy to help!
03-05-2015
11:14 AM
I figured out why the last reducer was taking so long: user error (it's me!). When I pre-split the table based on the target regions, I missed including some of the keys. This left the last region responsible for 80 times more data than the other regions, which is what caused that one reducer to run for so long. With the table split evenly, all the reducers seem to finish close to each other.
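For anyone hitting the same skew, here is a minimal sketch of pre-splitting a table with an explicit list of split keys; the table name, column family, and split points below are made up for illustration, not the ones from my job:

```bash
# Hypothetical example: pre-split an HBase table so each region covers a
# comparable slice of the key space. Leaving part of the key range out of
# SPLITS means everything past the last split point piles into one region.
hbase shell <<'EOF'
create 'events', 'cf', SPLITS => ['1000', '2000', '3000', '4000', '5000', '6000', '7000', '8000', '9000']
EOF
```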
03-01-2015
10:34 PM
1 Kudo
Hi Gautam, I think my issue is a filled partition. After troubleshooting I found that I was originally using /dfs/dn for HDFS block storage; later I added a non-OS partition under /home (/home/hdfs/dfs/dn) and then started importing hundreds of GB of data. It looks like my old path /dfs/dn had also stored some HDFS blocks and filled up the root partition. So, if I now change the configuration to remove /dfs/dn from dfs.data.dir and restart the cluster, will it automatically move the data to the one remaining location, /home/hdfs/dfs/dn, or how should I handle that? I think this will fix my problem for now. I am not too worried about the data; whatever is quickest and easiest will be fine.

    [root@hadoop-vm2 /]# du -sh ./*
    7.9M ./bin
    61M ./boot
    4.0K ./cgroup
    196K ./dev
    40G ./dfs
    30M ./etc
    55G ./home
    12K ./impala
    263M ./lib
    27M ./lib64
    16K ./lost+found
    4.0K ./media
    0 ./misc
    4.0K ./mnt
    0 ./net
    3.5G ./opt
    du: cannot access `./proc/20676/task/20676/fd/4': No such file or directory
    du: cannot access `./proc/20676/task/20676/fdinfo/4': No such file or directory
    du: cannot access `./proc/20676/fd/4': No such file or directory
    du: cannot access `./proc/20676/fdinfo/4': No such file or directory
    0 ./proc
    92K ./root
    15M ./sbin
    4.0K ./selinux
    4.0K ./srv
    0 ./sys
    1.2M ./tmp
    2.9G ./usr
    387M ./var
    223M ./yarn
    [root@hadoop-vm2 /]# ls /dfs/dn/current/
    BP-1505211549-172.28.172.30-1424252944658 VERSION
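The manual approach I have in mind looks like the sketch below. It is only a sketch that I have not tested, assuming the DataNode role on this host is stopped first and using the block pool name from my ls output above:

```bash
# Untested sketch: consolidate the old data directory into the remaining one
# before dropping /dfs/dn from dfs.data.dir. Stop the DataNode role on this
# host (via Cloudera Manager) before copying anything.
BP=BP-1505211549-172.28.172.30-1424252944658

# Merge the old block pool contents into the surviving data directory,
# preserving ownership, and verify space before deleting the old copy.
rsync -a /dfs/dn/current/$BP/ /home/hdfs/dfs/dn/current/$BP/
chown -R hdfs:hdfs /home/hdfs/dfs/dn

# After updating dfs.data.dir to list only /home/hdfs/dfs/dn, start the
# DataNode again and check for missing or corrupt blocks.
hdfs fsck / | grep -iE 'missing|corrupt'
```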
02-24-2015
04:23 PM
1 Kudo
Ah, my bad. Darren caught it. Cloudera Manager will write zoo.cfg to /var/run/cloudera-scm-agent/process/ (look for the latest directory for ZooKeeper).
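A quick way to locate it on a CM-managed host; the glob below is a guess at the process directory naming, so adjust it if yours differs:

```bash
# Show the most recently created ZooKeeper process directory and its zoo.cfg.
ls -dt /var/run/cloudera-scm-agent/process/*zookeeper* | head -1
cat "$(ls -dt /var/run/cloudera-scm-agent/process/*zookeeper* | head -1)/zoo.cfg"
```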
02-17-2015
04:34 AM
Okay, I will try this, thanks so much. But I have another question, if you please. I added the external jars I need and all of them work normally, but org.apache.hadoop.hive.conf.HiveConf, which exists in hive-common-0.13.1-cdh5.3.0.jar, gives a "class not found" error. Why does that happen? The command I run:

    sudo spark-submit --class "WordCount" --master local[*] --jars /usr/local/WordCount/target/scala-2.10/spark-streaming-flume_2.11-1.2.0.jar,/usr/lib/avro/avro-ipc-1.7.6-cdh5.3.0.jar,/usr/lib/flume-ng/lib/flume-ng-sdk-1.5.0-cdh5.3.0.jar,/usr/lib/hive/lib/hive-common-0.13.1-cdh5.3.0.jar,/usr/local/WordCount/target/scala-2.10/spark-hive_2.10-1.2.0-cdh5.3.0.jar /usr/local/WordCount/target/scala-2.10/wordcount_2.10-1.0.jar 127.0.0.1 9999
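To double-check on my side I will run the two things below; the --driver-class-path line is only my guess at a workaround, not a confirmed fix:

```bash
# Confirm the class is actually inside the jar being passed via --jars.
jar tf /usr/lib/hive/lib/hive-common-0.13.1-cdh5.3.0.jar | grep HiveConf

# Guess at a workaround: also put the jar on the driver classpath explicitly
# (--driver-class-path is a standard spark-submit option). The other --jars
# entries are trimmed here for readability.
sudo spark-submit --class "WordCount" --master "local[*]" \
  --driver-class-path /usr/lib/hive/lib/hive-common-0.13.1-cdh5.3.0.jar \
  --jars /usr/lib/hive/lib/hive-common-0.13.1-cdh5.3.0.jar \
  /usr/local/WordCount/target/scala-2.10/wordcount_2.10-1.0.jar 127.0.0.1 9999
```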
02-16-2015
09:43 PM
You can try following the instructions on this page: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html
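Once the volume itself has been expanded per that guide, the last step on the instance is to grow the filesystem. A rough sketch, assuming an ext3/ext4 filesystem on /dev/xvdf (the device name and filesystem type will vary on your setup):

```bash
# Grow the filesystem to fill the newly enlarged EBS volume, then confirm.
sudo resize2fs /dev/xvdf
df -h
```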
02-11-2015
08:28 AM
1 Kudo
Thanks for the reply, GautamG. My colleague resolved this by restarting all the services on the machine. I had tried just restarting HBase, but that hadn't worked. Cheers, Mark
02-10-2015
04:26 PM
The officially supported alternative for you would be the Path C install (tarballs). RPM does support the --relocate option, but we have not tested the CDH/CM packages with it, so we cannot recommend that you use it.
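For reference, the relocation syntax looks like the following; the package name and paths are placeholders, and again this is untested with CDH/CM packages:

```bash
# Illustration only: --relocate remaps an install prefix at install time.
# It only works for packages built as relocatable; rpm refuses otherwise.
rpm -ivh --relocate /usr/lib=/opt/custom/lib some-package.rpm
```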
02-03-2015
07:58 AM
1 Kudo
Each file uses a minimum of one block entry (though that block will only be as large as the actual data). So if you are adding 2,736 folders, each with 200 files, that's at least 2,736 * 200 = 547,200 blocks. Do the folders represent some particular partitioning strategy? Can the files within a particular folder be combined into a single larger file? Depending on your source data format, you may be better off looking at something like Kite to handle the dataset management for you.
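If it helps, here is a generic way to watch how the file and block counts grow as the data lands; the path is a placeholder, not one from your cluster:

```bash
# Summarize files, directories, and blocks under a path as the NameNode sees them.
hdfs fsck /data/incoming | grep -E 'Total (files|dirs|blocks)'
```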