About GautamG

GautamG · ‎03-05-2015

Happy to help!

Spinhoo · ‎03-05-2015

I figured why the last reducer is taking so long - User error (its me!)... When I presplit the table based on target regions, I missed to include all the keys. This resulted in a table with last key being responsible for 80 times more data than other regions. This is what caused that reducer to spend so much amount of time. If he table is split evenly all reducers seem to be finishing close to each other.

Navneet346 · ‎03-01-2015

Hi Gautam My issue is I think with filled space , after troubleshooting i found i was earlier using /dfs/dn for HDFS block storage , later i added non OS partition under /home (/home/hdfs/dfs/dn) and then started importing 100 of GB data. Looks like some how my old path /dfs/dn had also sotred some of HDFS blocks and filled that root partition. Sais so if now by chnagging the configuration remove (/dfs/dn) dfs.data.dir and restart cluster will it do automatic move data to only left location /home/hdfs/dfs/dn or how to handle that. I guess this will fix my problem for now. Do not worry about data much what ever best and quick will be fine . [root@hadoop-vm2 /]# du -sh ./* 7.9M ./bin 61M ./boot 4.0K ./cgroup 196K ./dev 40G ./dfs 30M ./etc 55G ./home 12K ./impala 263M ./lib 27M ./lib64 16K ./lost+found 4.0K ./media 0 ./misc 4.0K ./mnt 0 ./net 3.5G ./opt du: cannot access `./proc/20676/task/20676/fd/4': No such file or directory du: cannot access `./proc/20676/task/20676/fdinfo/4': No such file or directory du: cannot access `./proc/20676/fd/4': No such file or directory du: cannot access `./proc/20676/fdinfo/4': No such file or directory 0 ./proc 92K ./root 15M ./sbin 4.0K ./selinux 4.0K ./srv 0 ./sys 1.2M ./tmp 2.9G ./usr 387M ./var 223M ./yarn [root@hadoop-vm2 /]# ls /dfs/dn/current/ BP-1505211549-172.28.172.30-1424252944658 VERSION

GautamG · ‎02-24-2015

Ah, my bad. Darren caught it. Cloudera Manager will write zoo.cfg to /var/run/cloudera-scm-agent/process/ (look for the latest director for zookeeper)

tarekabouzeid91 · ‎02-17-2015

okey i will try this thanks so much , but i have another question if you please, i add the external jars i need , all of them work normally , but the " org.apache.hadoop.hive.conf.HiveConf" which exists in hive-common-0.13.1-cdh5.3.0.jar , it gives the error " calss not found " so why that happens ? the command i run : sudo spark-submit --class "WordCount" --master local[*] --jars /usr/local/WordCount/target/scala-2.10/spark-streaming-flume_2.11-1.2.0.jar,/usr/lib/avro/avro-ipc-1.7.6-cdh5.3.0.jar,/usr/lib/flume-ng/lib/flume-ng-sdk-1.5.0-cdh5.3.0.jar,/usr/lib/hive/lib/hive-common-0.13.1-cdh5.3.0.jar,/usr/local/WordCount/target/scala-2.10/spark-hive_2.10-1.2.0-cdh5.3.0.jar /usr/local/WordCount/target/scala-2.10/wordcount_2.10-1.0.jar 127.0.0.1 9999

GautamG · ‎02-16-2015

You can try following the instructions in this page: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html

mtowse · ‎02-11-2015

Thanks for the reply GautamG My colleague has a resolved this by restarting all the services on the machine. I had tried just restarting HBase but that hadn't worked. Cheers Mark

GautamG · ‎02-10-2015

The officially supported alternative for you would be the Path C install (tarballs). RPM however does support the --relocate option however we have not tested the CDH/CM packages with this, so cannot recommend you use it.

GautamG · ‎02-06-2015

Thanks for the feedback. Glad it's all well now

busbey · ‎02-03-2015

Each file uses a minimum of one block entry (though that block will only be the size of the actual data). So if you are adding 2736 folders each with 200 files that's 2736 * 200 = 547,200 blocks. Do the folders represent some particular partitioning strategy? Can the files within a particular folder be combined into a single larger file? Depending on your source data format, you may be better off looking at something like Kite to handle the dataset management for you.

Online	Offline
Last Visited	‎10-19-2025 08:46 PM

Member Since	‎01-20-2014 12:14 AM
Last Visited	‎10-19-2025 08:46 PM
Posts	578
Kudos received	102

Cloudera Community

Re: BDR Throwing Error : Hive Table does not match...

Re: parcel usages (Active, 0)

Re: Upgrading to CentOS 6.7... what version of CDH...

Re: 1 of the 3 node Zookeeper quorum failed, how t...

Re: Parcel distro suffixes

Re: Cloudera Manager hangs when Installing Selecte...

Re: Hbase bulk load help, the last reducer is taki...

Re: issue with Log Directory Free Space , This rol...

Re: zookeeper datadir name configuration

Re: Hive Context in CDH 5.3.x

Re: Error when distributing to xxx.xxx.xxx.xxx: Un...

Re: Failed binding http info server to port: 60030

Re: Error when installing cloudera Manager. Need m...

Re: Autodeployment with CM Manager 5.3.1 issue

Re: Datanode shut down when running Hive