Member since: 06-21-2016
Posts: 40
Kudos Received: 1
Solutions: 1
My Accepted Solutions

Title | Views | Posted
---|---|---
 | 921 | 05-11-2017 03:42 PM
06-15-2022 07:57 AM

Ah! Can you try running the HDFS balancer command below? It moves blocks at a decent pace and would not affect existing jobs:

nohup hdfs balancer -Ddfs.balancer.moverThreads=5000 -Ddfs.datanode.balance.max.concurrent.moves=20 -Ddfs.datanode.balance.bandwidthPerSec=10737418240 -Ddfs.balancer.dispatcherThreads=200 -Ddfs.balancer.max-size-to-move=100737418240 -threshold 10 1>/home/hdfs/balancer/balancer-out_$(date +"%Y%m%d%H%M%S").log 2>/home/hdfs/balancer/balancer-err_$(date +"%Y%m%d%H%M%S").log

You can also refer to the doc below if you need any tuning: https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/data-storage/content/balancer_commands.html
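For reference, one quick way to gauge how far nodes sit from the -threshold 10 target is to scan the per-node "DFS Used%" lines of an `hdfs dfsadmin -report` dump. This is only a sketch: the report file below is a made-up sample, not output from a real cluster.

```shell
# Sketch: pull per-node utilization out of a saved `hdfs dfsadmin -report` dump.
# The hostnames and percentages here are illustrative samples.
cat > /tmp/dfsadmin-report.txt <<'EOF'
Name: 10.0.0.1:9866 (dn1.example.com)
DFS Used%: 72.15%
Name: 10.0.0.2:9866 (dn2.example.com)
DFS Used%: 48.90%
EOF

# Nodes sitting well outside the -threshold band around the cluster average
# are the ones the balancer will move blocks away from (or onto).
grep 'DFS Used%' /tmp/dfsadmin-report.txt
```

Running this before and after a balancer pass gives a rough sense of whether utilization is converging.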
04-08-2022 01:06 AM

Greetings @wazzu62,

We wish to check whether you have reviewed @araujo's request for further checks on the concerned issue. If required, change the port for ATS HBase from 17020 to any other value to see if that helps, assuming the new port is configured to accept requests.

Regards, Smarak
02-22-2021 11:01 PM

Are you able to decommission the DN or NM? Refer:
https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/managing-and-monitoring-ambari/content/amb_decommission_a_nodemanager.html
https://docs.cloudera.com/HDPDocuments/Ambari-2.7.3.0/managing-and-monitoring-ambari/content/amb_decommission_a_datanode.html

To delete the host: https://docs.cloudera.com/HDPDocuments/Ambari-2.6.1.5/bk_ambari-operations/content/deleting_a_host_from_a_cluster.html

Note: turn off the host's Maintenance Mode before deleting the host.
09-12-2018 09:50 PM

Thanks, this did work for me! Is there a way to configure the Hadoop cluster to use a specific installed version of Python?
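For anyone landing here with the same question: one common approach (a sketch, assuming Spark on the cluster; the interpreter path is illustrative) is to point PYSPARK_PYTHON at the desired interpreter in spark-env.sh so the driver and executors agree on the Python build:

```shell
# spark-env.sh (illustrative path; adjust to wherever your Python lives)
# Make PySpark use a specific installed interpreter on every node.
export PYSPARK_PYTHON=/usr/local/bin/python3.6
export PYSPARK_DRIVER_PYTHON=/usr/local/bin/python3.6
```

The same path must exist on every node, or executors will fail to launch the worker process.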
06-12-2017 03:14 PM

Thanks, adding the jar to HIVE_AUX_JARS_PATH in hive-env.sh got the SerDe working in Zeppelin.
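For completeness, the change amounts to appending the SerDe jar to the auxiliary classpath in hive-env.sh; the jar name and path below are placeholders for whatever SerDe you are using:

```shell
# hive-env.sh (jar path is illustrative; point it at your actual SerDe jar)
# Append rather than overwrite, in case other aux jars are already configured.
export HIVE_AUX_JARS_PATH=${HIVE_AUX_JARS_PATH}:/usr/local/lib/json-serde.jar
```

Restart Hive (and the Zeppelin interpreter) after the change so the new classpath is picked up.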
05-18-2017 02:56 PM

You are living dangerously once you reach 80% disk usage. Batch jobs write intermediate data to local non-HDFS disk (MapReduce writes a lot of data to local disk; Tez less so), and that temp data can approach or exceed 20% of available disk, depending of course on the jobs you are running. Also, if you are on physical servers (vs. cloud), you need lead time to provision, rack, stack, etc. before you can scale out with new data nodes, and you will likely continue to ingest new data during that lead time.

It is a good practice to set the threshold at 70% and have a plan in place for when it is reached. (If you are ingesting large volumes on a scheduled basis, you may want to go lower.)

Another good practice is to compress data that you rarely process using non-splittable codecs (you can decompress on the rare occasions you need the data), and possibly other data that is still processed using splittable codecs. Automating compression is desirable. Compression is a bit of an involved topic; this is a useful first reference: http://www.dummies.com/programming/big-data/hadoop/compressing-data-in-hadoop/

I would compress or delete data in the cluster you are referencing, and add more data nodes ASAP.
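As a concrete sketch of the "compress rarely used data" practice: gzip is a non-splittable codec, which is fine for data you almost never process. The directory layout and the 90-day cutoff below are hypothetical, purely for illustration.

```shell
# Sketch: gzip cold files under an archive directory (hypothetical layout).
ARCHIVE_DIR=$(mktemp -d)
echo "old click data" > "$ARCHIVE_DIR/events-2016.csv"

# In practice you would filter by age before compressing, e.g.:
#   find "$ARCHIVE_DIR" -name '*.csv' -mtime +90 -exec gzip {} \;
# For this demo the file is freshly created, so compress it directly:
gzip "$ARCHIVE_DIR/events-2016.csv"

ls "$ARCHIVE_DIR"    # events-2016.csv.gz
```

Wrapping the find/gzip step in a scheduled job is one simple way to automate the compression the post recommends.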
05-11-2017 03:42 PM

I figured out what was wrong. There are two HDFS configuration groups on this cluster, one of which is set up for the DataNodes. I just needed to add the new servers to that group.
05-11-2017 04:56 PM
1 Kudo

@Jon Page Can you please refer to my answer here: https://community.hortonworks.com/questions/92000/error-running-zeppelin-pyspark-interpreter-with-py.html#answer-92029
09-25-2016 06:52 PM

The manual step involved is moving the Hive metastore; luckily, we have the steps outlined here: https://community.hortonworks.com/articles/49660/moving-mysql-server-to-another-host.html

Then you can move the other components using Ambari: https://community.hortonworks.com/questions/20404/move-hive-server-from-one-node-to-another-in-hdp-c.html