Member since 10-03-2020
236 Posts
15 Kudos Received
18 Solutions
My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
|  | 1717 | 11-11-2024 09:31 AM |
|  | 2089 | 08-28-2023 02:13 AM |
|  | 2548 | 12-15-2021 05:26 PM |
|  | 2315 | 10-22-2021 10:09 AM |
|  | 6168 | 10-20-2021 08:44 AM |
08-02-2022 03:32 AM
Hello @syedshakir,

Please let us know your CDH version.

Case A: If I understand correctly, you have a Kerberized cluster and the file is on the local filesystem, not on HDFS, so you do not need Kerberos authentication. The Google documentation below covers a few ways to upload it:
https://cloud.google.com/storage/docs/uploading-objects#upload-object-cli

Case B: To be honest, I have never done this, so I would try the following:
1. Follow the document below to configure Google Cloud Storage with Hadoop:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_gcs_config.html
2. If distcp does not work, follow this document to configure some properties:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cdh_admin_distcp_secure_insecure.html
3. Save the whole distcp output and upload it here; I can help you check it. Remember to remove sensitive information (such as hostnames and IPs) from the logs before you upload.

If the distcp output does not contain Kerberos-related errors, enable debug logging, then re-run the distcp job and save the new output:

export HADOOP_ROOT_LOGGER="DEBUG,console"
export HADOOP_OPTS="-Dsun.security.krb5.debug=true"

Thanks,
Will
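For reference, a distcp run against GCS would look like the sketch below once the connector from step 1 is configured; the NameNode host, source path, and bucket name (my-bucket) are placeholders, not values from this thread:

# hypothetical example: copy an HDFS directory into a GCS bucket
hadoop distcp hdfs://namenode:8020/user/data gs://my-bucket/data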
						
					
04-28-2022 01:24 AM
@arunr307, has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
						
					
01-20-2022 05:52 PM
Impala Command Line Argument Advanced Configuration Snippet (Safety Valve):

-kudu_mutation_buffer_size=20971520
-kudu_error_buffer_size=20971520

Tablet Server Advanced Configuration Snippet (Safety Valve) for gflagfile:

-max_cell_size_bytes=20971520

With the settings above (20971520 bytes = 20 MB), it is working fine.

Thanks.
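If you want to confirm that a tablet server actually picked up the gflag after restart, Kudu daemons publish their flags on the embedded web UI's /varz page; the hostname and the default tablet server web port 8050 below are assumptions about your environment:

# placeholder host; 8050 is the default Kudu tablet server web UI port
curl -s http://tserver-host:8050/varz | grep max_cell_size_bytes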
						
					
01-18-2022 06:26 AM
Hi @naveenks,

Please refer to the document below:
https://docs.cloudera.com/documentation/enterprise/5-16-x/topics/cdh_admin_distcp_data_cluster_migrate.html

Thanks,
Will
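As a quick orientation before reading the document, a basic cluster-to-cluster distcp looks like the sketch below; the NameNode hosts and paths are placeholders, and the linked page covers the variations (for example, using webhdfs:// between different CDH versions):

# copy /user/data from the source cluster to the destination cluster
hadoop distcp hdfs://source-nn:8020/user/data hdfs://dest-nn:8020/user/data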
						
					
12-20-2021 10:45 AM
@Kallem, has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks!
						
					
12-16-2021 12:40 PM
Hi @willx,

Is there a way to see whether a Hadoop path is a volume or a directory?
						
					
11-22-2021 04:08 AM
Thanks for your suggestion. I tried using a double slash, but it did not work for me.
						
					
11-09-2021 04:02 AM
Hi @loridigia,

Based on the error you provided ("org.apache.hadoop.hbase.NotServingRegionException: table XXX is not online on worker04"), some regions may not be deployed on any RegionServer yet. Please check whether there are any inconsistencies on this table (the commands are collected in the sketch after this list):

1. Run: sudo -u hbase hbase hbck -details > /tmp/hbck.txt
2. If you see inconsistencies, grep for ERROR in hbck.txt to see which region has the problem.
3. Check whether that region's directory is complete in the output of: hdfs dfs -ls -R /hbase
4. In the HBase shell, run scan 'hbase:meta' and check whether the region's info is up to date in the hbase:meta table.
5. Depending on the type of issue, use the hbck2 jar to fix the inconsistencies:
https://github.com/apache/hbase-operator-tools/tree/master/hbase-hbck2

These are general steps for dealing with this kind of problem; there could be more complex issues behind it. We suggest you file a case with Cloudera support.

Thanks,
Will
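A minimal sketch of steps 1-4 as shell commands; the paths are the defaults from the post, and any region name you follow up on would come from your own hbck output:

# full consistency report (step 1)
sudo -u hbase hbase hbck -details > /tmp/hbck.txt
# regions flagged with problems (step 2)
grep ERROR /tmp/hbck.txt
# region directories on HDFS (step 3)
hdfs dfs -ls -R /hbase
# region assignments recorded in meta (step 4)
echo "scan 'hbase:meta'" | hbase shell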
						
					
10-28-2021 02:57 AM
Hi @uygg,

Please check whether third-party jars such as the Bouncy Castle jars have been added. If that is the cause, please remove them and then restart the RM.

Thanks,
Will
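If it helps, one way to look for stray Bouncy Castle jars is sketched below; the search paths are typical parcel/package locations and are assumptions, not confirmed locations on your cluster:

# search common install locations for Bouncy Castle jars
find /opt/cloudera /usr/lib/hadoop* -name 'bcprov*.jar' 2>/dev/null
# check whether the running ResourceManager has one on its classpath
ps -ef | grep -i resourcemanager | grep -io 'bcprov[^: ]*'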
						
					
10-25-2021 01:30 AM
Hi @kras, thank you for writing back with your observation. Can you please check the details below as well?

1) When the RegionServer JVM reports high CPU, open "top" for the RegionServer PID.
2) Press "Shift+H" to switch to the thread view of the PID. This shows the threads within the RegionServer JVM along with their CPU usage.
3) Monitor the thread view and identify the thread hitting the maximum CPU usage.
4) Take a thread dump (jstack) of the RegionServer PID and compare its threads against the "top" thread view entry consuming the highest CPU (see the sketch after this list).
5) Check the CPU usage of the other services hosted on the RegionServer host.

The process above lets you identify the thread contributing to the CPU usage. Compare it with the other RegionServers, and your team can make a conclusive call on the reason for the CPU utilization. However the logs are reviewed, narrowing the focus of the JVM review will help in identifying the cause. Review the shared links for additional reference.

Ref:
https://www.infoworld.com/article/3336222/java-challengers-6-thread-behavior-in-the-jvm.html
https://blogs.manageengine.com/application-performance-2/appmanager/2011/02/09/identify-java-code-co...
https://blog.jamesdbloom.com/JVMInternals.html

Thanks & Regards,
Prathap Kumar.
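A minimal sketch of steps 1-4, assuming placeholder PID and thread ID values; note that top reports thread IDs in decimal while jstack reports them as hexadecimal nid values, so the printf conversion bridges the two:

PID=12345                                # placeholder RegionServer PID
top -H -p "$PID"                         # per-thread CPU view (same as Shift+H)
printf '0x%x\n' 67890                    # placeholder hot TID from top -> 0x10932
jstack "$PID" > /tmp/rs_threads.txt      # thread dump of the RegionServer
grep -A20 'nid=0x10932' /tmp/rs_threads.txt   # locate the matching thread stack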
						
					