About venkatsambath

venkatsambath · ‎12-12-2022

HADOOP-9640 added a feature that gives well-behaved users (those not hogging the NameNode RPC queue) a fair response time. This video explains how it works: https://www.youtube.com/watch?v=7Axz3bO18l8&ab_channel=DataWorksSummit
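To make the idea concrete, here is a toy model of fair call queuing in Python. It is only a sketch of the general technique (priority levels by share of recent calls, served by weighted round-robin), not Hadoop's implementation; the thresholds, weights, and function names are all assumptions chosen for illustration.

```python
from collections import Counter, deque

# Assumed, illustrative values (not taken from Hadoop's source or defaults).
THRESHOLDS = (0.125, 0.25, 0.5)  # share of calls that pushes a user down a level
WEIGHTS = (8, 4, 2, 1)           # calls served per round from each level


def priority_for(user, call_counts):
    """Return 0 (best) to len(THRESHOLDS) (worst) based on the user's call share."""
    total = sum(call_counts.values())
    if total == 0:
        return 0
    share = call_counts.get(user, 0) / total
    return sum(1 for threshold in THRESHOLDS if share >= threshold)


def schedule(calls):
    """Take user names in arrival order and return the order they are served in."""
    counts = Counter()
    queues = [deque() for _ in WEIGHTS]
    for user in calls:
        counts[user] += 1
        queues[priority_for(user, counts)].append(user)

    served = []
    while any(queues):
        # Weighted round-robin: higher-priority queues get more turns per round.
        for level, weight in enumerate(WEIGHTS):
            for _ in range(weight):
                if queues[level]:
                    served.append(queues[level].popleft())
    return served
```

In this model, a user who arrives after another user has already queued 20 calls is still served first, because the heavy user's calls sit in the lowest-priority queue.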

venkatsambath · ‎04-26-2020

Can you share the exact steps and the list of configurations you changed to configure Kerberos in Kafka? At the time of this failure in the broker, what is the exact error you see on the ZooKeeper side?
4:29:51.371 AM ERROR ZooKeeperClient [ZooKeeperClient] Auth failed.
Did you change any configuration on ZooKeeper too?

venkatsambath · ‎04-01-2020

Hi @Amn_468,
Please configure it as follows:
1. Go to CM > HDFS > Configuration > Java Heap Size of NameNode in Bytes.
2. Enter a value per your requirement.
3. Save and restart.

venkatsambath · ‎03-31-2020

Are there any errors in the JHS logs, especially around the 2020-03-31 13:14:* timeframe?

venkatsambath · ‎03-27-2020

The call to this region server 1.1.1.1:60020 is getting closed instantly:
Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to hostname003.enterprisenet.org/1.1.1.1:60020 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to hostname003.enterprisenet.org/1.1.1.1:60020 is closing. Call id=4045, waitTime=2
1. Is there any hbase-site.xml bundled with your application jar?
2. If yes, can you rebuild the jar with the latest hbase-site.xml from /etc/hbase/conf/?
3. I am not sure whether the server is printing any ERROR, but it is worth checking what exactly is happening in the RS logs on node hostname003.enterprisenet.org at 2020 Mar 27 01:18:16 (i.e. when the connection from the client is closed).
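One quick way to answer question 1 is to list the jar's entries, since a jar is a zip archive. The sketch below uses Python's standard zipfile module; the function name bundled_configs is hypothetical, and the jar path is whatever your application jar is.

```python
import zipfile


def bundled_configs(jar_path, name="hbase-site.xml"):
    """Return every entry in the jar whose file name equals `name`."""
    with zipfile.ZipFile(jar_path) as jar:
        return [
            entry
            for entry in jar.namelist()
            if entry.rsplit("/", 1)[-1] == name
        ]
```

An empty list means no hbase-site.xml is bundled; any returned entry is a copy that may shadow the one in /etc/hbase/conf/.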

venkatsambath · ‎03-27-2020

Can you attach the full exception or the error log? It's unclear what the actual error is from the snippet you posted in your last response.

venkatsambath · ‎03-26-2020

These are 2 separate issues.

ERROR1:
Did you delete up to /disk{1,2,3,4,5}/yarn/nm/usercache/mcaf, or did you delete up to /disk{1,2,3,4,5}/yarn/nm/usercache/?
If you deleted up to /disk{1,2,3,4,5}/yarn/nm/usercache/, then please restart all the NodeManagers.
If not, can you please let me know how many NodeManagers you have in this cluster? Can you also run namei -l /disk{1,2,3,4,5}/yarn/nm/usercache/ across all those machines? Please paste your result using the "Insert or code sample" option in the portal so that it is more readable.

ERROR2:
Mar26 11:36:00,863 main com.class.engineering.portfolio.dmxsloader.main.DMXSLoaderMain: org.apache.hadoop.hbase.client.RetriesExhaustedException thrown: Can't get the location
a. Does the machine from which you are submitting this job have an HBase gateway installed? If not, can you run it from a machine that has an HBase gateway?
b. Also, since you said this job worked as the hbase user and not as mcaf, have you tried granting mcaf permission on the table you are trying to access? https://docs.cloudera.com/documentation/enterprise/5-14-x/topics/cdh_sg_hbase_authorization.html#topic_8_3_2 has the steps.
c. What error do you see in the HMaster logs at the exact timestamp you notice this error in the job?
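If namei is not available on a node, a rough Python equivalent of the check requested under ERROR1 is sketched below. It reports the mode and the numeric owner and group of every component of a path, which is similar in spirit to namei -l but not identical (namei prints names rather than numeric ids). The function names are hypothetical.

```python
import os
import stat


def describe_path(path):
    """Return (mode, uid, gid, component) for each component of `path`."""
    rows = []
    current = os.sep
    names = [""] + [p for p in os.path.abspath(path).split(os.sep) if p]
    for name in names:
        current = os.path.join(current, name)
        try:
            st = os.lstat(current)
        except OSError as exc:
            # Stop at the first component that is missing or unreadable.
            rows.append(("?", None, None, f"{current} ({exc.strerror})"))
            break
        rows.append((stat.filemode(st.st_mode), st.st_uid, st.st_gid, current))
    return rows


def usercache_dirs(disks=range(1, 6)):
    """Expand /disk{1,2,3,4,5}/yarn/nm/usercache/ into explicit paths."""
    return [f"/disk{n}/yarn/nm/usercache/" for n in disks]


if __name__ == "__main__":
    for directory in usercache_dirs():
        for mode, uid, gid, component in describe_path(directory):
            print(mode, uid, gid, component)
```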

venkatsambath · ‎03-25-2020

These app cache directories are auto-generated upon job submission. Can you remove /disk{1,2,3,4,5}/yarn/nm/usercache/mcaf from the NodeManagers (so that they are created fresh with the required ACLs) and then re-submit the job?
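If you prefer to script the cleanup on each NodeManager host, the following is a minimal sketch, assuming the five-disk layout above. The helper names and the dry_run and root parameters are hypothetical; it defaults to a dry run so you can review what would be removed before deleting anything.

```python
import os
import shutil


def user_cache_paths(user, disks=range(1, 6), root="/"):
    """Build <root>disk<N>/yarn/nm/usercache/<user> for each disk."""
    return [
        os.path.join(root, f"disk{n}", "yarn", "nm", "usercache", user)
        for n in disks
    ]


def remove_user_cache(user, disks=range(1, 6), root="/", dry_run=True):
    """Return the existing cache dirs; delete them only when dry_run is False."""
    matched = []
    for path in user_cache_paths(user, disks, root):
        if os.path.isdir(path):
            if not dry_run:
                shutil.rmtree(path)
            matched.append(path)
    return matched
```

Call remove_user_cache("mcaf") first to see the list, then remove_user_cache("mcaf", dry_run=False) to delete.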

venkatsambath · ‎03-23-2020

"although same property (dfs.datanode.balance.max.concurrent.moves) already exists in Cloudera Manager." --> Okay, I assume you are referring to the one highlighted in the screenshot below.
Yes, it is unnecessary to add dfs.datanode.balance.max.concurrent.moves to the Balancer Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml if you have used the "Maximum Concurrent Moves" setting. Also note that "Maximum Concurrent Moves" is scoped only to the balancer and not to the DataNodes, so for the DataNodes you have to set it explicitly using the "DataNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml".
The reason for adding this property for both the balancer and the DataNodes is mentioned in my previous comment. Hope that clarifies; let me know if there are further questions.
I will raise an internal jira to correct the document and avoid the duplicate entry in the balancer safety valve.
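For the DataNode safety valve, the entry is an ordinary hdfs-site.xml property element. The sketch below renders one with Python's standard library so it can be pasted into the snippet field; the helper name is hypothetical and the value 50 is only a placeholder, not a recommendation.

```python
import xml.etree.ElementTree as ET


def safety_valve_property(name, value):
    """Render a single <property> element in hdfs-site.xml form."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(prop, encoding="unicode")


# 50 is an assumed placeholder value; use the same value you set for the balancer.
print(safety_valve_property("dfs.datanode.balance.max.concurrent.moves", 50))
```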

venkatsambath · ‎03-22-2020

Yes, you can install CM offline after downloading the packages. It is documented here: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_ig_create_local_package_repo.html#internal_package_repo
Once the repo is ready, you can install the binaries using the steps here: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/install_cloudera_packages.html#id_z2h_pnm_25

Last Visited	‎01-02-2025 01:22 PM

Member Since	‎12-11-2015 07:09 AM
Posts	208
Kudos received	30

Cloudera Community

Re: Utilization Report - Cloudera Platform

Re: Run 2 kerberos ticket in a server for transfer...

Re: in-place upgrade CM problem(CM 7.4.4 to CM 7.7...

Re: Hive query failed with java.io.IOException: Ca...

Re: limit the size of files that an application ca...

Re: How to restrict RPC bandwidth of specific user...

Re: Unabke to start brokers and zookeepers with au...

Re: Name Node Pause duration

Re: Unknown Job ID for long running jobs on Histor...

Re: Yarn jobs are failing after enabling MIT-Kerbe...

Re: Yarn jobs are failing after enabling MIT-Kerbe...

Re: Yarn jobs are failing after enabling MIT-Kerbe...

Re: Yarn jobs are failing after enabling MIT-Kerbe...

Re: HDFS Balancer: Why configure same property?

Re: cloudera manger offline installation