Member since: 12-11-2015
Posts: 151
Kudos Received: 27
Solutions: 26
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 63 | 01-09-2023 12:20 PM
 | 2059 | 04-01-2020 11:19 PM
 | 1727 | 03-23-2020 03:59 AM
 | 818 | 03-22-2020 11:32 PM
 | 2107 | 03-22-2020 06:32 AM
01-09-2023
12:20 PM
2 Kudos
You can set a quota on /tmp. Once the quota is reached, further writes to the directory will fail. The steps to enable quotas are at https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/scaling-namespaces/topics/hdfs-set-quotas-cm.html
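For reference, the same can be done from the command line with dfsadmin; a minimal sketch (10g and 100000 are example values):

# space quota: cap the total size of data under /tmp
hdfs dfsadmin -setSpaceQuota 10g /tmp
# name quota: cap the number of files and directories under /tmp
hdfs dfsadmin -setQuota 100000 /tmp
# verify the quota and current usage
hdfs dfs -count -q -h /tmp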
12-12-2022
08:10 AM
The JIRA HADOOP-9640 added the FairCallQueue feature, which allows well-behaved users (those not hogging the NN RPC queue) to get fair response times. An explanation of how this works is available in this video: https://www.youtube.com/watch?v=7Axz3bO18l8&ab_channel=DataWorksSummit
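For reference, FairCallQueue is enabled per RPC port in core-site.xml; a minimal sketch assuming the NameNode RPC port is 8020 (adjust the port embedded in the property names to match your cluster):

<property>
  <name>ipc.8020.callqueue.impl</name>
  <value>org.apache.hadoop.ipc.FairCallQueue</value>
</property>
<property>
  <name>ipc.8020.scheduler.impl</name>
  <value>org.apache.hadoop.ipc.DecayRpcScheduler</value>
</property>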
04-27-2020
12:22 AM
Can you share the logs of the stuck application?

yarn logs -applicationId <appID>

Replace <appID> with the stuck application ID.
04-26-2020
11:17 PM
Can you share the exact steps/list of configurations you changed to configure Kerberos in Kafka? At the time of the broker failure, what is the exact error you notice on the ZooKeeper side?

4:29:51.371 AM ERROR ZooKeeperClient [ZooKeeperClient] Auth failed.

Did you tweak any configuration on the ZooKeeper side too?
04-26-2020
10:57 PM
What's the error you see in the Flume agent logs? Can you attach the whole log file?
04-01-2020
11:19 PM
Hi @Amn_468 Please configure it in CM > HDFS > Configuration >
"Java Heap Size of NameNode in Bytes"
Enter a value per your requirement
Save and restart
04-01-2020
09:33 PM
The log that was provided doesn't include the actual mapper logs. May I know how you collected this log? Can you run the below command and provide the logs?

yarn logs -applicationId application_1585210670345_1594
03-31-2020
09:56 AM
Can you attach the Oozie launcher application logs?
03-31-2020
09:50 AM
Are there any errors in the JHS logs, especially around this timeframe: 2020-03-31 13:14:* ?
03-29-2020
08:42 AM
The connection to ZooKeeper seems to fail from HS2. Is your ZooKeeper up and running? Also, is the ZooKeeper service dependency enabled in the HS2 configuration (CM > Hive > Configuration > ZooKeeper Service > select the service > Save and Restart)?
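A quick way to confirm ZooKeeper is up is its four-letter-word interface; a minimal sketch, assuming a server named zk-host01 (a placeholder) listening on the default client port 2181:

# expect the reply "imok" if the server is up and serving requests
echo ruok | nc zk-host01 2181
# "stat" additionally prints the server mode (leader/follower) and client connections
echo stat | nc zk-host01 2181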
03-27-2020
09:37 AM
The call to this region server 1.1.1.1:60020 is getting closed instantly:

Caused by: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Call to hostname003.enterprisenet.org/1.1.1.1:60020 failed on local exception: org.apache.hadoop.hbase.exceptions.ConnectionClosingException: Connection to hostname003.enterprisenet.org/1.1.1.1:60020 is closing. Call id=4045, waitTime=2

1. Is there any hbase-site.xml bundled with your application jar? (A quick way to check is sketched below.)
2. If yes, can you rebuild the jar with the latest hbase-site.xml from /etc/hbase/conf/?
3. I am not sure if the server is printing any ERROR, but it is worth checking what exactly is happening in the RS logs on node hostname003.enterprisenet.org at 2020 Mar 27 01:18:16 (i.e., when the connection from the client was closed).
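A minimal sketch for step 1, assuming your application jar is named app.jar (a placeholder):

# list the jar contents and look for a bundled hbase-site.xml
jar tf app.jar | grep hbase-site.xml
# if found, refresh it from the client config before resubmitting
jar uf app.jar -C /etc/hbase/conf hbase-site.xml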
03-27-2020
02:15 AM
Can you attach the full exception or the error log? It's unclear what the actual error is from the snippet you shared in your last response.
03-26-2020
09:00 PM
These are 2 separate issues.

ERROR 1:

Did you delete down to /disk{1,2,3,4,5}/yarn/nm/usercache/mcaf, or did you delete down to /disk{1,2,3,4,5}/yarn/nm/usercache/? If you deleted down to /disk{1,2,3,4,5}/yarn/nm/usercache/, then please restart all the NodeManagers. If not, can you please let me know how many NodeManagers you have in this cluster, and can you run the following across all those machines?

namei -l /disk{1,2,3,4,5}/yarn/nm/usercache/

Please paste your result with the "Insert or code sample" option in the portal so that it has better readability.

ERROR 2:

Mar26 11:36:00,863 main com.class.engineering.portfolio.dmxsloader.main.DMXSLoaderMain: org.apache.hadoop.hbase.client.RetriesExhaustedException thrown: Can't get the location

a. The machine from which you are submitting this job - does it have an HBase gateway installed? If not, can you run it from a machine that has an HBase gateway?
b. Also, since you said this job worked as the hbase user and not as mcaf - have you attempted to grant the mcaf user permission on the table you are trying to access? https://docs.cloudera.com/documentation/enterprise/5-14-x/topics/cdh_sg_hbase_authorization.html#topic_8_3_2 has the steps (a grant sketch is below this list).
c. What is the error you see in the HMaster logs at the exact timestamp you notice this error in the job?
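For item b, a minimal sketch of the grant from the HBase shell, assuming the table is named my_table (a placeholder) and the commands are run as a user with admin rights:

# inside `hbase shell`: grant mcaf read/write/execute on the table
grant 'mcaf', 'RWX', 'my_table'
# verify the permissions now in place
user_permission 'my_table'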
03-25-2020
11:06 PM
These app cache directories get auto-generated upon job submission - so you can remove them from the NodeManagers (so that they get created fresh with the required ACLs) and then re-submit the job:

/disk{1,2,3,4,5}/yarn/nm/usercache/mcaf
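A minimal sketch of the cleanup, assuming passwordless ssh and a file nodemanagers.txt listing the NodeManager hosts (both are assumptions):

# remove the stale user cache on every NodeManager, then re-submit the job
for host in $(cat nodemanagers.txt); do
  ssh "$host" 'rm -rf /disk{1,2,3,4,5}/yarn/nm/usercache/mcaf'
done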
03-23-2020
03:59 AM
" although same property ( dfs.datanode.balance.max.concurrent.moves) already exists in Cloudera Manager." --> Okay, I assume you are referring to the one highlighted in screenshot below Yes its unnecessary to add dfs.datanode.balance.max.concurrent.moves in Balancer Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml if you had used the "Maximum Concurrent Moves" section. Also note that this "Maximum Concurrent Moves" is scoped only to balancer and not to datanodes. So for datanodes you have to explicitly set it using " DataNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml" Regarding reason for why to add this property both for balancer and datanode is mentioned in my previous comment. Hope that clarifies and let me know if there are further questions I will raise an internal jira for correcting the document to avoid duplicate entry on balancer safety-valve.
03-22-2020
11:32 PM
Yes, you can install CM offline after downloading the packages. It's documented in this link: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_ig_create_local_package_repo.html#internal_package_repo Once the repo is ready, you can install the binaries using the steps in this link: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/install_cloudera_packages.html#id_z2h_pnm_25
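As a rough sketch of the first link for an RHEL/CentOS host (the web-server layout and directory names here are assumptions; follow the doc for your OS):

# serve the downloaded CM packages over http
sudo yum install -y httpd createrepo
sudo mkdir -p /var/www/html/cloudera-repos/cm6
# copy the downloaded cloudera-manager RPMs into that directory, then:
sudo createrepo /var/www/html/cloudera-repos/cm6
sudo systemctl start httpd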
03-22-2020
10:09 PM
This error usually happens if the client doesn't match the QOP on the server. Can you share the connection string used in your code snippet? Is your HiveServer2 kerberized? Can you please share the value set for the property hive.server2.thrift.sasl.qop in your HiveServer2's hive-site.xml? An example connection string is in this link: https://github.com/dropbox/PyHive/pull/135/files/ec5270c4b6556bcd20f0f81afbced4a69ca9eff0
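To pull that value quickly (assuming the client config lives under /etc/hive/conf, which may differ on your node):

# prints the configured QOP value (auth, auth-int, or auth-conf)
grep -A1 'hive.server2.thrift.sasl.qop' /etc/hive/conf/hive-site.xml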
03-22-2020
08:48 PM
You would need to tune your heap in accordance with the number of files. The tuning guideline is in this document: https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.6.5/bk_command-line-installation/content/configuring-namenode-heap-size.html If you would like to get a count of files, you may run:

hdfs dfs -count /
03-22-2020
08:16 PM
Just a correction: the document suggests tuning the property dfs.datanode.balance.max.concurrent.moves, not dfs.datanode.ec.reconstruction.xmits.weight.

Regarding the question of why to add dfs.datanode.balance.max.concurrent.moves again when it is already present on the DataNode and the balancer: the doc says "Add the following code to the configuration field, for example, setting the value to 50." That is, 50 is just an example number; the document doesn't mandate setting this value to 50. You can tune it to whatever value your requirement calls for.

Then why add it on both the balancer and the DataNode? Setting it on the HDFS Balancer (client) gives you the flexibility to change the value on the client side at runtime (see the sketch below), i.e., you can set this property to a value less than or equal to what you have configured on the DataNode side. The reason we set it on the server side is to impose a limit on how high the property can be configured: if you configure a value greater than what is set on the DataNode (server), the DataNode rejects it.
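A minimal sketch of the client-side runtime override (32 is an example value; it must be less than or equal to the DataNode-side setting):

hdfs balancer -Ddfs.datanode.balance.max.concurrent.moves=32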
03-22-2020
06:32 AM
The error suggests the DFSClient is unable to read the blocks due to a connection failure: either the ports are blocked or they are unreachable from the node. From the node on which you are running the code snippet (or the node on which the executor ran), try reading the file using hdfs commands in debug mode, which can give further clues on which node/service the client was trying to reach prior to the connect timeout:

export HADOOP_ROOT_LOGGER=DEBUG,console
hdfs dfs -cat hdfs://ec2-18-234-71-106.compute-1.amazonaws.com:8020/dataset/Tech.csv
03-22-2020
06:09 AM
@erkansirin78 Let me make sure I understand the issue correctly. By "Before restart, I saw totally different properties added," did you mean the property dfs.datanode.ec.reconstruction.xmits.weight getting added? If yes, it is not actually being added; the preview page is just showing the extra lines surrounding the property you added. Only the lines with a + sign matter.
03-22-2020
05:42 AM
The capacity is not determined based on your cluster size; rather, it is estimated based on NN memory. Refer to the code here for the computation details: https://github.com/cloudera/hadoop-common/blob/134a29f22676e9a5226a07de6c6b9f34934a4625/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LightWeightGSet.java#L326-L362 BlockSize details are not available in the REST response of the NN, but you may use the REST API of Cloudera Manager to get the configs. An example is in this link: https://docs.cloudera.com/documentation/enterprise/5-14-x/topics/cm_intro_api.html#xd_583c10bfdbd326ba--7f25092b-13fba2465e5--7f20__example_jrl_5ln_tm For Ambari, check this link: https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md
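A minimal sketch of pulling the HDFS service config from the CM API; the hostname, credentials, API version, and cluster name are placeholders:

# the full view includes parameter defaults alongside any overridden values
curl -s -u admin:admin 'http://cm-host:7180/api/v19/clusters/Cluster%201/services/hdfs/config?view=full'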
03-16-2020
07:24 AM
The individual APIs called since NameNode start can be obtained from the NN JMX: http://<nn_hostname>:<port>/jmx?qry=Hadoop:service=NameNode,name=RpcDetailedActivityForPort8020 For HiveServer2, the metrics are similarly available in its web UI: http://<hiveserver2_hostname>:<webui_port>/jmx (see https://cwiki.apache.org/confluence/display/Hive/Hive+Metrics). The units are not a per-day aggregate; they are the aggregate since the service started, so you can curl the JMX endpoint and then diff it on a per-day basis.
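A minimal sketch of the curl-and-diff approach (hostname and web UI port are placeholders; run the two curls a day apart):

curl -s 'http://nn-host:50070/jmx?qry=Hadoop:service=NameNode,name=RpcDetailedActivityForPort8020' > rpc-day1.json
curl -s 'http://nn-host:50070/jmx?qry=Hadoop:service=NameNode,name=RpcDetailedActivityForPort8020' > rpc-day2.json
# the counters that differ are the calls made in between
diff rpc-day1.json rpc-day2.json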
03-16-2020
06:41 AM
@MAS The BlockCapacity metric is different from block size. Per the metrics page [https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Metrics.html], BlockCapacity denotes the "current number of block capacity"; its unit is not bytes/MB, it is a number of blocks.
03-16-2020
01:44 AM
Yeah, that's right. Unfortunately, there is no feature available to gather NiFi lineage in Navigator.
03-16-2020
01:22 AM
Right. In CDH it's not available, but with CDP you have the option to install Atlas, which already has integration with NiFi: https://docs.cloudera.com/cdpdc/7.0/overview/topics/cdpdc-overview.html

- Data Engineering: ingest, transform, and analyze data. Services: HDFS, YARN, YARN Queue Manager, Ranger, Atlas, Hive Metastore, Hive on Tez, Spark, Oozie, Hue, and Data Analytics Studio
- Data Mart: browse, query, and explore your data in an interactive way. Services: HDFS, YARN, YARN Queue Manager, Ranger, Atlas, Hive Metastore, Impala, and Hue
- Operational Database: low-latency writes, reads, and persistent access to data for Online Transactional Processing (OLTP) use cases. Services: HDFS, Ranger, Atlas, and HBase

The NiFi-Atlas integration steps are here: https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.4.1.1/installing-hdf/content/configure_nifi_for_atlas_integration.html
03-16-2020
01:13 AM
Can you share the full stack trace of the exception so we can understand this further?
03-16-2020
01:09 AM
At present, Navigator doesn't support gathering lineage from NiFi. However, within NiFi itself there is lineage at the flowfile level. You can get the steps from this link: https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.4.1.1/getting-started-with-apache-nifi/content/lineage-graph.html
03-10-2020
07:57 PM
49213 open("/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8", O_RDWR|O_CREAT|O_EXCL, 0666) = -1 EACCES (Permission denied)

During this step, the process is trying to open and get a file descriptor for this file, and access was denied: /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8 So far we have inspected its parent directories and haven't seen any issues with them. Can we get details of this file, and of the yarn user, too?

ls -ln /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8
stat /var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir/libleveldbjni-64-1-6110205147654050510.8
id yarn
03-10-2020
08:57 AM
This is much clearer now.

On the server side, the request was rejected because the client was initiating a non-SSL connection:

Caused by: org.apache.thrift.transport.TTransportException: javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?

On the client side, it was unable to trust the server certs because it was not configured to use a truststore:

Caused by: com.cloudera.hiveserver2.support.exceptions.GeneralException: [Cloudera][HiveJDBCDriver](500164) Error initialized or created transport for authentication: [Cloudera][HiveJDBCDriver](500169) Unable to connect to server: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target.

You need to add a few more properties to your connection string:

jdbc:hive2://vdbdgw01dsy.dsone.3ds.com:10000/default;AuthMech=1;KrbAuthType=1;KrbHostFQDN=vdbdgw01dsy.dsone.3ds.com;KrbRealm=DSONE.3DS.COM;KrbServiceName=hive;LogLevel=6;LogPath=d:/TestPLPFolder/hivejdbclog;SSL=1;SSLTrustStore=<path_to_truststore>;SSLTrustStorePwd=<password_to_truststore>

If you don't have a password for your truststore, you can omit the SSLTrustStorePwd parameter.