Member since: 03-14-2016
Posts: 4721
Kudos Received: 1110
Solutions: 874
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1714 | 04-27-2020 03:48 AM
 | 3358 | 04-26-2020 06:18 PM
 | 2677 | 04-26-2020 06:05 PM
 | 2088 | 04-13-2020 08:53 PM
 | 3108 | 03-31-2020 02:10 AM
09-07-2018
01:09 AM
3 Kudos
@Andrew Mills
Regarding the following error:

ERROR 2018-09-06 13:09:36,166 NetUtil.py:96 - EOF occurred in violation of protocol (_ssl.c:579)

You can find a more detailed explanation of why this error occurs and how to remediate it here: https://community.hortonworks.com/articles/188269/javapython-updates-and-ambari-agent-tls-settings.html

Summary: This can happen when the Ambari Agent tries to communicate with the Ambari Server using TLSv1 instead of the TLS version mandated by the upgraded JDK, which is TLSv1.2.

There are two situations to consider when solving this problem:
1.) If you are running CentOS 6 or SLES 11, the bundled version of Python (2.6.x) does not work with TLSv1.2, so you must make changes to your newly updated JDK in order to proceed.
2.) If you are running CentOS 7, Debian 7, Ubuntu 14 & 16, or SLES 12, the bundled version of Python (2.7.x) does work with TLSv1.2, so you only have to change the Ambari Agent configuration to tell it to use TLSv1.2 in order to proceed.

Solution for CentOS 7, Debian 7, Ubuntu 14 & 16, or SLES 12 (Python 2.7)
Configure the Ambari Agent to use TLSv1.2 when communicating with the Ambari Server by editing each Ambari Agent's /etc/ambari-agent/conf/ambari-agent.ini file and adding the following property to the security section:

[security]
force_https_protocol=PROTOCOL_TLSv1_2

Once this configuration change has been made, restart the Ambari Agent. After the restart you should no longer see these ERRORs in the Ambari Agent logs, and in the Ambari Server UI you will notice that all Ambari Agents are heartbeating again.

Solution for CentOS 6 or SLES 11 (Python 2.6)
In this scenario the only way forward is to edit the java.security file in the JDK being used by the Ambari Server and make the following changes: locate the jdk.tls.disabledAlgorithms property and remove the 3DES_EDE_CBC reference, then save the file and restart the Ambari Server.
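If you want to script the agent-side change across hosts, here is a minimal sketch for a single host, assuming GNU sed and that ambari-agent.ini already contains a [security] section (adjust if you manage the file differently):

# sed -i '/^\[security\]/a force_https_protocol=PROTOCOL_TLSv1_2' /etc/ambari-agent/conf/ambari-agent.ini
# grep force_https_protocol /etc/ambari-agent/conf/ambari-agent.ini
# ambari-agent restart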
09-07-2018
01:05 AM
1 Kudo
@Wayne Pakkala
You can try either of the following approaches to address this issue:

Option 1) From the MySQL side: edit the "/etc/my.cnf" file to set the server time zone to EDT or something similar (please pick the correct value for your environment):

default_time_zone='+08:00'

Then restart the MySQL server. Alternatively, set it at runtime (choose your own time zone; the following is just an example):

mysql> SET GLOBAL time_zone = 'Australia/Sydney';

followed by a MySQL restart. Reference: https://dev.mysql.com/doc/refman/8.0/en/time-zone-support.html

Option 2) You can also try adding "serverTimezone=EDT" to your MySQL URL in the Hive settings from the Ambari UI:

Change from: jdbc:mysql://xxxx.yyyy/hive?createDatabaseIfNotExist=true
Change to: jdbc:mysql://xxxx.yyyy/hive?createDatabaseIfNotExist=true&serverTimezone=EDT
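Before and after the change, you can confirm what the server is actually using with the standard MySQL time zone variables:

mysql> SELECT @@global.time_zone, @@session.time_zone;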
09-05-2018
07:16 AM
@Arjun Das Maybe you can download it from the Derby site, or just for testing you can try this: http://www.java2s.com/Code/Jar/d/Downloadderby101011jar.htm
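If you prefer pulling the JAR from Maven Central instead, a sketch (assuming the 10.10.1.1 version implied by the link above is the one you need) would be:

# wget https://repo1.maven.org/maven2/org/apache/derby/derby/10.10.1.1/derby-10.10.1.1.jar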
09-05-2018
06:19 AM
1 Kudo
@Arjun Das You are basically getting the error: Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.AutoloadedDriver40. Can you please confirm that you have placed the driver JAR containing the class "org.apache.derby.jdbc.AutoloadedDriver40" inside the $SQOOP_HOME/lib directory of your Sqoop installation, as described in: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_data-movement-and-integration/content/apache_sqoop_connectors.html
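A quick way to verify the class is really inside the JAR you placed there (assuming the file is named derby-*.jar; adjust the name to match your copy):

# unzip -l $SQOOP_HOME/lib/derby-*.jar | grep AutoloadedDriver40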
09-05-2018
04:45 AM
@Jalender "percentFilesLocal" which is basically a Percent of store file data that can be read from the local DataNode, 0-100. This is actually a Region Server Property hence you should be able to get that value by making the JMX call to individual RegionServer. Like following: # curl -S "http://region_1.example.com:16030/jmx?qry=Hadoop:service=HBase,name=RegionServer,sub=Server" | grep 'percentFilesLocal'
.
.
# curl -S "http://region_2.example.com:16030/jmx?qry=Hadoop:service=HBase,name=RegionServer,sub=Server" | grep 'percentFilesLocal' .
09-05-2018
03:09 AM
@Taehyeon Lee
The problem seems to be with your MySQL database, as we see the following cause for the failure:

Metastore connection URL: jdbc:mysql://slave1.xxxxxx.com/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException :
Communications link failure The last packet sent successfully to the server was 0 milliseconds ago.
The driver has not received any packets from the server. SQL Error code: 0 Use --verbose for detailed stacktrace.
schemaTool failed

The "Communications link failure" error indicates that your MySQL database might not be running, or that there is a communication issue. So please check whether the MySQL server is running on the mentioned host "slave1.xxxxxx.com", and whether its default port (3306) is reachable from the Spark host:

# telnet slave1.xxxxx.com 3306

You should also check and, if needed, edit the "bind-address" attribute inside "/etc/my.cnf" so that MySQL binds to the hostname or to all listen addresses:

bind-address=0.0.0.0

Per https://dev.mysql.com/doc/refman/5.7/en/server-options.html: if the address is 0.0.0.0, the server accepts TCP/IP connections on all server host IPv4 interfaces; if the address is ::, the server accepts TCP/IP connections on all server host IPv4 and IPv6 interfaces.

Also, on the MySQL server host, check whether port 3306 is listening and MySQL is running fine; try restarting the MySQL service. Check whether the firewall is disabled on the MySQL server host:

# netstat -tnlpa | grep 3306
# service iptables stop
# systemctl disable firewalld

Some more details about troubleshooting the "Communications link failure" error can be found here: https://community.hortonworks.com/questions/139703/hive-metastore-trouble-with-jbdc-mysql.html
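For completeness, restarting and checking MySQL on a systemd-based host looks roughly like this (the unit name is an assumption; it may be mysqld, mysql, or mariadb depending on your installation):

# systemctl restart mysqld
# systemctl status mysqld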
09-05-2018
03:04 AM
@Jalender Are you talking about "Percent Files Local", i.e. the percentage of files served from the local DataNode for the RegionServer? In that case you can use AMS Grafana/Metrics.

Ambari Metrics Grafana: search for "Percent Files Local" on the dashboard:
http://$GRAFANA_HOST:3000/dashboard/db/hbase-regionservers

Or directly make the AMS API call:
http://$AMBARI_METRICS_COLLECTOR_HOST:6188/ws/v1/timeline/metrics/?metricNames=regionserver.Server.percentFilesLocal._avg&hostname=%&appId=hbase&instanceId=&startTime=1536116047&endTime=1536116347

Reference: https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.2.2/bk_ambari-operations/content/grafana_hbase_regionservers.html
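The startTime/endTime parameters are epoch seconds; a rough sketch of querying the last five minutes from the shell (assumes GNU date and that $AMBARI_METRICS_COLLECTOR_HOST is set in your environment):

# START=$(date -d '5 minutes ago' +%s); END=$(date +%s)
# curl -s "http://$AMBARI_METRICS_COLLECTOR_HOST:6188/ws/v1/timeline/metrics/?metricNames=regionserver.Server.percentFilesLocal._avg&hostname=%&appId=hbase&startTime=$START&endTime=$END"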
09-05-2018
01:52 AM
1 Kudo
@Michael Bronson The file path you shared is on HDFS: /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz. To identify "corrupt" or "missing" blocks and know whether the file is healthy, you can use the hdfs fsck command-line utility:

# su - hdfs -c "hdfs fsck /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz"
Connecting to namenode via http://hdfcluster2.example.com:50070/fsck?ugi=hdfs&path=%2Fhdp%2Fapps%2F2.6.4.0-91%2Fspark2%2Fspark2-hdp-yarn-archive.tar.gz
FSCK started by hdfs (auth:SIMPLE) from /172.22.197.159 for path /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz at Wed Sep 05 01:51:25 UTC 2018
.Status: HEALTHY
Total size: 189997800 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 2 (avg. block size 94998900 B)
Minimally replicated blocks: 2 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 4
Number of racks: 1
FSCK ended at Wed Sep 05 01:51:25 UTC 2018 in 35 milliseconds
The filesystem under path '/hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz' is HEALTHY

HDFS will attempt to recover the situation automatically. By default there are three replicas of any block in the cluster, so if HDFS detects that one replica of a block has become corrupt or damaged, it will create a new replica of that block from a known-good replica and mark the damaged one for deletion. The chance of all three replicas of the same block becoming damaged is so remote that it would suggest a significant failure somewhere else in the cluster. If this situation does occur and all three replicas are damaged, then 'hdfs fsck' will report that block as "corrupt", i.e. HDFS cannot self-heal the block from any of its replicas.

There are also some articles you can refer to in order to fix "Under replicated Blocks", like:
https://community.hortonworks.com/articles/4427/fix-under-replicated-blocks-in-hdfs-manually.html
How to fix missing/corrupted/under or over-replicated blocks?
https://community.hortonworks.com/content/supportkb/49106/how-to-fix-missingcorruptedunder-or-over-replicate.html
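If you want to scan for corrupt or missing blocks across the whole cluster rather than a single file, the standard fsck option for that can be used, for example:

# su - hdfs -c "hdfs fsck / -list-corruptfileblocks"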
09-05-2018
12:56 AM
2 Kudos
@Michael Bronson As the NameNode report and UI (including the Ambari UI) show that your DFS used is reaching almost 87% to 90%, it would be really good if you can increase the DFS capacity.

To understand in detail how Non DFS Used = Configured Capacity - DFS Remaining - DFS Used, you can refer to the following article, which explains the concepts of Configured Capacity, Present Capacity, DFS Used, DFS Remaining, and Non DFS Used in HDFS. The diagram in it clearly explains these space parameters, treating HDFS as a single disk:
https://community.hortonworks.com/articles/98936/details-of-the-output-hdfs-dfsadmin-report.html

The above is one of the best articles for understanding the DFS and Non-DFS calculations and their remedy.

You add capacity by giving dfs.datanode.data.dir more mount points or directories. In Ambari the property lives in hdfs-site.xml (its exact place in the config UI varies by Ambari version; look under the advanced section if needed). The more new disks you provide through the comma-separated list, the more capacity you will have. Preferably every machine should have the same disk and mount point structure.
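Purely as an illustration, with two hypothetical mount points /grid/0 and /grid/1, the property in hdfs-site.xml would look roughly like this:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/grid/0/hadoop/hdfs/data,/grid/1/hadoop/hdfs/data</value>
</property>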
09-04-2018
10:05 AM
@Michael Bronson As "HDFS Disk Usage" shows: The percentage of distributed file system (DFS) used, which is a combination of DFS and non-DFS used. The NameNode commands/UI shows that the DFS Used is around 87.06% and Non DFS Used is 0% So which is almost same which ambari is showing like almost 88% (DFS + Non DFS Usage) so there seems to be no contradiction to me. Please let us know what is the value you are expecting.