Member since: 07-13-2017
19 Posts
0 Kudos Received
0 Solutions
01-18-2019
01:03 AM
@Bharath Good articles on tuning Hive performance: Hive_performance_tune, Tez_Performance_Tune, ExplainPlan. This is too broad a question to answer fully, but here are my thoughts:
1. Check whether your Hive job has actually started running in the ResourceManager (i.e., it is not stuck in the queue waiting for resources in the ACCEPTED state, etc.).
2. Check how many files are in the table's HDFS directory; too many small files will result in poor performance (see the quick check sketched after this list).
3. Try running the Hive console in debug mode to see where the job spends its time.
4. Check whether there is any skew in the data, and create the table declaring all the skewed columns in the table properties.
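A quick way to spot a small-files problem, as a minimal sketch (the path below is illustrative; substitute the table's actual HDFS location):

$ hdfs dfs -count /warehouse/tablespace/managed/hive/mydb.db/mytable
# Output columns are DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME;
# a very large FILE_COUNT relative to CONTENT_SIZE usually means many small files.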
05-23-2018
05:49 AM
@Bharath N
Try the following steps on the failed DataNode:

1. Get the list of DataNode directories from /etc/hadoop/conf/hdfs-site.xml:

$ grep -A1 dfs.datanode.data.dir /etc/hadoop/conf/hdfs-site.xml
<name>dfs.datanode.data.dir</name>
<value>/data0/hadoop/hdfs/data,/data1/hadoop/hdfs/data,/data2/hadoop/hdfs/data,/data3/hadoop/hdfs/data,/data4/hadoop/hdfs/data,/data5/hadoop/hdfs/data,/data6/hadoop/hdfs/data,/data7/hadoop/hdfs/data,/data8/hadoop/hdfs/data,/data9/hadoop/hdfs/data</value>

2. Get the datanodeUuid by grepping the DataNode log:

$ grep "datanodeUuid=" /var/log/hadoop/hdfs/hadoop-hdfs-datanode-$(hostname).log | head -n 1 | perl -ne '/datanodeUuid=(.*?),/ && print "$1\n"'
1dacef53-aee2-4906-a9ca-4a6629f21347

3. Copy over a VERSION file from one of the <dfs.datanode.data.dir>/current/ directories of a healthy, running DataNode:

$ scp <healthy datanode host>:<dfs.datanode.data.dir>/current/VERSION ./

4. Replace the datanodeUuid in that VERSION file with the datanodeUuid found by the grep above:

$ sed -i.bak -E 's|(datanodeUuid)=(.*$)|\1=1dacef53-aee2-4906-a9ca-4a6629f21347|' VERSION

5. Blank out the storageID= property in the VERSION file:

$ sed -i.bak -E 's|(storageID)=(.*$)|\1=|' VERSION

6. Copy the modified VERSION file into the current/ path of every directory listed in the dfs.datanode.data.dir property of hdfs-site.xml:

$ for i in {0..9}; do cp VERSION /data$i/hadoop/hdfs/data/current/; done

7. Make the VERSION file owned by hdfs:hdfs with permissions 644:

$ for i in {0..9}; do chown hdfs:hdfs /data$i/hadoop/hdfs/data/current/VERSION; done
$ for i in {0..9}; do chmod 644 /data$i/hadoop/hdfs/data/current/VERSION; done

8. One more level down there is a different VERSION file, located under the block pool current folder at /data0/hadoop/hdfs/data/current/BP-*/current/VERSION. This file does not need to be modified -- just place it in the appropriate directories. Copy this particular VERSION file from a healthy DataNode into the current/BP-*/current/ folder of each directory listed in dfs.datanode.data.dir:

$ scp <healthy datanode host>:<dfs.datanode.data.dir>/current/BP-*/current/VERSION ./VERSION2
$ for i in {0..9}; do cp VERSION2 /data$i/hadoop/hdfs/data/current/BP-*/current/VERSION; done

9. Make this VERSION file owned by hdfs:hdfs with permissions 644 as well:

$ for i in {0..9}; do chown hdfs:hdfs /data$i/hadoop/hdfs/data/current/BP-*/current/VERSION; done
$ for i in {0..9}; do chmod 644 /data$i/hadoop/hdfs/data/current/BP-*/current/VERSION; done

10. Restart the DataNode from Ambari. The VERSION file located at <dfs.datanode.data.dir>/current/VERSION will have its storageID repopulated with a regenerated ID.

If losing the data is not an issue (say, for example, the node was previously in a different cluster, or was out of service for an extended time), you can instead delete all data and subdirectories in each dfs.datanode.data.dir (keep the directory itself, though) and restart the DataNode daemon or service.
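After the restart, one way to confirm the DataNode has re-registered is sketched below (run as the hdfs user; the hostname placeholder is whatever node you repaired):

$ hdfs dfsadmin -report | grep -A 6 "<failed datanode host>"
# The node should appear with a non-zero configured capacity and a recent "Last contact" time.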
11-20-2017
06:33 PM
My personal preference is an HDFS put over Flume, if those are the options. Even better would be HDF, but it sounds like a simple HDFS put would solve it.
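For reference, a minimal sketch of the put approach (paths are illustrative):

$ hdfs dfs -mkdir -p /data/landing
$ hdfs dfs -put /local/path/file.csv /data/landing/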
08-27-2018
03:29 PM
Make sure your Secondary NameNode (or Standby NameNode) is running; it performs the periodic checkpointing that merges edits into the fsimage and allows old edit logs to be removed.
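If checkpointing has fallen behind, one way to force a checkpoint manually is sketched below (this briefly puts HDFS into safemode, so do it in a maintenance window, as the hdfs user):

$ hdfs dfsadmin -safemode enter
$ hdfs dfsadmin -saveNamespace
$ hdfs dfsadmin -safemode leave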
04-01-2019
03:23 PM
If you set these properties by hand, take into account that in HDP 3.x the path of the Hive warehouse has changed to /warehouse/tablespace/managed/hive. Also, in the Ambari configuration for Druid on 3.1 the property is set to hive.druid.storage.storageDirectory = {{druid_storage_dir}}, which expands to /apps/druid/warehouse, a different path from the Hive warehouse.
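A quick way to confirm both locations actually exist in HDFS (a sketch using the default paths mentioned above):

$ hdfs dfs -ls -d /warehouse/tablespace/managed/hive /apps/druid/warehouse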
10-02-2017
08:22 PM
@Jay SenSharma
1. # ps -ef | grep hiveserver2
I see the process running. The Ambari UI also shows HiveServer2 as running (screen-shot-2017-10-02-at-21921-pm.png).
2. # netstat -tnlpa | grep `cat /var/run/hive/hive-server.pid`
I see a list, but I don't see any process on port 10000. Instead I see a couple of them on 10001.
3. less /var/log/hive/hiveserver2.log

2017-10-02 00:00:36,108 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(145)) - Could not validate cookie sent, will try to generate a new cookie
2017-10-02 00:00:36,108 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(204)) - Cookie added for clientUserName anonymous
2017-10-02 00:00:36,108 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: thrift.ThriftCLIService (ThriftCLIService.java:OpenSession(316)) - Client protocol version: HIVE_CLI_SERVICE_PROTOCOL_V8
2017-10-02 00:00:36,109 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: metastore.ObjectStore (ObjectStore.java:initializeHelper(377)) - ObjectStore, initialize called
2017-10-02 00:00:36,111 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:<init>(139)) - Using direct SQL, underlying DB is OTHER
2017-10-02 00:00:36,111 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: metastore.ObjectStore (ObjectStore.java:setConf(291)) - Initialized ObjectStore
2017-10-02 00:00:36,111 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: metastore.HiveMetaStore (HiveMetaStore.java:init(533)) - Begin calculating metadata count metrics.
2017-10-02 00:00:36,113 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: metastore.HiveMetaStore (HiveMetaStore.java:init(535)) - Finished metadata count metrics: 18 databases, 1009 tables, 33 partitions.
2017-10-02 00:00:36,113 WARN [HiveServer2-HttpHandler-Pool: Thread-82257]: metrics2.CodahaleMetrics (CodahaleMetrics.java:addGauge(299)) - A Gauge with name [init_total_count_dbs] already exists. The old gauge will be overwritten, but this is not recommended
2017-10-02 00:00:36,113 WARN [HiveServer2-HttpHandler-Pool: Thread-82257]: metrics2.CodahaleMetrics (CodahaleMetrics.java:addGauge(299)) - A Gauge with name [init_total_count_tables] already exists. The old gauge will be overwritten, but this is not recommended
2017-10-02 00:00:36,113 WARN [HiveServer2-HttpHandler-Pool: Thread-82257]: metrics2.CodahaleMetrics (CodahaleMetrics.java:addGauge(299)) - A Gauge with name [init_total_count_partitions] already exists. The old gauge will be overwritten, but this is not recommended
2017-10-02 00:00:36,125 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.SessionState (SessionState.java:createPath(677)) - Created local directory: /tmp/de11c83f-e087-406e-b715-dd6ba7148cfe_resources
2017-10-02 00:00:36,128 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.SessionState (SessionState.java:createPath(677)) - Created HDFS directory: /tmp/hive/anonymous/de11c83f-e087-406e-b715-dd6ba7148cfe
2017-10-02 00:00:36,129 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.SessionState (SessionState.java:createPath(677)) - Created local directory: /tmp/hive/de11c83f-e087-406e-b715-dd6ba7148cfe
2017-10-02 00:00:36,130 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.SessionState (SessionState.java:createPath(677)) - Created HDFS directory: /tmp/hive/anonymous/de11c83f-e087-406e-b715-dd6ba7148cfe/_tmp_space.db
2017-10-02 00:00:36,130 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.HiveSessionImpl (HiveSessionImpl.java:setOperationLogSessionDir(265)) - Operation log session directory is created: /tmp/hive/operation_logs/de11c83f-e087-406e-b715-dd6ba7148cfe
2017-10-02 00:00:36,189 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.HiveSessionImpl (HiveSessionImpl.java:acquireAfterOpLock(333)) - We are setting the hadoop caller context to de11c83f-e087-406e-b715-dd6ba7148cfe for thread HiveServer2-HttpHandler-Pool: Thread-82257
2017-10-02 00:00:36,190 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.HiveSessionImpl (HiveSessionImpl.java:releaseBeforeOpLock(357)) - We are resetting the hadoop caller context for thread HiveServer2-HttpHandler-Pool: Thread-82257
2017-10-02 00:00:36,196 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.HiveSessionImpl (HiveSessionImpl.java:acquireAfterOpLock(333)) - We are setting the hadoop caller context to de11c83f-e087-406e-b715-dd6ba7148cfe for thread HiveServer2-HttpHandler-Pool: Thread-82257
2017-10-02 00:00:36,196 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.HiveSessionImpl (HiveSessionImpl.java:releaseBeforeOpLock(357)) - We are resetting the hadoop caller context for thread HiveServer2-HttpHandler-Pool: Thread-82257
2017-10-02 00:00:36,210 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.HiveSessionImpl (HiveSessionImpl.java:acquireAfterOpLock(333)) - We are setting the hadoop caller context to de11c83f-e087-406e-b715-dd6ba7148cfe for thread HiveServer2-HttpHandler-Pool: Thread-82257
2017-10-02 00:00:36,211 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.HiveSessionImpl (HiveSessionImpl.java:releaseBeforeOpLock(357)) - We are resetting the hadoop caller context for thread HiveServer2-HttpHandler-Pool: Thread-82257
2017-10-02 00:00:36,211 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.HiveSessionImpl (HiveSessionImpl.java:acquireAfterOpLock(333)) - We are setting the hadoop caller context to de11c83f-e087-406e-b715-dd6ba7148cfe for thread HiveServer2-HttpHandler-Pool: Thread-82257
2017-10-02 00:00:36,212 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: session.HiveSessionImpl (HiveSessionImpl.java:releaseBeforeOpLock(357)) - We are resetting the hadoop caller context for thread HiveServer2-HttpHandler-Pool: Thread-82257
2017-10-02 00:03:36,081 INFO [HiveServer2-HttpHandler-Pool: Thread-82257]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(145)) - Could not validate cookie sent, will try to generate a new cookie

/var/log/hadoop/hive/hiveserver2.log

What might I be missing here?
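One more thing worth checking, sketched below (assumes the standard HDP config location; this is a suggestion, not something from the log itself): whether HiveServer2 is running in HTTP transport mode, in which case it listens on hive.server2.thrift.http.port (10001 by default) rather than on the binary port 10000.

# Show the transport mode and HTTP port currently configured
# grep -A1 -e hive.server2.transport.mode -e hive.server2.thrift.http.port /etc/hive/conf/hive-site.xml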