Hive query slowness issue

New Contributor

We are seeing slow Hive queries on the Tez engine. Of the assigned mappers, most reach SUCCESS almost immediately, except for one or two. We suspect that most of the load lands on those one or two mappers and the others are not sharing it. The slowness gets even worse when the query uses more JOINs. We checked all the container logs and found no major exception that halts the process (the one exception we did find is included below).

Status: Running (Executing on YARN cluster with App id application_1487593540459_2564)

--------------------------------------------------------------------------------
VERTICES            STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1               SUCCEEDED      0          0        0        0       0       0
Map 3 ...           RUNNING        3          1        2        0       0       0
Map 4 ..........    SUCCEEDED      3          3        0        0       0       0
Map 5               RUNNING        1          0        1        0       0       0
Map 6               SUCCEEDED      0          0        0        0       0       0
Reducer 2 ......    SUCCEEDED      3          3        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 04/06 [==================>>--------] 70% ELAPSED TIME: 2214.35 s
--------------------------------------------------------------------------------

File does not exist: /apps/hive/warehouse/al_uat_test.db/al_vehicle_dim_orc/delta_0002123_0002123/bucket_00000_flush_length

2017-03-23 10:21:15,530 [WARN] [TezChild] |retry.RetryInvocationHandler|: Exception while invoking ClientNamenodeProtocolTranslatorPB.getBlockLocations over masternode-1-devtest.ad.ashokdev.com/172.25.22.27:8020. Not retrying because try once and fail.
org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/hive/warehouse/al_uat_test.db/al_vehicle_dim_orc/delta_0002123_0002123/bucket_00000_flush_length
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1860)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1831)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1744)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:693)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:373)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)

Container logs for running mappers:
==================================
2017-03-23 10:43:40,785 [INFO] [TezChild] |orc.OrcInputFormat|: ORC pushdown predicate: leaf-0 = (IS_NULL vin), leaf-1 = (IS_NULL eve_date), expr = (and (not leaf-0) (not leaf-1))
2017-03-23 10:43:40,793 [INFO] [TezChild] |orc.ReaderImpl|: Reading ORC rows from hdfs://clustername/apps/hive/warehouse/al_uat_test.db/al_alert_rpt_stg/part-00034_copy_404 with {include: [true], offset: 0, length: 49, sarg: leaf-0 = (IS_NULL vin), leaf-1 = (IS_NULL eve_date), expr = (and (not leaf-0) (not leaf-1)), columns: ['null']}
2017-03-23 10:43:40,795 [INFO] [TezChild] |exec.Utilities|: PLAN PATH = hdfs://clustername/tmp/hive/hdfs/9c507841-e6ad-40e5-a377-b906d808fd1e/hive_2017-03-23_10-21-02_680_6936083326764717474-1/hdfs/_tez_scratch_dir/6f9c581a-b6f4-4c59-82e1-fe59f16159d0/map.xml
2017-03-23 10:43:40,804 [INFO] [TezChild] |exec.Utilities|: PLAN PATH = hdfs://clustername/tmp/hive/hdfs/9c507841-e6ad-40e5-a377-b906d808fd1e/hive_2017-03-23_10-21-02_680_6936083326764717474-1/hdfs/_tez_scratch_dir/6f9c581a-b6f4-4c59-82e1-fe59f16159d0/map.xml
2017-03-23 10:43:40,809 [INFO] [TezChild] |exec.Utilities|: PLAN PATH = hdfs://clustername/tmp/hive/hdfs/9c507841-e6ad-40e5-a377-b906d808fd1e/hive_2017-03-23_10-21-02_680_6936083326764717474-1/hdfs/_tez_scratch_dir/6f9c581a-b6f4-4c59-82e1-fe59f16159d0/map.xml

3 REPLIES

New Contributor

Hi,

I am also facing the same issue. Let me know if your issue is resolved.

New Contributor

The issue was solved by changing "hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat" to "hive.tez.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat" in Ambari.
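
For anyone who wants to try the same change on a single query before editing it cluster-wide in Ambari, here is a minimal sketch. It assumes your cluster allows this property to be overridden per session (some clusters restrict which settings can be changed at runtime), and the placeholder comment stands in for your own slow query. Whether it helps will depend on your data layout.

```sql
-- Print the value currently in effect for this session.
SET hive.tez.input.format;

-- Override it for this session only (assumption: the property is not
-- on a restricted list on your cluster).
SET hive.tez.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Re-run the slow query here, then compare the TOTAL column of the
-- Map vertices and the elapsed time against the earlier run.
```

The override lasts only for the current session, so closing it returns you to the value configured in Ambari.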