Member since: 01-09-2017
Posts: 17
Kudos Received: 0
Solutions: 0
11-30-2017
07:55 AM
On HDP 2.3 with Hive 1.2, hive.enforce.bucketing already defaults to true. What is the need to set it?
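For reference, these are the settings bucketed inserts usually needed on Hive 1.x, set explicitly regardless of defaults (a sketch; hive.enforce.sorting is included only as the companion property, not something the question itself asked about):

```sql
-- Hive 1.x bucketed-insert settings:
SET hive.enforce.bucketing = true;  -- route rows into the correct bucket files
SET hive.enforce.sorting = true;    -- honor SORTED BY in the table DDL
-- With hive.enforce.bucketing=true, Hive picks one reducer per bucket,
-- so the reducer count does not have to be set by hand.
```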
... View more
11-30-2017
07:33 AM
My HDP is 2.3, Hive 1.2. A UNION ALL of the table with itself returns the right count when it runs on Tez, but returns 0 when it runs on MR; the table is ORC. This is my DDL:

CREATE TABLE `test.web`
( `id` string , `uid` string , `user_id` int )
PARTITIONED BY (`p_date` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':'
LINES TERMINATED BY '\n'
NULL DEFINED AS ''
STORED AS ORC
TBLPROPERTIES ('orc.compress'='SNAPPY')

and this is the SQL:

SELECT count(*)
FROM (
  SELECT id, user_id
  FROM test.web
  WHERE p_date = 20171129
    AND user_id > 0
  UNION ALL
  SELECT id, user_id
  FROM test.web
  WHERE p_date = 20171129
    AND stat_id = 'adm'
    AND user_id > 0
) a

In Hive 1.2 hive.enforce.bucketing defaults to true. Do I need to set any other parameters?
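One thing worth checking for the MR case (an assumption on my part, not a confirmed diagnosis): if the plan writes the UNION ALL branches into subdirectories, plain MR jobs will not read them back without these settings, both of which exist in Hive 1.2:

```sql
-- Let MapReduce read subdirectories produced by UNION ALL branches:
SET mapred.input.dir.recursive = true;
SET hive.mapred.supports.subdirectories = true;
```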
... View more
11-30-2017
07:24 AM
In Hive 1.2 hive.enforce.bucketing defaults to true. Do I need to set any other parameters?
... View more
11-30-2017
07:20 AM
My HDP is 2.3, Hive 1.2. A UNION ALL of the table with itself returns the right count when it runs on Tez, but returns 0 when it runs on MR; the table is ORC. This is my DDL:

CREATE TABLE `test.web`
( `id` string , `uid` string , `user_id` int )
PARTITIONED BY (`p_date` string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
COLLECTION ITEMS TERMINATED BY ','
MAP KEYS TERMINATED BY ':'
LINES TERMINATED BY '\n'
NULL DEFINED AS ''
STORED AS ORC
TBLPROPERTIES ('orc.compress'='SNAPPY')

and this is the SQL:

SELECT count(*)
FROM (
  SELECT id, user_id
  FROM test.web
  WHERE p_date = 20171129
    AND user_id > 0
  UNION ALL
  SELECT id, user_id
  FROM test.web
  WHERE p_date = 20171129
    AND stat_id = 'adm'
    AND user_id > 0
) a

Can anyone help me?
... View more
09-25-2017
09:27 AM
A self-join of the same table produces different results on Tez and MR. Example SQL:

SELECT * FROM (
  SELECT a, b FROM a
) t1 LEFT JOIN (
  SELECT b, c FROM a
) t2 ON t1.b = t2.b
WHERE t1.a = "300"

MR result: 0. Tez result: 1. This looks like https://community.hortonworks.com/questions/62918/hive-mr-vs-tez-difference-in-output-hi.html, but my hive.enforce.bucketing is already true. HDP 2.3.4, Hive 1.2.1, Tez 0.7. Who can help?
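As a narrowing step (my suggestion, not something from the linked thread): since the outer filter touches only the left branch, pushing it inside gives an equivalent query; if MR and Tez still disagree on this form, predicate pushdown over the join is not the culprit:

```sql
-- Equivalent rewrite with the filter applied before the LEFT JOIN:
SELECT *
FROM (SELECT a, b FROM a WHERE a = "300") t1
LEFT JOIN (SELECT b, c FROM a) t2
  ON t1.b = t2.b;
```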
... View more
Labels:
- Apache Hive
- Apache Tez
06-05-2017
07:40 AM
I have solved this problem.
... View more
06-05-2017
07:37 AM
Resource queue configuration error: the Fair Scheduler queue configuration was wrong. I found that the maximum value of the queue is related to the minimum resource unit.
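What I mean can be sketched as the request normalization YARN applies (my illustration; the 4000 MB request is an invented example value):

```shell
# YARN rounds every container request up to a multiple of
# yarn.scheduler.minimum-allocation-mb, the minimum resource unit:
min_mb=3072
request_mb=4000
granted_mb=$(( (request_mb + min_mb - 1) / min_mb * min_mb ))
echo "$granted_mb"   # a 4000 MB request is granted as a 6144 MB container
```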
... View more
06-03-2017
01:32 PM
I found a problem: the same SQL, run 6 times at the same time, fails about half the time. I believe the failures are related to YARN resource allocation: tasks left waiting for resources end up killed. The following is my configuration; please take a look:

yarn.scheduler.minimum-allocation-mb=3072M
tez.am.resource.memory.mb=3072
tez.task.resource.memory.mb=3072
hive.tez.container.size=3072
tez.container.max.java.heap.fraction=0.8
tez.am.grouping.split-wave=1.4

These are the error logs:

Vertex failed, vertexName=Map 1, vertexId=vertex_1496317022433_21566_2_05, diagnostics=
Vertex vertex_1496317022433_21566_2_05
Map 1
killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: login_data initializer failed, vertex=vertex_1496317022433_21566_2_05
Map 1
java.lang.IllegalArgumentException: Illegal Capacity: -1
at java.util.ArrayList.<init>(ArrayList.java:142)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:332)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:266)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Vertex failed, vertexName=Map 5, vertexId=vertex_1496317022433_21566_2_01, diagnostics=
Vertex vertex_1496317022433_21566_2_01
Map 5
killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: applet_version_tbl initializer failed, vertex=vertex_1496317022433_21566_2_01
Map 5
java.lang.IllegalArgumentException: Illegal Capacity: -1
at java.util.ArrayList.<init>(ArrayList.java:142)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:332)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:266)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Vertex failed, vertexName=Map 11, vertexId=vertex_1496317022433_21566_2_09, diagnostics=
Vertex vertex_1496317022433_21566_2_09
Map 11
killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: dwb_biz_msg_user_opt_ds initializer failed, vertex=vertex_1496317022433_21566_2_09
Map 11
java.lang.IllegalArgumentException: Illegal Capacity: -1
at java.util.ArrayList.<init>(ArrayList.java:142)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:332)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:266)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Vertex failed, vertexName=Map 13, vertexId=vertex_1496317022433_21566_2_06, diagnostics=
Vertex vertex_1496317022433_21566_2_06
Map 13
killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: dim_oth_pub_date initializer failed, vertex=vertex_1496317022433_21566_2_06
Map 13
java.lang.IllegalArgumentException: Illegal Capacity: -1
at java.util.ArrayList.<init>(ArrayList.java:142)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:332)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:305)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:407)
at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:155)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:273)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:266)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:266)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Vertex killed, vertexName=Reducer 8, vertexId=vertex_1496317022433_21566_2_03, diagnostics=
Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_03
Reducer 8
killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Map 10, vertexId=vertex_1496317022433_21566_2_02, diagnostics=
Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_02
Map 10
killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 9, vertexId=vertex_1496317022433_21566_2_04, diagnostics=
Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_04
Reducer 9
killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Map 6, vertexId=vertex_1496317022433_21566_2_00, diagnostics=
Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_00
Map 6
killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1496317022433_21566_2_11, diagnostics=
Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_11
Reducer 2
killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 12, vertexId=vertex_1496317022433_21566_2_10, diagnostics=
Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_10
Reducer 12
killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 4, vertexId=vertex_1496317022433_21566_2_13, diagnostics=
Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_13
Reducer 4
killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 3, vertexId=vertex_1496317022433_21566_2_12, diagnostics=
Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_12
Reducer 3
killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Map 14, vertexId=vertex_1496317022433_21566_2_07, diagnostics=
Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_07
Map 14
killed/failed due to:OTHER_VERTEX_FAILURE
Vertex killed, vertexName=Reducer 15, vertexId=vertex_1496317022433_21566_2_08, diagnostics=
Vertex received Kill in INITED state., Vertex vertex_1496317022433_21566_2_08
Reducer 15
killed/failed due to:OTHER_VERTEX_FAILURE
DAG did not succeed due to VERTEX_FAILURE. failedVertices:4 killedVertices:10
You can see it failed 4 times. The YARN log shows:

org.apache.tez.dag.app.dag.impl.AMUserCodeException: java.lang.IllegalArgumentException: Illegal Capacity: -1
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallback.onFailure(RootInputInitializerManager.java:319)
at com.google.common.util.concurrent.Futures$6.run(Futures.java:977)
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:253)
at com.google.common.util.concurrent.ExecutionList$RunnableExecutorPair.execute(ExecutionList.java:149)
at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:134)
at com.google.common.util.concurrent.ListenableFutureTask.done(ListenableFutureTask.java:86)
at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:380)
at java.util.concurrent.FutureTask.setException(FutureTask.java:247)
at java.util.concurrent.FutureTask.run(FutureTask.java:267)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
The YARN Log Aggregation Status is TIME_OUT. What I observed: several tasks running at once use up the resources, the tasks behind them get no resources, keep waiting, and are then killed. The problem is that when no resources are free a task should stay queued until it can get them, rather than being killed. So, is my configuration wrong somewhere? Please tell me if I need to provide additional information.
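For reference, the JVM heap these settings imply per container (my arithmetic from the values above, using integer math in the shell):

```shell
# Tez derives the container Xmx from hive.tez.container.size times
# tez.container.max.java.heap.fraction (0.8 here); in MB:
container_mb=3072
heap_mb=$(( container_mb * 8 / 10 ))   # 0.8 as integer arithmetic
echo "$heap_mb"   # each 3072 MB container runs with roughly a 2457 MB heap
```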
... View more
Labels:
- Apache Tez
- Apache YARN
04-13-2017
07:08 AM
Who can give me advice? For now I am killing the port's connections with tcpkill.
... View more
04-11-2017
12:39 PM
I found that after running for a while, a lot of TCP connections back up with blocked packets. On one of the machines the YARN NodeManager process id is 34675:

jps
34675 NodeManager
netstat -anp | grep 34675 | grep 50010
tcp 100799 0 ::ffff:xxx.xx.xx.153:57938 ::ffff:xxx.xx.xx.29:50010 ESTABLISHED 34675/java
tcp 76376 0 ::ffff:xxx.xx.xx.153:50020 ::ffff:xxx.xx.xx.206:50010 ESTABLISHED 34675/java
tcp 0 0 ::ffff:xxx.xx.xx.153:36182 ::ffff:xxx.xx.xx.161:50010 ESTABLISHED 34675/java
tcp 70584 0 ::ffff:xxx.xx.xx.153:33285 ::ffff:xxx.xx.xx.202:50010 ESTABLISHED 34675/java
tcp 1301872 0 ::ffff:xxx.xx.xx.153:50534 ::ffff:xxx.xx.xx.22:50010 ESTABLISHED 34675/java
tcp 73736 0 ::ffff:xxx.xx.xx.153:45629 ::ffff:xxx.xx.xx.130:50010 ESTABLISHED 34675/java
tcp 145406 0 ::ffff:xxx.xx.xx.153:56123 ::ffff:xxx.xx.xx.57:50010 ESTABLISHED 34675/java
tcp 165896 0 ::ffff:xxx.xx.xx.153:54038 ::ffff:xxx.xx.xx.36:50010 ESTABLISHED 34675/java
tcp 154952 0 ::ffff:xxx.xx.xx.153:55024 ::ffff:xxx.xx.xx.25:50010 ESTABLISHED 34675/java
tcp 1 0 ::ffff:xxx.xx.xx.153:39984 ::ffff:xxx.xx.xx.24:50010 CLOSE_WAIT 34675/java
tcp 1 0 ::ffff:xxx.xx.xx.153:42582 ::ffff:xxx.xx.xx.35:50010 CLOSE_WAIT 34675/java
tcp 93752 0 ::ffff:xxx.xx.xx.153:54546 ::ffff:xxx.xx.xx.125:50010 ESTABLISHED 34675/java
tcp 88472 0 ::ffff:xxx.xx.xx.153:53022 ::ffff:xxx.xx.xx.34:50010 ESTABLISHED 34675/java
tcp 72416 0 ::ffff:xxx.xx.xx.153:54486 ::ffff:xxx.xx.xx.123:50010 ESTABLISHED 34675/java
tcp 197752 0 ::ffff:xxx.xx.xx.153:51549 ::ffff:xxx.xx.xx.204:50010 ESTABLISHED 34675/java
tcp 1 0 ::ffff:xxx.xx.xx.153:60444 ::ffff:xxx.xx.xx.49:50010 CLOSE_WAIT 34675/java
tcp 1 0 ::ffff:xxx.xx.xx.153:50642 ::ffff:xxx.xx.xx.44:50010 CLOSE_WAIT 34675/java
tcp 1 0 ::ffff:xxx.xx.xx.153:49902 ::ffff:xxx.xx.xx.37:50010 CLOSE_WAIT 34675/java
tcp 71776 0 ::ffff:xxx.xx.xx.153:35512 ::ffff:xxx.xx.xx.29:50010 ESTABLISHED 34675/java
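A quick way to summarize output like the above (a diagnostic sketch, assuming the standard `netstat -anp` columns: Recv-Q in column 2, state in column 6):

```shell
# Count connection states and total the Recv-Q backlog for the
# NodeManager pid talking to DataNode port 50010:
netstat -anp | grep '34675/java' | grep ':50010' \
  | awk '{states[$6]++; recvq += $2}
         END {for (s in states) print s, states[s]; print "Recv-Q total:", recvq}'
```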
You can see there are a lot of problems: many of the connections on this port have effectively died. There are also a lot of entries like this in the YARN NodeManager log:

2017-04-11 19:30:02,040 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,042 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,046 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,047 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,059 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,059 ERROR mapred.ShuffleHandler (ShuffleHandler.java:exceptionCaught(1200)) - Shuffle error:
java.io.IOException: Broken pipe
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:433)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:565)
at org.jboss.netty.channel.DefaultFileRegion.transferTo(DefaultFileRegion.java:68)
at org.apache.hadoop.mapred.FadvisedFileRegion.transferTo(FadvisedFileRegion.java:81)
at org.jboss.netty.channel.socket.nio.SocketSendBufferPool$FileSendBuffer.transferTo(SocketSendBufferPool.java:331)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.write0(AbstractNioWorker.java:198)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromSelectorLoop(AbstractNioWorker.java:157)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:113)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:88)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2017-04-11 19:30:02,063 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,064 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,065 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,066 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,070 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,071 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,074 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
2017-04-11 19:30:02,074 INFO mapred.ShuffleHandler (ShuffleHandler.java:setResponseHeaders(1047)) - Setting connection close header...
I tried a lot of changes, but the result is the same, and the cluster gets slower and slower. Changes made:
1. raised /proc/sys/net/core/somaxconn to 204800
2. increased dfs.datanode.max.transfer.threads to 16384
3. increased the NodeManager heap size and ResourceManager heap size to 2G
My system: Ambari 2.2.1, HDP 2.3, CentOS 6.8.
Please tell me what else I need to provide.
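The three changes above as commands, for anyone reproducing them (my rendering of the list; the sysctl line needs root, and the HDFS/heap items are Ambari-managed settings shown only as comments):

```shell
# 1. Raise the listen backlog limit (also add to /etc/sysctl.conf to persist):
sysctl -w net.core.somaxconn=204800
# 2. hdfs-site.xml: dfs.datanode.max.transfer.threads = 16384
# 3. Ambari: NodeManager and ResourceManager heap size -> 2 GB
```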
... View more
Tags:
- Hadoop Core
- HDFS
- YARN
Labels:
- Apache Hadoop
- Apache YARN