Member since: 05-18-2017 · Posts: 6 · Kudos Received: 1 · Solutions: 0
12-31-2018
03:12 PM
1 Kudo
I am also getting the same error.

cloudgpu-server:~/HDP# yarn jar /usr/hdp/3.1.0.0-78/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -jar /usr/hdp/3.1.0.0-78/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar -shell_command /usr/bin/nvidia-smi -container_resources memory-mb=3072,vcores=1,yarn.io/gpu=1 -num_containers 2

18/12/31 17:04:34 INFO distributedshell.Client: Initializing Client
18/12/31 17:04:34 INFO distributedshell.Client: Running Client
18/12/31 17:04:34 INFO client.RMProxy: Connecting to ResourceManager at <hostname>/<ip_address>:8050
18/12/31 17:04:35 INFO client.AHSProxy: Connecting to Application History server at <hostname>/<ip_address>:10200
18/12/31 17:04:35 INFO distributedshell.Client: Got Cluster metric info from ASM, numNodeManagers=4
18/12/31 17:04:35 INFO distributedshell.Client: Got Cluster node info from ASM
18/12/31 17:04:35 INFO distributedshell.Client: Got node report from ASM for, nodeId=cloudgpu-server.com:45454, nodeAddress=cloudgpu-server.com:8042, nodeRackName=/default-rack, nodeNumContainers=0
18/12/31 17:04:35 INFO distributedshell.Client: Got node report from ASM for, nodeId=<hostname>:45454, nodeAddress=<hostname>:8042, nodeRackName=/default-rack, nodeNumContainers=0
18/12/31 17:04:35 INFO distributedshell.Client: Got node report from ASM for, nodeId=<hostname>:45454, nodeAddress=<hostname>:8042, nodeRackName=/default-rack, nodeNumContainers=1
18/12/31 17:04:35 INFO distributedshell.Client: Got node report from ASM for, nodeId=<hostname>:45454, nodeAddress=<hostname>:8042, nodeRackName=/default-rack, nodeNumContainers=0
18/12/31 17:04:35 INFO distributedshell.Client: Queue info, queueName=default, queueCurrentCapacity=0.03125, queueMaxCapacity=1.0, queueApplicationCount=1, queueChildQueueCount=0
18/12/31 17:04:35 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=SUBMIT_APPLICATIONS
18/12/31 17:04:35 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=ADMINISTER_QUEUE
18/12/31 17:04:35 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS
18/12/31 17:04:35 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=ADMINISTER_QUEUE
18/12/31 17:04:35 INFO distributedshell.Client: Max mem capability of resources in this cluster 8192
18/12/31 17:04:35 INFO distributedshell.Client: Max virtual cores capability of resources in this cluster 38
18/12/31 17:04:35 WARN distributedshell.Client: AM Memory not specified, use 100 mb as AM memory
18/12/31 17:04:35 WARN distributedshell.Client: AM vcore not specified, use 1 mb as AM vcores
18/12/31 17:04:35 WARN distributedshell.Client: AM Resource capability=<memory:100, vCores:1>
18/12/31 17:04:35 ERROR distributedshell.Client: Error running Client
org.apache.hadoop.yarn.exceptions.ResourceNotFoundException: Unknown resource: yarn.io/gpu
        at org.apache.hadoop.yarn.applications.distributedshell.Client.validateResourceTypes(Client.java:1218)
        at org.apache.hadoop.yarn.applications.distributedshell.Client.setContainerResources(Client.java:1204)
        at org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:735)
        at org.apache.hadoop.yarn.applications.distributedshell.Client.main(Client.java:265)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
root@cloudgpu-server:~/HDP#

I have followed the steps in https://hortonworks.com/blog/gpus-support-in-apache-hadoop-3-1-yarn-hdp-3/#comment-26766. Can anyone advise on this?
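In case it is relevant: my understanding from that blog post is that yarn.io/gpu has to be declared as a YARN resource type (cluster-wide, including the ResourceManager) before the distributed shell client will accept it, so this is roughly what I am checking on the node where I run the command. The property names are taken from the Apache Hadoop 3.1 GPU documentation, and the /etc/hadoop/conf path assumes a standard HDP layout; I may be missing something.

# Confirm the GPU resource type is declared in the client's config
grep -A1 "yarn.resource-types" /etc/hadoop/conf/resource-types.xml
# Confirm the NodeManager GPU resource plugin is enabled
grep -A1 "yarn.nodemanager.resource-plugins" /etc/hadoop/conf/yarn-site.xml

As far as I understand, if yarn.io/gpu is not listed in yarn.resource-types on the ResourceManager side, the client rejects the -container_resources request with exactly this ResourceNotFoundException.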
11-23-2018
06:24 AM
I have installed HDP 2.5.3 and am using Kafka with 3 brokers. I stopped the HDP cluster for a maintenance activity, and after starting the cluster again I am getting the Kafka consumer errors below. Please help.

metadata.broker.list=hnode3.com:6667,hnode1.com:6667,hnode2.com:6667, request.timeout.ms=30000, client.id=console-consumer-85300, security.protocol=PLAINTEXT}
[2018-11-23 11:43:41,711] WARN [console-consumer-85300_hnode2.com-1542953611274-fa71af73-leader-finder-thread], Failed to add leader for partitions [firsttopic1,1],[firsttopic1,0]; will retry (kafka.consumer.ConsumerFetcherManager$LeaderFinderThread)
kafka.common.NotLeaderForPartitionException
at sun.reflect.GeneratedConstructorAccessor1.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)
at kafka.common.ErrorMapping$.exceptionFor(ErrorMapping.scala:105)
at kafka.consumer.SimpleConsumer.earliestOrLatestOffset(SimpleConsumer.scala:207)
at kafka.consumer.ConsumerFetcherThread.handleOffsetOutOfRange(ConsumerFetcherThread.scala:86)
at kafka.server.AbstractFetcherThread$anonfun$addPartitions$2.apply(AbstractFetcherThread.scala:192)
at kafka.server.AbstractFetcherThread$anonfun$addPartitions$2.apply(AbstractFetcherThread.scala:187)
at scala.collection.TraversableLike$WithFilter$anonfun$foreach$1.apply(TraversableLike.scala:772)
at scala.collection.immutable.Map$Map1.foreach(Map.scala:109)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at kafka.server.AbstractFetcherThread.addPartitions(AbstractFetcherThread.scala:187)
at kafka.server.AbstractFetcherManager$anonfun$addFetcherForPartitions$2.apply(AbstractFetcherManager.scala:88)
at kafka.server.AbstractFetcherManager$anonfun$addFetcherForPartitions$2.apply(AbstractFetcherManager.scala:78)
at scala.collection.TraversableLike$WithFilter$anonfun$foreach$1.apply(TraversableLike.scala:772)
at scala.collection.immutable.Map$Map2.foreach(Map.scala:130)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
at kafka.server.AbstractFetcherManager.addFetcherForPartitions(AbstractFetcherManager.scala:78)
at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:97)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
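In case it is relevant, this is roughly how I am checking whether the partitions of this topic have a live leader after the restart. The kafka-topics.sh path assumes the standard HDP layout, and <zk_host> is a placeholder for one of my ZooKeeper hosts.

# Describe the topic to see Leader/Replicas/Isr for each partition
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --zookeeper <zk_host>:2181 --topic firsttopic1

My understanding is that a partition showing "Leader: -1" (or an empty Isr) in that output has no active leader, which would line up with the NotLeaderForPartitionException above.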
11-02-2018
01:57 AM
This is mentioned in the 13th post above.