Member since 05-18-2017
6 Posts, 1 Kudos Received, 0 Solutions
12-31-2018 03:12 PM
1 Kudo
I am also getting the same error.

cloudgpu-server:~/HDP# yarn jar /usr/hdp/3.1.0.0-78/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar \
    -jar /usr/hdp/3.1.0.0-78/hadoop-yarn/hadoop-yarn-applications-distributedshell.jar \
    -shell_command /usr/bin/nvidia-smi \
    -container_resources memory-mb=3072,vcores=1,yarn.io/gpu=1 \
    -num_containers 2
18/12/31 17:04:34 INFO distributedshell.Client: Initializing Client
18/12/31 17:04:34 INFO distributedshell.Client: Running Client
18/12/31 17:04:34 INFO client.RMProxy: Connecting to ResourceManager at hostname/<ip_address>:8050
18/12/31 17:04:35 INFO client.AHSProxy: Connecting to Application History server at <hostname>:/<ip_address>:10200
18/12/31 17:04:35 INFO distributedshell.Client: Got Cluster metric info from ASM, numNodeManagers=4
18/12/31 17:04:35 INFO distributedshell.Client: Got Cluster node info from ASM
18/12/31 17:04:35 INFO distributedshell.Client: Got node report from ASM for, nodeId=cloudgpu-server.com:45454, nodeAddress=cloudgpu-server.com:8042, nodeRackName=/default-rack, nodeNumContainers=0
18/12/31 17:04:35 INFO distributedshell.Client: Got node report from ASM for, nodeId=<hostname>:45454, nodeAddress=<hostname>:8042, nodeRackName=/default-rack, nodeNumContainers=0
18/12/31 17:04:35 INFO distributedshell.Client: Got node report from ASM for, nodeId=<hostname>:45454, nodeAddress=<hostname>:8042, nodeRackName=/default-rack, nodeNumContainers=1
18/12/31 17:04:35 INFO distributedshell.Client: Got node report from ASM for, nodeId=<hostname>::45454, nodeAddress=<hostname>::8042, nodeRackName=/default-rack, nodeNumContainers=0
18/12/31 17:04:35 INFO distributedshell.Client: Queue info, queueName=default, queueCurrentCapacity=0.03125, queueMaxCapacity=1.0, queueApplicationCount=1, queueChildQueueCount=0
18/12/31 17:04:35 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=SUBMIT_APPLICATIONS
18/12/31 17:04:35 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=ADMINISTER_QUEUE
18/12/31 17:04:35 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS
18/12/31 17:04:35 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=ADMINISTER_QUEUE
18/12/31 17:04:35 INFO distributedshell.Client: Max mem capability of resources in this cluster 8192
18/12/31 17:04:35 INFO distributedshell.Client: Max virtual cores capability of resources in this cluster 38
18/12/31 17:04:35 WARN distributedshell.Client: AM Memory not specified, use 100 mb as AM memory
18/12/31 17:04:35 WARN distributedshell.Client: AM vcore not specified, use 1 mb as AM vcores
18/12/31 17:04:35 WARN distributedshell.Client: AM Resource capability=<memory:100, vCores:1>
18/12/31 17:04:35 ERROR distributedshell.Client: Error running Client
org.apache.hadoop.yarn.exceptions.ResourceNotFoundException: Unknown resource: yarn.io/gpu
        at org.apache.hadoop.yarn.applications.distributedshell.Client.validateResourceTypes(Client.java:1218)
        at org.apache.hadoop.yarn.applications.distributedshell.Client.setContainerResources(Client.java:1204)
        at org.apache.hadoop.yarn.applications.distributedshell.Client.run(Client.java:735)
        at org.apache.hadoop.yarn.applications.distributedshell.Client.main(Client.java:265)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
root@cloudgpu-server:~/HDP#

I have followed the steps from this post:
https://hortonworks.com/blog/gpus-support-in-apache-hadoop-3-1-yarn-hdp-3/#comment-26766
Can anyone advise on this?
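
One thing that may be worth checking, offered as a possible diagnostic rather than a confirmed fix: the "Unknown resource: yarn.io/gpu" message suggests the cluster is not advertising yarn.io/gpu as a resource type. In Apache Hadoop 3.1 that type is declared through the yarn.resource-types property, normally in resource-types.xml. The sketch below parses such a file and lists the declared types. The function name declared_resource_types and the inline SAMPLE text are made up for illustration, and the location of the real file on an HDP node (for example under /etc/hadoop/conf) is an assumption that should be verified on the cluster.

```python
import xml.etree.ElementTree as ET


def declared_resource_types(xml_text):
    """Return the resource names listed under yarn.resource-types.

    Hypothetical helper: it only reads the XML text it is given and
    does not contact the ResourceManager.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == "yarn.resource-types":
            value = prop.findtext("value") or ""
            # The property holds a comma-separated list of resource names.
            return [v.strip() for v in value.split(",") if v.strip()]
    return []


# Illustrative content only; on a real node, read the actual file instead,
# e.g. open("/etc/hadoop/conf/resource-types.xml").read() (path is an assumption).
SAMPLE = """<configuration>
  <property>
    <name>yarn.resource-types</name>
    <value>yarn.io/gpu</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    types = declared_resource_types(SAMPLE)
    print("yarn.io/gpu declared:", "yarn.io/gpu" in types)
```

If the real file does not list yarn.io/gpu, the GPU setup from the blog post may not have been applied to the ResourceManager configuration, or the services may not have been restarted since the change.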
11-02-2018 01:57 AM
which is mentioned above, in the 13th post