Created 11-23-2017 12:21 PM
Hello,
I tried to submit a Spark job using YARN, but the application state remains UNDEFINED with the following message:
"ACCEPTED: waiting for AM container to be allocated, launched and register with RM."
When I checked the node status, the number of active nodes was 0 and the total memory was 0. Is it because the DataNodes are not detected by the NameNode?
When I checked the NameNode UI, it showed information for two DataNodes.
Please find the screenshots attached.
Could anyone please help me resolve this?
Thanks,
Nirmal J
screen-shot-2017-11-23-at-54836-pm.png, screen-shot-2017-11-23-at-54419-pm.png, screen-shot-2017-11-23-at-54352-pm.png
Created on 11-23-2017 01:57 PM - edited 08-17-2019 09:22 PM
It looks like all your NodeManagers are down, or you don't have any NodeManagers. Go to YARN and check whether the NodeManagers are running. Start them if they are not. If you do not have any NodeManagers, add one and start it.
Adding a NodeManager: Click on Hosts -> Select a Host -> Add -> NodeManager.
Thanks,
Aditya
Created 11-23-2017 03:14 PM
Thanks for your quick response.
But the NodeManagers are all up and running.
Please find the screenshots attached: screen-shot-2017-11-23-at-84336-pm.png, screen-shot-2017-11-23-at-84200-pm.png
Thanks,
Nirmal J
Created 11-23-2017 04:18 PM
Created 11-23-2017 04:20 PM
This lists no nodes.
yarn node -list
17/11/23 16:19:24 INFO impl.TimelineClientImpl: Timeline service address: http://localhost:8188/ws/v1/timeline/
17/11/23 16:19:24 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8050
17/11/23 16:19:24 INFO client.AHSProxy: Connecting to Application History server at localhost/127.0.0.1:10200
Total Nodes:0
Node-Id    Node-State    Node-Http-Address    Number-of-Running-Containers
Created 11-23-2017 04:24 PM
Can you please change the ResourceManager, NodeManager, and Timeline service addresses from localhost to the proper hostnames, and then restart the services? With localhost, NodeManagers on other hosts would most likely look for the ResourceManager on their own machine and never register with it, which would explain the zero active nodes.
yarn.timeline-service.webapp.address: change localhost:8188 to {timeline server hostname}:8188
yarn.timeline-service.address: change localhost:10200 to {timeline server hostname}:10200
yarn.resourcemanager.address: change localhost:8050 to {rmhostname}:8050
yarn.resourcemanager.hostname: change localhost to {rmhostname}
Make sure no localhost entries remain; replace each one with the proper hostname.
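If it helps to find the remaining entries, here is a minimal Python sketch that lists every property in a yarn-site.xml document whose value still points at localhost or 127.0.0.1. The function name find_loopback_properties, the SAMPLE document, and the hostname timeline.example.com are made up for illustration; only the property names come from the list above.

```python
import xml.etree.ElementTree as ET

# Values containing either of these are treated as loopback addresses.
LOOPBACK_NAMES = ("localhost", "127.0.0.1")


def find_loopback_properties(xml_text):
    """Return {name: value} for each property whose value mentions a loopback host."""
    root = ET.fromstring(xml_text)
    hits = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if any(host in value for host in LOOPBACK_NAMES):
            hits[name] = value
    return hits


# Hypothetical sample in the Hadoop configuration layout; not taken from a real cluster.
SAMPLE = """<configuration>
  <property><name>yarn.resourcemanager.hostname</name><value>localhost</value></property>
  <property><name>yarn.resourcemanager.address</name><value>localhost:8050</value></property>
  <property><name>yarn.timeline-service.address</name><value>timeline.example.com:10200</value></property>
</configuration>"""

if __name__ == "__main__":
    for name, value in find_loopback_properties(SAMPLE).items():
        print(f"{name} = {value}")
```

To check a real cluster, read the contents of your own yarn-site.xml and pass the text to find_loopback_properties. The location of that file depends on your installation, and if the cluster is managed through Ambari the values should be changed there rather than by editing the file directly.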
Created 11-23-2017 05:18 PM
Changing the ResourceManager, NodeManager, and Timeline service addresses from localhost to the proper hostnames worked.
Thanks a lot.
Nirmal J