I followed the instructions very carefully, but received the following error:
-rw-rw-r-- 1 ec2-user ec2-user 9617928269 May 29 01:23 HDP_2.6_docker_05_05_2017_15_01_40.tar.gz
-rw-rw-r-- 1 ec2-user ec2-user 2245 May 29 01:23 start_sandbox-hdp.sh
[root@ip-172-31-17-10 hdp]# docker load -i /home/ec2-user/hdp/2.6/HDP_2.6_docker_05_05_2017_15_01_40.tar.gz
b1b065555b8a: Loading layer [==================================================>] 202.2 MB/202.2 MB
bcedae1b8073: Loading layer [================================> ] 8.254 GB/12.68 GB
ApplyLayer exit status 1 stdout: stderr: unexpected EOF
[root@ip-172-31-17-10 hdp]# md5sum /home/ec2-user/hdp/2.6/HDP_2.6_docker_05_05_2017_15_01_40.tar.gz
886845a5e2fc28f773c59dace548e516 /home/ec2-user/hdp/2.6/HDP_2.6_docker_05_05_2017_15_01_40.tar.gz
Can you try the following command:
docker load --input /home/ec2-user/hdp/2.6/HDP_2.6_docker_05_05_2017_15_01_40.tar.gz
If you hit the same issue, download the tar file again and retry.
Another possible cause of that error is an operating system resource limitation, such as not having enough disk space or RAM. Disk space in particular needs to be checked, because the md5sum of your Docker archive seems to be fine.
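As a rough way to check disk space before running docker load, something like the sketch below could help. The function names (free_bytes, enough_space) and the 3x safety multiplier are assumptions made for illustration, not figures from the Docker or HDP documentation; the load log above only shows that a single layer unpacks to 12.68 GB from a roughly 9.6 GB archive.

```shell
#!/bin/sh
# Sketch: compare free disk space against an estimated requirement
# before loading a large image archive. Names and the multiplier
# are illustrative assumptions.

# Print the available space, in bytes, on the filesystem holding "$1".
free_bytes() {
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # column 4 of the second line is the available space.
    df -Pk "$1" | awk 'NR == 2 { printf "%.0f\n", $4 * 1024 }'
}

# Succeed if available bytes ($3) cover the archive size ($1)
# multiplied by a safety factor ($2).
enough_space() {
    awk -v a="$1" -v m="$2" -v f="$3" 'BEGIN { exit !(f >= a * m) }'
}

# Example use (paths are the ones from this thread; /var/lib/docker
# is Docker's default data directory and may differ on your host):
#   archive=/home/ec2-user/hdp/2.6/HDP_2.6_docker_05_05_2017_15_01_40.tar.gz
#   size=$(wc -c < "$archive")
#   enough_space "$size" 3 "$(free_bytes /var/lib/docker)" \
#       || echo "Probably not enough disk space for docker load" >&2
```

The check is only a heuristic: it does not account for RAM or for storage-driver limits, so a pass does not guarantee that the load succeeds.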
I think you were correct: there was not enough disk space on the instance. However, when I resized the volume to 100 GB, I got a different error:
[ec2-user@ip-172-31-17-10 2.6]$ docker load -i HDP_2.6_docker_05_05_2017_15_01_40.tar.gz
devicemapper: Error running deviceCreate (createSnapDevice) dm_task_run failed
Good to know that the previous error is cleared up.
Regarding your new error:
devicemapper: Error running deviceCreate (createSnapDevice) dm_task_run failed
Your new error looks more specific to Docker itself. Can you please run the following commands to see if we get a little more debug info?
# sudo docker -d
# strace -o docker.log -f -s 128 docker -d
Well, it turns out that Docker indeed got messed up by running out of disk space. After rebuilding the instance, the script worked. However, when running start_sandbox-hdp.sh I got a message that the Ambari agent started with a warning. That is probably why I cannot log in to the Ambari web interface. Hopefully it is the last hurdle. Could you please help? Unfortunately, the log directory mentioned in the documentation, /var/log/ambari-agent, does not exist on my instance.
Good to know that you are now able to start your Docker.
The next issue concerns being unable to log in to the Ambari UI, the Ambari agent warnings, and no log files being generated. It would be really great if you could open a separate HCC thread for it. The community works better when each thread covers one specific query with specific solutions; otherwise readers get confused by a long thread with multiple issues resolved in one.
It would also help if you could share the following details about the agent issue in the new thread.
- Ambari agent log (so that we can see the warning messages). Please also confirm whether the agent is actually running:
# ambari-agent status
# ps -ef | grep main.py
# ambari-agent start (if agent is not running)
- Please check whether there is any ERROR/WARNING in the Ambari server log (if yes, please share it).
- Please check whether the Ambari server is running and port 8080 is open:
# ambari-server status
# netstat -tnlpa | grep 8080
# ambari-server start (if server is not running)
It turns out that something was wrong with the Docker container. When I removed the container and reran start_sandbox-hdp.sh, I was able to ssh into port 2222. Both the ambari-server and ambari-agent commands are now working:
[root@sandbox ~]# ambari-server status
Using python /usr/bin/python
Ambari-server status
Ambari Server running
Found Ambari Server PID: 483 at: /var/run/ambari-server/ambari-server.pid
[root@sandbox ~]# netstat -tnlpa | grep 8080
tcp 0 0 :::8080 :::* LISTEN 483/java
[root@sandbox ~]# ambari-agent status
Found ambari-agent PID: 4569
ambari-agent running.
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log
[root@sandbox ~]# ps -ef | grep main.py
root 4569 4561 34 01:01 ? 00:02:49 /usr/bin/python /usr/lib/python2.6/site-packages/ambari_agent/main.py start
root 5568 5226 0 01:09 pts/0 00:00:00 grep main.py
I still cannot connect to http://ElasticIP:8888
I finally got it to work, but it seems that the script or the container has problems. Please help me sort it out.
Per the instructions, I added the script start_sandbox-hdp.sh to /etc/rc.local.
However, it does not look like the script runs automatically when the instance starts. I know that because there is no ssh access on port 2222. When I try to run the script manually, it complains that a docker container named sandbox-hdp is already running. When I remove this container using: #docker rm <container_name>, the script runs fine except for the warning about ambari-agent. Then I ssh to root@localhost -p 2222 and restart the agent: ambari-agent restart
Then everything works fine.
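The manual workaround described above (remove the leftover container, then rerun the start script) could be wrapped in a small helper called from /etc/rc.local. This is only a sketch under assumptions: the container name sandbox-hdp is taken from the error message in this thread, the script path is a guess based on the earlier directory listing, and the function names are made up for illustration. It relies on docker inspect with a --format template, which exits non-zero when the container does not exist.

```shell
#!/bin/sh
# Sketch: start the sandbox at boot without tripping over a leftover
# container. CONTAINER_NAME and START_SCRIPT are assumptions; adjust
# them to match your instance.
CONTAINER_NAME="sandbox-hdp"
START_SCRIPT="/home/ec2-user/hdp/2.6/start_sandbox-hdp.sh"

# Print "running", "stopped", or "missing" for the container named "$1".
container_state() {
    running=$(docker inspect --format '{{.State.Running}}' "$1" 2>/dev/null) || {
        echo missing
        return
    }
    if [ "$running" = "true" ]; then
        echo running
    else
        echo stopped
    fi
}

# Remove a stopped leftover container, then run the start script.
# Does nothing if the container is already running.
ensure_sandbox() {
    case "$(container_state "$CONTAINER_NAME")" in
        running) echo "$CONTAINER_NAME is already running" ;;
        stopped) docker rm "$CONTAINER_NAME" && sh "$START_SCRIPT" ;;
        missing) sh "$START_SCRIPT" ;;
    esac
}

# In /etc/rc.local, after sourcing or pasting the functions above:
#   ensure_sandbox
```

This does not address the ambari-agent warning itself; the agent may still need the manual ambari-agent restart described above.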