
Docker Sandbox 2.5 starts ok, but lots not working

Explorer

Hi guys

I'm just starting with HDP, and want to play with the Sandbox. I'm using an Ubuntu 14.04 VM with 11GB of RAM and the latest version of Docker. The computer is an i7 laptop with 16GB of RAM.

The container starts ok, and all of the HDP services appear to start OK, except for a warning from ambari-agent about a missing TERM variable. Available memory goes from 10GB to 2GB free, no swap.

I can connect to :8888, but when I try to use the Ambari quick link, it fails with "Connection reset".

I ssh'd into the Docker container to see what's happening. When I check, I find that the ambari-server service is not running (and neither are lots of other things). When I try to start ambari-server manually, it fails when it tries to start PostgreSQL. When I try to start PostgreSQL manually, it fails with no error message.
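For anyone chasing the same "Connection reset" symptom, here is a minimal sketch of a probe, not anything taken from the Sandbox scripts. It sends an HTTP GET and reports what came back. The reasoning is that Docker's port forwarding may accept a TCP connection on the host even when nothing is listening inside the container, so a bare "is the port open" test can look healthy while the service is dead. The function name `probe_http` is my own invention.

```python
import http.client
import urllib.error
import urllib.request


def probe_http(url, timeout=5.0):
    """Describe what an HTTP GET to `url` produced, without raising."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 404).
        return f"HTTP {exc.code}"
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # Refused, reset, timed out, or closed before a full response arrived.
        return f"failed: {exc}"
```

Calling `probe_http("http://localhost:8888/")` checks the port mentioned above. Ambari normally listens on 8080, but that port number is an assumption here, since the thread does not state it.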

I'm stuck. Anyone have any idea why the system seems to start OK, but parts of it die shortly after?

BTW, after various attempts to sort this out, I started with a fresh Docker container, plenty of RAM and swap, but the same symptoms: lots of dead services.

Thanks!

Cheers, Steve.

1 ACCEPTED SOLUTION

Explorer

BTW, I've found a workaround: I'm now running the Sandbox 2.5 docker image using Docker for Windows beta 34 with 11GB allocated on my Win10 Enterprise machine. Works flawlessly. All I needed to do was take the key part of the startup script and turn it into a CMD file.

It would still be interesting to find out why the Docker image didn't work properly when running in an Ubuntu 14.04 VM with the same memory allocation on Win10 Hyper-V. It's clearly not the image. Hopefully this saves someone else some anguish.

Thanks.


18 REPLIES

Explorer

That's nice of Artem, but I told a fib. It turns out there was a problem with Docker for Windows beta 34, and though I was asking for 11GB, I was only getting 6.5GB. Don't try running Zeppelin ... The Docker guys are looking into it.

So, the real answer is to use the latest stable release of DfW (1.12.6). I'm able to give Sandbox around 9.5GB on my 16GB Win10Ent laptop and all is sweet.
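A shortfall like that (asking for 11GB, getting 6.5GB) is easy to miss, so here is a rough sketch of a check. It assumes a Linux `/proc/meminfo` is readable from inside the container and that its `MemTotal` reflects the Docker VM's memory. The function names and the idea of comparing against a requested figure are mine, not part of any Sandbox tooling.

```python
def mem_total_gib(meminfo_text):
    """Return MemTotal from /proc/meminfo-style text, converted to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            kib = int(line.split()[1])  # the value is reported in kB (KiB)
            return kib / (1024 ** 2)
    raise ValueError("MemTotal not found")


def shortfall_gib(requested_gib, meminfo_text):
    """How many GiB short of the requested allocation the system is (0 if none)."""
    return max(0.0, requested_gib - mem_total_gib(meminfo_text))
```

Inside the container, something like `shortfall_gib(11, open("/proc/meminfo").read())` would report how far the real figure is from what was asked for.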

BTW, if anyone is interested in my StartSandbox and RestartSandbox cmd files, I'm happy to post them.

On a different topic: I'm a new HW trainer. Lester, thanks for your great demo notes and videos for the Essentials course. I'm working through it as I get up to speed on this beast.

Cheers, Steve.


Good stuff, Steve. I'm actually having a few issues with the Docker version of the Sandbox myself, and I'm pulling down the VirtualBox one to see if there is a better experience there. I've successfully leveraged the Sandbox team's products since joining HW three years ago and I think it is a great service. As with anything of this level of complexity, there are almost always some issues somewhere. Kudos to the Sandbox team and I hope your issues get resolved asap. AND... GOOD LUCK on your upcoming teach!!

New Contributor

Hi Stephen,

Can you please share your .CMD script you used to run it on Windows? I tried to convert the .sh script, but I am getting an error (3490283c9c4665371b5ba055d0525233cecc2bd1fb7c6e341fb238a5527f1074 docker: Error response from daemon: driver failed programming external connectivity on endpoint sandbox (cd33427626aa468c9bc23f75bc013acc16d546c19dac8a1a7c5d3839519b181d): Error starting userland proxy: Bind for 0.0.0.0:10001: unexpected error Permission denied.). I am not sure if it's caused by my script or environment.
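One way to separate "script problem" from "environment problem" for a bind error like this is to check whether the host will let an ordinary process bind the port at all. The sketch below makes that check and nothing more. `can_bind` is a hypothetical helper, and a successful bind here does not guarantee that Docker's own port proxy will succeed too.

```python
import socket


def can_bind(port, host="0.0.0.0"):
    """Try to bind a TCP socket to (host, port); return (ok, reason)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
            return True, ""
        except OSError as exc:
            # Typical reasons: port already in use, or reserved/blocked by the OS.
            return False, str(exc)
```

Running `can_bind(10001)` on the machine that hosts Docker shows whether the port from the error message is usable there. If it fails with a permission error, the environment is the more likely suspect.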

Expert Contributor

@Stephen Russell

I tried developing the image natively on my Ubuntu 14.04 machine but ran into big issues with AppArmor; that's probably the culprit.

Explorer

@glupu

Looks likely. It'll be interesting to see which versions of Ubuntu and Docker it does work on.

Expert Contributor

I don't think it's a matter of finding the right OS version so much as configuring AppArmor so that it and Docker work together. I did attempt that, but I didn't have the time, gave up after about two hours, and just started developing on CentOS, which works from the get-go.
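For anyone who wants to confirm whether AppArmor is even active on their host before blaming it, a minimal sketch follows. It assumes the kernel exposes the module parameter at `/sys/module/apparmor/parameters/enabled`, which reads `Y` when AppArmor is on. The function name is mine. A `True` result only says AppArmor is enabled; it does not prove AppArmor is what kills the Sandbox services.

```python
from pathlib import Path


def apparmor_enabled(param_path="/sys/module/apparmor/parameters/enabled"):
    """Return True if the AppArmor kernel module reports itself enabled."""
    try:
        return Path(param_path).read_text().strip() == "Y"
    except OSError:
        # File missing or unreadable: treat AppArmor as not enabled on this host.
        return False
```

On a stock CentOS host this would be expected to return `False`, since CentOS ships SELinux rather than AppArmor.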

New Contributor

I cannot get the Sandbox services to run on Docker on Ubuntu using the start_sandbox.sh script.

Docker version: 17.03.0-ce, build 3a232c8

Linux ip-172-31-48-195 3.13.0-48-generic #80-Ubuntu SMP Thu Mar 12 11:16:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Starting Ambari server [ OK ]
Starting Ambari agent [ OK ]
Starting Flume [ OK ]
Starting Postgre SQL [ OK ]
Starting name node [ OK ]
Starting Oozie [ OK ]
Starting Zookeeper nodes [ OK ]
Starting data node [ OK ]
Starting Ranger-admin [ OK ]

17/03/07 03:29:39 WARN ipc.Client: Failed to connect to server: sandbox.hortonworks.com/172.17.0.2:8020: try once and fail.
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:650)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:745)
    at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1618)
    at org.apache.hadoop.ipc.Client.call(Client.java:1449)
    at org.apache.hadoop.ipc.Client.call(Client.java:1396)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy10.setSafeMode(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setSafeMode(ClientNamenodeProtocolTranslatorPB.java:711)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
    at com.sun.proxy.$Proxy11.setSafeMode(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.setSafeMode(DFSClient.java:2657)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1340)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1324)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.setSafeMode(DFSAdmin.java:611)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1916)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2107)
17/03/07 03:29:39 WARN retry.RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.setSafeMode over null. Not retrying because try once and fail.
java.net.ConnectException: Call From sandbox.hortonworks.com/172.17.0.2 to sandbox.hortonworks.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:801)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:732)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1556)
    at org.apache.hadoop.ipc.Client.call(Client.java:1496)
    at org.apache.hadoop.ipc.Client.call(Client.java:1396)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy10.setSafeMode(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.setSafeMode(ClientNamenodeProtocolTranslatorPB.java:711)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
    at com.sun.proxy.$Proxy11.setSafeMode(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.setSafeMode(DFSClient.java:2657)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1340)
    at org.apache.hadoop.hdfs.DistributedFileSystem.setSafeMode(DistributedFileSystem.java:1324)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.setSafeMode(DFSAdmin.java:611)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.run(DFSAdmin.java:1916)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
    at org.apache.hadoop.hdfs.tools.DFSAdmin.main(DFSAdmin.java:2107)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:650)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:745)
    at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1618)
    at org.apache.hadoop.ipc.Client.call(Client.java:1449)
    ... 20 more
safemode: Call From sandbox.hortonworks.com/172.17.0.2 to sandbox.hortonworks.com:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
make: [datanode] Error 255 (ignored)

Starting Hdfs nfs [ OK ]
Starting NFS portmap [ OK ]
Starting Hive server [ OK ]
Starting Hiveserver2 [ OK ]
Starting Webhcat server [WARNINGS]
/usr/hdp/2.5.0.0-1245/hive-hcatalog/sbin/webhcat_server.sh: already running on process 2020
Starting Ranger-usersync [ OK ]
Starting Node manager [ OK ]
Starting Yarn history server [ OK ]

At the resource manager step, it goes into an endless loop of connection warnings.

All TCP ports are open. Any pointers?
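When wading through start-up output like this, it can help to pull out just the steps that did not report OK. Below is a small sketch that does so. It assumes the `Starting <service> [ STATUS ]` shape seen in the output above, and the function name is made up. Note also that `[ OK ]` here appears to mean only that the start command returned: the name node reports OK, yet connections to port 8020 are refused moments later.

```python
import re

# Matches e.g. "Starting Ambari server [ OK ]" or "Starting Webhcat server [WARNINGS]".
_STEP = re.compile(r"Starting (.+?)\s*\[\s*([A-Z]+)\s*\]")


def steps_not_ok(startup_text):
    """Return (service, status) pairs for every step whose status is not OK."""
    return [(name, status)
            for name, status in _STEP.findall(startup_text)
            if status != "OK"]
```

Feeding it the captured console output gives a short list to investigate first. An empty list still needs the real service checks, for the reason above.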


Explorer

Hi Puneet,

As @glupu said, the Sandbox doesn't work on Ubuntu due to interference from AppArmor. Like him, I gave up trying to make it work and switched to Docker for Windows. I suggest you try CentOS if you want a Linux base.

Cheers, Steve.

New Contributor

Thank you, Chris. Yes, I gave up on Ubuntu and moved to CentOS 7.5, and it works there.

Maybe disabling AppArmor could have worked, but that was not the focus. Maybe next weekend I will give that a try.