Member since 08-04-2017 · 6 Posts · 0 Kudos Received · 0 Solutions
05-08-2019 04:34 AM
@jtaras, thank you for this wonderful article! I have a follow-up question, just out of curiosity: do the Spark workers use the same custom Python path as the Zeppelin driver? If yes, how do they know about it? (The Zeppelin UI setting seems to apply only to the Zeppelin driver application and its SparkContext.) If no, why is there no version conflict between the driver's Python and the workers' Python? (I tested with the default Python 2.x and the custom Anaconda Python 3.7.)

NOTE: To be able to use all the Python libraries in scripts submitted via "spark-submit", I additionally had to set the PYSPARK_PYTHON environment variable to the same path under Spark > Configs in Ambari, but this does not affect the Zeppelin functionality in any way.
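For reference, this is roughly what I mean by pointing a spark-submit job at the custom interpreter, as a minimal sketch (the Anaconda path and the app name are placeholders of mine, not values from the article):

# The executors launch their Python workers with whatever PYSPARK_PYTHON
# points to; it must be set before the SparkContext is created, and the
# path must exist on every node. The driver itself simply runs under the
# interpreter that executes this script.
import os
import sys
from pyspark.sql import SparkSession

os.environ["PYSPARK_PYTHON"] = "/opt/anaconda3/bin/python"

spark = SparkSession.builder.appName("custom-python-check").getOrCreate()

# Compare the driver's interpreter version with the one the workers report;
# PySpark raises a version-mismatch error at runtime if they differ.
print("driver python :", sys.version_info[:3])
print("worker python :", spark.sparkContext
      .parallelize([0], 1)
      .map(lambda _: __import__("sys").version_info[:3])
      .collect()[0])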
05-06-2019 09:27 PM
Thanks for posting this answer, Subhash. I thought I was going crazy, because I was using the HDP 2.x sandbox before, and this was all one had to do to query Hive-created tables from Spark:

%pyspark
from pyspark.sql import HiveContext
hive_context = HiveContext(sc)
hive_context.sql('show tables').show()

Many posts out there say that a missing ./conf/hive-site.xml may be the problem, but it DOES exist on the HDP 3.0.1 sandbox, while HiveContext still shows only Spark-created tables (both permanent and temporary).

So I have a follow-up question: is the value "thrift://sandbox-hdp.hortonworks.com:9083" correct for the property "hive.metastore.uris" in ./conf/hive-site.xml? Here is the entry for this host from /etc/hosts:

172.18.0.2 sandbox-hdp.hortonworks.com sandbox-hdp

Is this the IP address of one of the hosts in the virtual cluster? (I am using HDP Sandbox 3.0.1.) Changing the host IP address to 127.0.0.1 (the same as localhost) results in a "connection refused" error while trying to create and use HiveContext, which probably means the Thrift server is NOT running on port 9083 of the sandbox VM?
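As a side note, here is a quick way to check whether anything is actually listening on that port from inside the sandbox, as a minimal sketch (the hostname and port are just the values quoted from hive-site.xml above):

# A refused connection here would mean the metastore thrift service is not
# listening on 9083, rather than the URI in hive-site.xml being wrong.
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5)
try:
    sock.connect(("sandbox-hdp.hortonworks.com", 9083))
    print("metastore port 9083 is reachable")
except OSError as err:
    print("connection failed:", err)
finally:
    sock.close()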
08-06-2017 01:33 PM
I am running HDP 2.6.1 on VMware. Initially it starts up fine, but all of my attempts to shut down the "docker" container for the sandbox gracefully have failed, and just shutting down the host VM via the VMware Player window left the SandboxContainer startup hanging on the next boot. Sending shutdown commands to the docker OS (NOT to the host VM OS) via SSH does not work because of permission issues. I ended up starting the VM once and never shutting it down, only suspending it. I will stick with this workaround until I stumble upon a way to shut down the sandbox container gracefully and only THEN shut down the host VM. I miss the days when the sandbox ran on just a plain VM instead of this VM + docker combo. You can't even see what happens while the docker container is starting...
08-06-2017 02:08 AM
I have the same issue with 2.6. I found a WORKAROUND: suspend the VM and NEVER shut it down. The "docker" container that runs within the VM and hosts the sandbox (unfortunately, it is no longer the VM directly) seems to be really sensitive to how you shut down the system. To actually solve this problem, Hortonworks needs to provide a tested, reliable way to shut down all the Hadoop services first, then the docker container, then the VM itself, as part of the "Learning the Ropes" tutorial.
08-06-2017 12:38 AM
Running the HDP 2.6.1 sandbox on VMware. I can't execute the shutdown commands for the docker container in Step 2 via SSH; I get the same error messages that @Bob Heckel is getting. The VMware Player window is useless here, since it interacts with the host VM, not with the hosted docker container for the sandbox. Now my sandbox appears to be corrupted again: the host VM startup process gets stuck on starting up the docker container. I am sure there was an architectural justification for migrating to VM + docker instead of just a plain VM, but it's not working out very well for users.