Member since 06-25-2019
3 Posts
1 Kudos Received
0 Solutions
06-25-2019 05:18 PM
I am new to the HDP Sandbox and find it quite slow. Using the example CSV file from the getting started tutorial (https://hortonworks.com/tutorial/hadoop-tutorial-getting-started-with-hdp/), the following cell takes 9 seconds to execute:

%spark2
val geoLocationDataFrame = spark.read.format("csv").option("header", "true").load("hdfs:///tmp/data/geolocation.csv")
geoLocationDataFrame.createOrReplaceTempView("geolocation")

Took 9 sec. Last updated by anonymous at June 25 2019, 3:02:48 PM.

I would expect loading a CSV file to take only a few ms. My VM setup is based on VMware and has 10 GB RAM and 4 x 3.2 GHz. Is there a benchmark with reference numbers for expected execution times, or some sort of profiling tool that makes it easy to find bottlenecks?
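
One way to narrow this down, as a rough sketch rather than a proper benchmark, is to time each step of the cell separately and to run the paragraph a second time. The `timed` helper below is hypothetical (it is not part of Spark or Zeppelin) and only wraps `System.nanoTime`. The snippet assumes it runs in a Zeppelin `%spark2` paragraph where `spark` is already defined. If the second run is much faster than the first, most of the 9 seconds may be interpreter and Spark session startup rather than the CSV read itself.

```scala
// Run inside a Zeppelin %spark2 paragraph; `spark` is provided by the interpreter.

// Hypothetical helper: evaluates a block, prints the elapsed wall-clock time, returns the result.
def timed[T](label: String)(block: => T): T = {
  val start = System.nanoTime()
  val result = block
  val elapsedMs = (System.nanoTime() - start) / 1e6
  println(f"$label%s took $elapsedMs%.1f ms")
  result
}

// Defining the DataFrame (with header=true, Spark may already read the first line here).
val geoLocationDataFrame = timed("read") {
  spark.read.format("csv").option("header", "true").load("hdfs:///tmp/data/geolocation.csv")
}

// Registering the view only records metadata; it does not scan the file.
timed("createOrReplaceTempView") {
  geoLocationDataFrame.createOrReplaceTempView("geolocation")
}

// An action that forces a full scan of the file.
timed("count") {
  geoLocationDataFrame.count()
}
```

Comparing the three numbers across a first and a second run gives a rough split between startup overhead, file access, and the actual scan.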
06-25-2019 10:06 AM
I also asked this question here: https://community.hortonworks.com/idea/248333/please-include-current-python-version-373-in-hdp-s.html => This question can be deleted (I did not find out how to do it).
06-25-2019 09:03 AM
1 Kudo
In order to use Python 3 with Zeppelin, I tried to change the setting zeppelin.pyspark.python = python to zeppelin.pyspark.python = python3 at http://{sandboxserver}:9995/#/interpreter. That did not work. Then I tried to undo my changes, and now I get the following error:

PUT http://{sandboxserver}:9995/api/interpreter/setting/spark2 504 (Gateway Time-out) vendor.49d751b0c72342f6.js:37

=> Could you please include Python 3.7.3 in the Sandbox and enable it as the default for Zeppelin instead of 2.7.5?
Labels: Apache Spark