Member since: 05-25-2018
Posts: 10
Kudos Received: 0
Solutions: 0
08-07-2018
07:26 AM
Yes, I can add the zip file along with its path. But there are modules inside the compressed folder. How do I import them? In plain Python I can do import foo.apps.lib.math. I have __init__.py in all folders starting from the second 'foo'.
08-06-2018
04:28 PM
I have a project structure like foo/foo/apps/lib/math.py, and there are other .py files. After I call sc.addPyFile("foo.zip"), how can I import math.py or any class from it?
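One thing that may help here: a zip on the Python path is imported from its root, so the inner foo/ package (the one with __init__.py) probably needs to sit at the top level of foo.zip. The sketch below is only an illustration under that assumption. It rebuilds the layout from the question in a temp directory, and the helper build_package_zip and the add function inside math.py are made up for the demo. sys.path.insert stands in for sc.addPyFile, which also puts the zip on the Python path.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_package_zip(project_root, package_name, zip_path):
    """Zip project_root/package_name so that package_name/ is at the archive root."""
    package_dir = os.path.join(project_root, package_name)
    with zipfile.ZipFile(zip_path, "w") as zf:
        for dirpath, _dirs, files in os.walk(package_dir):
            for fname in files:
                if fname.endswith(".py"):
                    full = os.path.join(dirpath, fname)
                    # Archive names are relative to project_root, e.g. foo/apps/lib/math.py
                    zf.write(full, os.path.relpath(full, project_root))
    return zip_path


# Recreate the layout from the question: foo/foo/apps/lib/math.py
work = tempfile.mkdtemp()
project_root = os.path.join(work, "foo")       # outer foo (no __init__.py)
package_dir = os.path.join(project_root, "foo")  # inner foo (the package)
lib_dir = os.path.join(package_dir, "apps", "lib")
os.makedirs(lib_dir)
for d in (package_dir, os.path.join(package_dir, "apps"), lib_dir):
    open(os.path.join(d, "__init__.py"), "w").close()
with open(os.path.join(lib_dir, "math.py"), "w") as fh:
    fh.write("def add(a, b):\n    return a + b\n")  # hypothetical content

zip_path = build_package_zip(project_root, "foo", os.path.join(work, "foo.zip"))

# Stand-in for sc.addPyFile(zip_path): make the zip importable in this process.
sys.path.insert(0, zip_path)
foo_math = importlib.import_module("foo.apps.lib.math")
print(foo_math.add(2, 3))
```

With a zip built this way, the Spark side would be sc.addPyFile("foo.zip") followed by import foo.apps.lib.math. If the zip instead contains foo/foo/..., the outer folder has no __init__.py and the import is likely to fail.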
06-13-2018
06:49 AM
I am trying to read a CSV file from S3. The variable url is set to some value.
>>> DF = spark.read.load(url,
... format="com.databricks.spark.csv",
... header="true",
... inferschema="true",
... delimiter=",")
18/06/13 11:16:24 WARN DataSource: Error while looking for metadata directory.
Traceback (most recent call last):
File "<stdin>", line 5, in <module>
File "/opt/sw/spark-2.1.0-bin-hadoop2.7/python/pyspark/sql/readwriter.py", line 149, in load
return self._df(self._jreader.load(path))
File "/opt/sw/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/opt/sw/spark-2.1.0-bin-hadoop2.7/python/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/opt/sw/spark-2.1.0-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o32.load.
: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
How can I fix this issue? Your help is appreciated.
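The ClassNotFoundException suggests the hadoop-aws module, which provides S3AFileSystem, is not on Spark's classpath. A possible fix, sketched below, is to pull it in through spark.jars.packages. The version 2.7.3 is an assumption based on the spark-2.1.0-bin-hadoop2.7 path in the traceback and should match the Hadoop version actually bundled. The helper names s3a_conf and build_session are hypothetical, and S3 credentials are not covered.

```python
# Assumption: the Spark build bundles Hadoop 2.7.x, so hadoop-aws should match it.
HADOOP_AWS_PACKAGE = "org.apache.hadoop:hadoop-aws:2.7.3"


def s3a_conf(package=HADOOP_AWS_PACKAGE):
    """Spark settings that ask Spark to download hadoop-aws and its dependencies."""
    return {"spark.jars.packages": package}


def build_session(app_name="s3a-csv"):
    """Create a SparkSession with the settings above (requires pyspark)."""
    from pyspark.sql import SparkSession  # imported lazily so this file loads without Spark

    builder = SparkSession.builder.appName(app_name)
    for key, value in s3a_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


def read_csv(spark, url):
    """Read a CSV with a header row and inferred column types."""
    return spark.read.csv(url, header=True, inferSchema=True)
```

spark.jars.packages only takes effect if it is set before the JVM starts, so in an interactive shell where spark already exists the equivalent would be to relaunch with pyspark --packages org.apache.hadoop:hadoop-aws:2.7.3.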
Labels: Apache Hadoop, Apache Spark
05-29-2018
06:23 AM
Using the .collect method I am able to create a Row object, my_list[0], which is as shown below:
my_list[0]
Row(Specific Name/Path (to be updated)=u'Monitoring_Monitoring.csv')
How can I fetch the row value? Attribute access gives syntax errors because there are spaces in the field name.
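Since the field name is not a valid Python identifier, dot access cannot work, but a Row can also be indexed by the field name as a string or by position. A small sketch of that idea follows. The helper field_value is hypothetical, and a plain dict stands in for the Row so the snippet runs without Spark.

```python
FIELD = "Specific Name/Path (to be updated)"


def field_value(row, name=FIELD):
    """Look a field up by its string name instead of attribute syntax."""
    return row[name]


# Stand-in for my_list[0]; a pyspark Row accepts the same row["..."] lookup.
demo_row = {FIELD: u"Monitoring_Monitoring.csv"}
print(field_value(demo_row))
```

On the real object this would be my_list[0]["Specific Name/Path (to be updated)"], or my_list[0][0] for the first column, or my_list[0].asDict() to get an ordinary dictionary.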
05-25-2018
11:00 AM
I am moving to PySpark and Zeppelin. I created two notebooks, my_settings.py and main.py. But when I do the following in main.py:
%pyspark
import my_settings
ImportError: No module named my_settings
I get an import error saying no module was found. This works fine on my local server. I wonder if there's an env setting for this to work.
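A likely cause is that a Zeppelin note is not a .py file on the interpreter's Python path, so import has nothing to find. One workaround, sketched here under that assumption, is to keep my_settings.py as a real file on the Zeppelin host and add its directory to sys.path before importing. The temp directory and the APP_NAME setting are placeholders for the demo.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for a directory on the Zeppelin host that holds my_settings.py.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "my_settings.py"), "w") as fh:
    fh.write("APP_NAME = 'demo'\n")  # placeholder setting

# Make the directory importable for this interpreter process.
if module_dir not in sys.path:
    sys.path.insert(0, module_dir)

my_settings = importlib.import_module("my_settings")
print(my_settings.APP_NAME)
```

In a %pyspark paragraph the same two steps would be sys.path.insert(0, "<directory with my_settings.py>") followed by import my_settings. If the module is also needed on the executors, sc.addPyFile with the file's path is another option.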
Labels: Apache Zeppelin