
Pyspark missing hadoop_currentpath

New Contributor

I followed the instructions at http://hortonworks.com/hadoop-tutorial/using-ipython-notebook-with-apache-spark/ and was able to get IPython working in the web browser. I can import pydoop, but pydoop.hdfs does not work. Running pydoop.hadoop_currentpath() in the Jupyter notebook returns a shorter list of .jar files than running the same command from a Python session in an ssh shell. Where does pyspark get this path from, and how do I fix it so that I can run pydoop.hdfs commands from pyspark?

1 Reply

@Martin Madsen

Have you addressed this issue? If so, what was your solution?

I went through the tutorial mentioned in the question, and all of its steps completed successfully.

It would be helpful if you could describe the steps you took after completing the tutorial.
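
One thing worth checking, based on what you describe: pydoop locates the Hadoop installation through environment variables such as HADOOP_HOME and HADOOP_CONF_DIR, and a Jupyter/pyspark kernel launched as a service often does not inherit the same environment as your ssh shell, which could explain the shorter jar list. Below is a minimal diagnostic sketch, not a confirmed fix; the HDP paths in it are assumptions you would need to adjust to your cluster.

# Minimal diagnostic sketch: compare the environment the notebook sees
# with the ssh shell, then set the Hadoop variables before importing
# pydoop.hdfs. The HDP paths below are assumptions, not taken from the thread.
import os

# 1. Print what the notebook kernel actually sees; compare these values
#    with the output of `env | grep -i hadoop` in your ssh shell.
for var in ("HADOOP_HOME", "HADOOP_CONF_DIR", "CLASSPATH", "SPARK_HOME"):
    print(var, "=", os.environ.get(var))

# 2. If values are missing or shorter than in the shell, set them *before*
#    importing pydoop.hdfs (assumed HDP-style locations shown here).
os.environ.setdefault("HADOOP_HOME", "/usr/hdp/current/hadoop-client")
os.environ.setdefault("HADOOP_CONF_DIR", "/etc/hadoop/conf")

# 3. Re-test HDFS access from the notebook.
import pydoop.hdfs as hdfs
print(hdfs.ls("/"))

If the environment variables differ, a more permanent fix is to export them in the script or kernel configuration that starts the notebook (for example in the same place where you set PYSPARK_DRIVER_PYTHON for the tutorial), so every new kernel inherits them.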