
cannot run pyspark (not using interactive shell) on cloudera vm


Dear Cloudera community,

When I follow this example: http://www.cloudera.com/documentation/enterprise/5-5-x/topics/spark_develop_run.html and try to run spark-submit in the Cloudera VM environment, I always get the following error:

 

ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.hadoop.security.AccessControlException: Permission denied: user=cloudera, access=WRITE, inode="/user/spark/applicationHistory":spark:supergroup:drwxr-xr-x

....

Traceback (most recent call last):
File "/home/cloudera/wordcount.py", line 9, in <module>
sc = SparkContext(conf=conf)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 115, in __init__
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 172, in _do_init
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 235, in _initialize_context
File "/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__
File "/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.hadoop.security.AccessControlException: Permission denied: user=cloudera, access=WRITE, inode="/user/spark/applicationHistory":spark:supergroup:drwxr-xr-x
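For readers decoding the message above: the inode part has the form "path":owner:group:mode, so the exception itself shows why the write is refused. The following is only a rough sketch that parses that one message format. The function name explain_denial and the user_groups parameter are made up for illustration, and it assumes the cloudera user is not a member of supergroup (the error output does not say either way).

```python
import re

# Message copied verbatim from the error output above.
MESSAGE = ('Permission denied: user=cloudera, access=WRITE, '
           'inode="/user/spark/applicationHistory":spark:supergroup:drwxr-xr-x')

# Layout assumed from this single message: user, access, then
# "path":owner:group:mode with a 10-character ls-style mode string.
PATTERN = re.compile(
    r'user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<path>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):(?P<mode>[-a-z]{10})'
)

# Permission letters required by each access name.
LETTERS = {"READ": "r", "WRITE": "w", "EXECUTE": "x", "ALL": "rwx", "NONE": ""}


def explain_denial(message, user_groups=()):
    """Hypothetical helper: report which permission triplet applied to the user.

    user_groups is an assumption supplied by the caller; the error message
    does not list the groups the user belongs to.
    """
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("not a recognised permission-denied message")
    info = match.groupdict()
    mode = info["mode"]
    # mode[0] is the file type; then three rwx triplets: owner, group, other.
    if info["user"] == info["owner"]:
        info["applies"], info["bits"] = "owner", mode[1:4]
    elif info["group"] in user_groups:
        info["applies"], info["bits"] = "group", mode[4:7]
    else:
        info["applies"], info["bits"] = "other", mode[7:10]
    needed = "".join(LETTERS[part] for part in info["access"].split("_"))
    info["granted"] = all(letter in info["bits"] for letter in needed)
    return info


info = explain_denial(MESSAGE)
print(info["user"], "is checked against the", info["applies"],
      "bits", info["bits"], "on", info["path"])
```

Under those assumptions the cloudera user falls under the "other" triplet, r-x, which has no write bit. Passing user_groups=("supergroup",) selects the group triplet instead, which is also r-x, so the request would be refused either way.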

 

I have tried these two commands:

1. $ spark-submit --master yarn --deploy-mode client --executor-memory 1g \
     --name wordcount --conf "spark.app.id=wordcount" wordcount.py hdfs://namenode_host:8020/path/to/inputfile.txt 2

2. $ spark-submit --master yarn --deploy-mode client --executor-memory 1g \
     --name wordcount --conf "spark.app.id=wordcount" wordcount.py inputfile.txt 2

 

Can somebody help?

 

Thanks!
