- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
cannot run pyspark (not using interactive shell) on cloudera vm
- Labels:
-
Apache Hadoop
-
Apache Spark
-
Apache YARN
-
HDFS
-
Security
Created on ‎07-27-2016 05:31 PM - edited ‎09-16-2022 03:31 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Dear cloudera community,
When I follow this example: http://www.cloudera.com/documentation/enterprise/5-5-x/topics/spark_develop_run.html and try to use the command spark-submit within the cloudera vm envirionrment, I constantly get the following error:
ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.hadoop.security.AccessControlException: Permission denied: user=cloudera, access=WRITE, inode="/user/spark/applicationHistory":spark:supergroup:drwxr-xr-x
....
Traceback (most recent call last):
File "/home/cloudera/wordcount.py", line 9, in <module>
sc = SparkContext(conf=conf)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 115, in __init__
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 172, in _do_init
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/context.py", line 235, in _initialize_context
File "/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__
File "/usr/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.hadoop.security.AccessControlException: Permission denied: user=cloudera, access=WRITE, inode="/user/spark/applicationHistory":spark:supergroup:drwxr-xr-x
I have tried these two commands:
1, $ spark-submit --master yarn --deploy-mode client --executor-memory 1g \ --name wordcount --conf "spark.app.id=wordcount" wordcount.py hdfs://namenode_host:8020/path/to/inputfile.txt 2
2, $ spark-submit --master yarn --deploy-mode client --executor-memory 1g \ --name wordcount --conf "spark.app.id=wordcount" wordcount.py inputfile.txt 2
Can somebody help?
Thanks!
Created ‎07-28-2016 11:19 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Created ‎07-28-2016 11:19 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
