cdsw spark context issue

New Contributor

Hi, I am trying to start a Spark session via CDSW and got the error shown below:

    TypeError: __init__() got an unexpected keyword argument 'auth_token'

The code I used:

    from pyspark import SparkContext
    from pyspark import SparkConf
    from pyspark.sql import HiveContext
    from pyspark.sql import SQLContext

    conf = SparkConf().set("spark.executor.memory", "12g") \
        .set("spark.yarn.executor.memoryOverhead", "3g") \
        .set("spark.dynamicAllocation.initialExecutors", "2") \
        .set("spark.driver.memory", "16g") \
        .set("spark.kryoserializer.buffer.max", "1g") \
        .set("spark.driver.cores", "32") \
        .set("spark.executor.cores", "8") \
        .set("spark.yarn.queue", "us9") \
        .set("spark.dynamicAllocation.maxExecutors", "32")

    sparkContext = SparkContext.getOrCreate(conf=conf)

Has anyone seen this error before, or does anyone know how to solve it? Thanks in advance.

1 ACCEPTED SOLUTION

Re: cdsw spark context issue

Rising Star

Hi,

This is a known issue in the CDSW 1.3 release. Please read the documentation about it:

https://www.cloudera.com/documentation/data-science-workbench/1-3-x/topics/cdsw_known_issues.html#cd...

I also see that you are creating a SparkContext object directly. That should still work, but you might be better off using the newer Spark 2.x interfaces. You can see a few examples here:

https://www.cloudera.com/documentation/data-science-workbench/1-3-x/topics/cdsw_pyspark.html
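
For illustration, here is a minimal sketch of the same configuration expressed through the Spark 2.x `SparkSession` builder instead of `SparkContext`. The settings are copied from the question; the helper name `build_session` and the application name are made up for this example, and Hive support is enabled only on the assumption that it is wanted, since the original code imports `HiveContext`. This shows the interface style only and does not by itself address the known issue linked above.

```python
# Settings copied from the original question.
SPARK_SETTINGS = {
    "spark.executor.memory": "12g",
    "spark.yarn.executor.memoryOverhead": "3g",
    "spark.dynamicAllocation.initialExecutors": "2",
    "spark.driver.memory": "16g",
    "spark.kryoserializer.buffer.max": "1g",
    "spark.driver.cores": "32",
    "spark.executor.cores": "8",
    "spark.yarn.queue": "us9",
    "spark.dynamicAllocation.maxExecutors": "32",
}


def build_session(app_name, settings):
    """Build (or reuse) a SparkSession with the given settings.

    `build_session` is a hypothetical helper, not part of PySpark.
    """
    # Imported here so the settings above can be inspected without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in settings.items():
        builder = builder.config(key, value)
    # Assumption: Hive access is wanted, as the original code imported HiveContext.
    return builder.enableHiveSupport().getOrCreate()
```

In a session this would be called as `spark = build_session("cdsw-example", SPARK_SETTINGS)`, and the underlying context is still available as `spark.sparkContext` if older code needs it.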

Regards,

Peter

Re: cdsw spark context issue

New Contributor

Thank you so much! My problem has been solved.

Re: cdsw spark context issue

Explorer

@peter.ableda Sorry to bother you. I have installed CDSW 1.4 on my CDSW machine, and when I try to start a SparkSession or run any HDFS commands, I get an UnknownHostException for the clouderamaster hostname. I am very new to Cloudera, so I am not sure which setup step I am missing. I followed the PySpark setup (importing the template when creating the project and starting a Python 2 environment to run the PySpark job). It would be a great help if you could point out what I am missing. Thanks in advance!
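
One way to start narrowing down an UnknownHostException like this is to check whether the hostname resolves from inside the CDSW session. The following is a minimal sketch using only the Python standard library; the helper name `can_resolve` is made up for this example, and `clouderamaster` is the hostname mentioned in the post. It only tests name resolution and does not confirm that this is the actual cause.

```python
import socket


def can_resolve(hostname):
    """Return True if the hostname resolves to an IPv4 address from this machine."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved.
        return False


if __name__ == "__main__":
    host = "clouderamaster"  # hostname taken from the post
    print(host, "resolves" if can_resolve(host) else "does not resolve")
```

If the name does not resolve from the session, that would suggest looking at DNS or hosts-file configuration for the CDSW machine, though other causes are possible.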