Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

AttributeError in Spark 2.3

Solved Go to solution
Highlighted

AttributeError in Spark 2.3

New Contributor

Hi,

The below code is not working in Spark 2.3 , but its working in 1.7.

Can someone modify the code as per Spark 2.3

import os

from pyspark import SparkConf,SparkContext

from pyspark.sql import HiveContext

conf = (SparkConf() .setAppName("data_import") .set("spark.dynamicAllocation.enabled","true") .set("spark.shuffle.service.enabled","true"))

sc = SparkContext(conf = conf)

sqlctx = HiveContext(sc)

df = sqlctx.load( source="jdbc", url="jdbc:sqlserver://10.24.40.29;database=CORE;username=user1;password=Passw0rd", dbtable="test")

## this is how to write to an ORC file df.write.format("orc").save("/tmp/orc_query_output")

## this is how to write to a hive table df.write.mode('overwrite').format('orc').saveAsTable("test")

Error : AttributeError: 'HiveContext' object has no attribute 'load'

1 ACCEPTED SOLUTION

Accepted Solutions

Re: AttributeError in Spark 2.3

@Debananda Sahoo

In spark 2 you should leverage spark session instead of spark context. To read jdbc datasource just use the following code:

from pyspark.sql import SparkSession
from pyspark.sql import Row

spark = SparkSession \
    .builder \
    .appName("data_import") \
    .config("spark.dynamicAllocation.enabled", "true") \
    .config("spark.shuffle.service.enabled", "true") \
    .enableHiveSupport() \
    .getOrCreate()

jdbcDF2 = spark.read \
    .jdbc("jdbc:sqlserver://10.24.40.29;database=CORE;username=user1;password=Passw0rd", "test")

More information and examples on this link:

https://spark.apache.org/docs/2.1.0/sql-programming-guide.html#jdbc-to-other-databases

Please let me know if that works for you.

HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.

2 REPLIES 2

Re: AttributeError in Spark 2.3

@Debananda Sahoo

In spark 2 you should leverage spark session instead of spark context. To read jdbc datasource just use the following code:

from pyspark.sql import SparkSession
from pyspark.sql import Row

spark = SparkSession \
    .builder \
    .appName("data_import") \
    .config("spark.dynamicAllocation.enabled", "true") \
    .config("spark.shuffle.service.enabled", "true") \
    .enableHiveSupport() \
    .getOrCreate()

jdbcDF2 = spark.read \
    .jdbc("jdbc:sqlserver://10.24.40.29;database=CORE;username=user1;password=Passw0rd", "test")

More information and examples on this link:

https://spark.apache.org/docs/2.1.0/sql-programming-guide.html#jdbc-to-other-databases

Please let me know if that works for you.

HTH

*** If you found this answer addressed your question, please take a moment to login and click the "accept" link on the answer.

Re: AttributeError in Spark 2.3

New Contributor

Thanks Felix for your quick response. It worked. Thanks a lot.

Don't have an account?
Coming from Hortonworks? Activate your account here