Created 08-28-2019 03:50 PM
Am using spark 2.3.0 and python 2.7.13. Am trying to build a streaming pipeline but am getting error while trying to createStream. below is the code snippet am using . Am trying it pyspark2 --master="local[*]". I have the jars in my local which am setting using config option (read online that kinesis asl & spark should have compatible jar versions:
from __future__ import print_function
import sys
from pyspark.sql import SparkSession
from pyspark.sql.functions import *
from pyspark.storagelevel import StorageLevel
from pyspark.streaming import DStream
from pyspark.streaming import StreamingContext
from pyspark.streaming.kinesis import KinesisUtils, InitialPositionInStream
spark = SparkSession.builder.appName("TestStreaming").config("spark.jars","spark-streaming-kinesis-asl_2.11-2.3.0.jar")\
.config("spark.jars","spark-streaming-kinesis-asl-assembly_2.11-2.0.0.jar")\
.config("spark.jars","aws-java-sdk-1.11.310.jar")\
.config("spark.jars","amazon-kinesis-client-1.9.0.jar")\
.getOrCreate()
ssc = StreamingContext(spark, 2)
lines = KinesisUtils.createStream(ssc, "TestStreaming", "cwlog-kinesis-stream-SDLC-catalina", "https://kinesis.us-east-1.amazonaws.com", "us-east-1", InitialPositionInStream.LATEST, 2, StorageLevel.MEMORY_AND_DISK_2)
Am getting below on executing "lines":
Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/opt/cloudera/parcels/SPARK2-2.3.0.cloudera3-1.cdh5.13.3.p0.458809/lib/spark2/python/pyspark/streaming/kinesis.py", line 79, in createStream jlevel = ssc._sc._getJavaStorageLevel(storageLevel) AttributeError: 'SparkSession' object has no attribute '_getJavaStorageLevel'