How to combine two DStreams using PySpark?


I have the following problem in PySpark:

I need to combine two DStreams into one, but unfortunately nothing is printed when I run the job.

This is my code:


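import json

from pyspark import SparkContext
from pyspark.sql import SparkSession
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="Sparkstreaming")
spark = SparkSession(sc)

ssc = StreamingContext(sc, 3)  # 3-second batch interval

kafka_stream = KafkaUtils.createStream(ssc, "localhost:2181", "consumer-be-borsa", {"Be_borsa": 1})
kafka_stream1 = KafkaUtils.createStream(ssc, "localhost:2181", "consumer-be-teleborsa", {"Be_teleborsa": 1})

# Each Kafka message is a (key, value) tuple: parse the JSON value and key it by "id" so join() can match records
dstream = kafka_stream.map(lambda kv: json.loads(kv[1])).map(lambda rec: (rec["id"], rec))
dstream1 = kafka_stream1.map(lambda kv: json.loads(kv[1])).map(lambda rec: (rec["id"], rec))

# Join (only matches records that arrive in the same batch)
streamJoined = dstream.join(dstream1)
streamJoined.pprint()  # an output operation is required, otherwise nothing is computed

ssc.start()
ssc.awaitTermination()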





These are the two JSON records that I consume:


{"id": "Be_20200330", "Date": "2020-03-30", "Name": "Be", "Hour": "15.49.24", "Last Price": "0,862", "Var%": "-1,93", "Last Value": "1.020"}

{"id": "Be_20200330", "Date": "2020-03-30", "Name": "Be", "Volatility": "2,352", "Value_at_risk": "5,471"}


The result I would like is this:


{"id": "Be_20200330", "Date": "2020-03-30", "Name": "Be", "Hour": "15.49.24", "Last Price": "0,862", "Var%": "-1,93", "Last Value": "1.020", "Volatility": "2,352", "Value_at_risk": "5,471"}
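
To make the expected result concrete, the merge can be sketched in plain Python, without Spark. This is only an illustration under assumptions: `merge_by_id` is a hypothetical helper (not part of PySpark), and the two lists stand in for the records seen on each topic within one batch.

```python
import json


def merge_by_id(left_records, right_records):
    """Inner-join two lists of JSON strings on "id" and merge matching dicts."""
    # Index the right-hand records by their "id" field
    right_by_id = {}
    for raw in right_records:
        rec = json.loads(raw)
        right_by_id[rec["id"]] = rec

    merged = []
    for raw in left_records:
        rec = json.loads(raw)
        match = right_by_id.get(rec["id"])
        if match is not None:
            # Shared keys (id, Date, Name) keep their position; new keys are appended
            merged.append({**rec, **match})
    return merged


# The two sample records from the question
borsa = ['{"id": "Be_20200330", "Date": "2020-03-30", "Name": "Be", '
         '"Hour": "15.49.24", "Last Price": "0,862", "Var%": "-1,93", '
         '"Last Value": "1.020"}']
teleborsa = ['{"id": "Be_20200330", "Date": "2020-03-30", "Name": "Be", '
             '"Volatility": "2,352", "Value_at_risk": "5,471"}']

result = merge_by_id(borsa, teleborsa)
print(json.dumps(result[0]))
```

In a streaming job, `join` on two keyed DStreams yields `(id, (left, right))` pairs, so the same dictionary merge could presumably be applied to each pair with `mapValues`; I have not verified that on a running cluster.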


How can I do this using PySpark?