
Will 2 Spark applications cause any inconsistency?


New Contributor

I have 2 Spark applications. One writes data to a table registered in the Hive Metastore, as below:


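import org.apache.spark.sql.SaveMode

// Append JSON files, partitioned by col1, and register them as table "sample"
df.write
  .option("path", "adl:///test-data/hive_tables")
  .mode(SaveMode.Append)
  .format("json")
  .partitionBy("col1")
  .saveAsTable("sample")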


The other reads data from the same table, as below:

spark.read.table("sample")


If both jobs run in parallel, is there any possibility that the data read by the second application will be inconsistent? If so, how can I avoid it?


2 REPLIES

Re: Will 2 Spark applications cause any inconsistency?

Community Manager

The above was originally posted in the Community Help Track. On Wed May 22 16:45 UTC 2019, a member of the HCC moderation staff moved it to the Data Processing track. The Community Help Track is intended for questions about using the HCC site itself.

Re: Will 2 Spark applications cause any inconsistency?

New Contributor

@Jacek Dobrowolski: Any suggestion on this?
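
For the parallel write/read scenario asked about above, here is a minimal reader-side sketch, not a definitive answer. As far as I know, a plain JSON table written with SaveMode.Append has no transactional isolation, so a reader running during an append may see only some of the new files. The sketch below only refreshes Spark's cached metadata and file listing for the table before reading; it does not give snapshot isolation. The object name RefreshThenRead and the helper readFresh are hypothetical names introduced for illustration; spark.catalog.refreshTable and spark.read.table are existing Spark APIs.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RefreshThenRead {

  // Hypothetical helper: drop Spark's cached metadata and file listing for
  // the table, then read it again. Assumption: this only makes newly
  // committed files visible; it does NOT isolate the reader from a write
  // that is still in progress.
  def readFresh(spark: SparkSession, table: String): DataFrame = {
    spark.catalog.refreshTable(table)
    spark.read.table(table)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sample-reader")
      .enableHiveSupport() // the table in the question lives in the Hive Metastore
      .getOrCreate()

    val df = readFresh(spark, "sample")
    println(df.count())

    spark.stop()
  }
}
```

If the reader must never observe a half-finished append, the two jobs would probably need to be coordinated outside of this snippet, for example by scheduling them so they do not overlap.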