
Will 2 Spark applications cause any inconsistency?


New Contributor

I have 2 Spark applications. One writes data through the Hive Metastore as below:


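import org.apache.spark.sql.SaveMode

// Append df to the metastore table "sample", stored as JSON files partitioned by col1
df.write
  .option("path", "adl:///test-data/hive_tables")
  .mode(SaveMode.Append)
  .format("json")
  .partitionBy("col1")
  .saveAsTable("sample")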


The other reads data from the same table as below:

spark.read.table("sample")


If both jobs run in parallel, is there any possibility that the data read by the second application will be inconsistent? If so, how can I avoid it?


2 REPLIES

Re: Will 2 Spark applications cause any inconsistency?

Community Manager

The above was originally posted in the Community Help Track. On Wed May 22 16:45 UTC 2019, a member of the HCC moderation staff moved it to the Data Processing track. The Community Help Track is intended for questions about using the HCC site itself.

Bill Brooks, Community Manager

Re: Will 2 Spark applications cause any inconsistency?

New Contributor

@Jacek Dobrowolski: Do you have any suggestions on this?
