@Papil Patil The cache() function is lazy, so in order to see the data cached you need to perform an action that triggers execution of the DAG. For example:

df = spark.read \
    .format("jdbc") \
    .option("url", "---------------------------") \
    .option("driver", "com.sap.db.jdbc.Driver") \
    .option("CharSet", "iso_1") \
    .option("user", "---------------------------") \
    .option("password", "---------------------------") \
    .option("dbtable", "(select * from schema.table_name) tmp") \
    .load()

df.cache()

# this action triggers the DAG, so you should see the data cached
count = df.count()

# the next action reuses the cached data, so it should execute faster
count2 = df.count()
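As a side note, cache() only marks the DataFrame for caching; you can check the requested storage level from code, and the Storage tab of the Spark UI shows whether the data has actually been materialized. A minimal self-contained sketch (it uses spark.range() as a stand-in for the JDBC read above, so the DataFrame here is purely illustrative):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("cache-demo").getOrCreate()

# stand-in for the JDBC read above; any DataFrame behaves the same way
df = spark.range(0, 1000000)

df.cache()                # lazy: nothing is materialized yet
print(df.storageLevel)    # storage level requested by cache() (MEMORY_AND_DISK by default)

df.count()                # action: triggers the DAG and fills the cache
df.count()                # this run is served from the cache, so it is faster

df.unpersist()            # release the cached blocks when you are done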
HTH *** If this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.