Support Questions

dataframe to table/hdfs



New Contributor

Hi All,


I have a scenario where I need to write data from a dataframe to two internal tables.


I have the code as below and it works.




df = hc.sql("insert into Table2 select * from Table1")

However, when I change the code as below to avoid the disk-to-disk write and the recomputation, it fails with "Executor not found", "Container not found", or "Block not found" errors.



I have seen suggestions online to change the number of cores, the memory settings, repartitioning, etc. But can you please let me know how to make Spark failure-proof irrespective of the volume of data? I chose MEMORY_AND_DISK specifically so that it does not cause memory issues.
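For reference, a minimal sketch of the persist-based variant being described might look like the following (the table names and the `append` write mode are illustrative assumptions, not taken from the original post; `hc` is a Spark 1.x HiveContext as in the working snippet):

```python
from pyspark import SparkContext, StorageLevel
from pyspark.sql import HiveContext

sc = SparkContext()
hc = HiveContext(sc)

# Read once and persist, so partitions spill to local disk
# instead of being recomputed or failing on memory pressure.
df = hc.sql("select * from Table1")
df.persist(StorageLevel.MEMORY_AND_DISK)

# Reuse the cached dataframe for both internal tables.
df.write.mode("append").saveAsTable("Table2")
df.write.mode("append").saveAsTable("Table3")

df.unpersist()
```

Note that MEMORY_AND_DISK only protects the cached partitions themselves; executors can still be lost for other reasons (e.g. YARN killing a container whose total memory exceeds its limit), which matches the errors described above.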



Re: dataframe to table/hdfs

New Contributor

This worked for smaller volumes; the issue only occurs at large volume.

Re: dataframe to table/hdfs

Expert Contributor

It may help to add a stack trace with more detail. It sounds like your executors are hitting memory errors, though: either a Java OutOfMemoryError, or YARN killing the container because it exceeded its memory limit. If that is the case, you will have to tune the memory settings.
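As a starting point, a hedged sketch of the kind of memory settings to experiment with on Spark 1.x on YARN (the values here are placeholders, not recommendations; right-size them to your cluster):

```python
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = (SparkConf()
        # JVM heap per executor; raise this for a Java OOM in the executor.
        .set("spark.executor.memory", "8g")
        # Off-heap headroom in MB; raise this if YARN reports
        # "Container killed ... running beyond physical memory limits".
        .set("spark.yarn.executor.memoryOverhead", "2048")
        # Fewer concurrent tasks per executor also lowers peak memory.
        .set("spark.executor.cores", "2"))

sc = SparkContext(conf=conf)
hc = HiveContext(sc)
```

Which knob to turn depends on the error: a Java OOM points at the heap, while a YARN kill message points at the overhead; the stack trace or the YARN container logs will tell you which one you are hitting.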