Complex Python Dictionary Object to Spark Dataframes

Please suggest the best way to load a complex Python dictionary object into Spark. The dictionary has heterogeneous values (strings, 1D arrays, 2D arrays, and 3D arrays).

I tried reading each element into a pandas DataFrame and then combining them into one large DataFrame. With a large number of elements, however, consolidating the small DataFrames is not possible because it fails with a memory error.
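To make the problem concrete, here is a minimal sketch of the kind of dictionary involved and one possible way to turn it into uniform records one entry at a time, instead of building many small DataFrames. The dictionary contents, the record layout (`key`, `text`, `shape`, `values`), and the helper names `shape_of`, `flatten`, and `to_records` are all hypothetical, and the sketch assumes the arrays are rectangular nested Python lists. It is plain Python and does not call Spark.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical example of the dictionary described above; keys and values are made up.
sample: Dict[str, Any] = {
    "label": "run-1",
    "vec": [1.0, 2.0, 3.0],
    "mat": [[1.0, 2.0], [3.0, 4.0]],
    "cube": [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]],
}


def shape_of(value: Any) -> List[int]:
    """Return the dimensions of a nested list, assuming it is rectangular."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return shape


def flatten(value: Any) -> Iterator[float]:
    """Yield the leaf values of a nested list in row-major order."""
    if isinstance(value, list):
        for item in value:
            yield from flatten(item)
    else:
        yield value


def to_records(data: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one record per dictionary entry, all with the same four fields."""
    for key, value in data.items():
        if isinstance(value, str):
            # Strings go in the text field; the array fields stay empty.
            yield {"key": key, "text": value, "shape": [], "values": []}
        else:
            # Arrays of any rank become a shape plus a flat list of values.
            yield {"key": key, "text": None,
                   "shape": shape_of(value), "values": list(flatten(value))}


records = list(to_records(sample))
```

Because every record has the same fields regardless of array rank, records like these would fit a single fixed schema (a string column, a nullable string column, and two array columns), and the original array can be rebuilt from `shape` and `values`. Whether this layout suits the real data is an assumption.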

Also, what would be a suitable format for storing this data in HDFS in a structured manner?