Complex Python Dictionary Object to Spark Dataframes

Please suggest an optimal way to load a complex Python dictionary object into Spark. The dictionary has heterogeneous elements (str, 1D array, 2D array, 3D array).
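To make the structure concrete, here is a minimal sketch of what such a dictionary might look like. The keys, values, and sizes are hypothetical, and nested lists stand in for the arrays so the snippet runs without extra libraries. The `shape_of` and `describe` helpers are illustrative names, not part of any library.

```python
# Hypothetical sample: the real keys and sizes are not given in the question.
sample = {
    "label": "run-01",                             # str
    "signal": [1.0, 2.0, 3.0],                     # 1D array
    "grid": [[1, 2], [3, 4], [5, 6]],              # 2D array
    "cube": [[[1, 2], [3, 4]], [[5, 6], [7, 8]]],  # 3D array
}


def shape_of(value):
    """Return the shape of a regular nested list; () for strings and scalars."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break
        value = value[0]  # assumes every sub-list at a level has the same length
    return tuple(shape)


def describe(data):
    """Map each key to (Python type name, shape) to expose the heterogeneity."""
    return {key: (type(value).__name__, shape_of(value)) for key, value in data.items()}


summary = describe(sample)
```

Each element has a different rank, which is why a single flat table does not fit the data without some normalization step.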

I tried reading each element into a pandas DataFrame and then combining them into one large DataFrame. With a large number of elements, though, consolidating the small DataFrames fails with a memory error.
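One possible way around the consolidation step, sketched under assumptions: write each element out as one JSON line with a uniform set of fields (key, kind, shape, flattened values), so only a single element is held in memory at a time. The field names, the `to_record` and `write_jsonl` helpers, and the sample data are all hypothetical, and nested lists again stand in for the arrays.

```python
import json
import os
import tempfile

# Hypothetical sample: the real keys and sizes are not given in the question.
sample = {
    "label": "run-01",
    "signal": [1.0, 2.0, 3.0],
    "grid": [[1, 2], [3, 4], [5, 6]],
    "cube": [[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
}


def shape_of(value):
    """Return the shape of a regular nested list as a list of ints."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return shape


def flatten(value):
    """Yield the leaves of a nested list in row-major order, as floats."""
    if isinstance(value, list):
        for item in value:
            yield from flatten(item)
    else:
        yield float(value)


def to_record(key, value):
    """Turn one dictionary element into a record with the same fields for every kind."""
    if isinstance(value, str):
        return {"key": key, "kind": "str", "shape": [], "text": value, "values": []}
    return {"key": key, "kind": "array", "shape": shape_of(value),
            "text": None, "values": list(flatten(value))}


def write_jsonl(items, path):
    """Write one JSON line per element; return the number of lines written."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for key, value in items:
            handle.write(json.dumps(to_record(key, value)) + "\n")
            count += 1
    return count


demo_path = os.path.join(tempfile.mkdtemp(), "elements.jsonl")
written = write_jsonl(sample.items(), demo_path)
```

A file in this shape is line-delimited JSON, which `spark.read.json(path)` reads by default, so the Spark DataFrame can be built from the file instead of from one large pandas DataFrame. The original array can be recovered from `values` and `shape`. Whether this scales for the real data depends on element sizes that are not stated here.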

Also, what would be a suitable format for storing this data in HDFS in a structured manner?
