How to read a large Avro file into a Spark DataFrame? The Avro schema has at least 6 levels of nested structure.

I am trying to read a large Avro file (2 GB) using spark-shell, but I am getting a StackOverflowError. I tried increasing the driver memory and the executor memory, but I still get the same error. How can I read this file? Is there a way to partition it?

val newDataDF = spark.read.format("com.databricks.spark.avro").load("abc.avro")

java.lang.StackOverflowError
  at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:71)
  at com.databricks.spark.avro.SchemaConverters$.toSqlType(SchemaConverters.scala:81)
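
A note on the trace above: a StackOverflowError means the thread's call stack ran out, not the heap, so raising driver or executor memory would not be expected to change it. The repeated SchemaConverters.toSqlType frames suggest the overflow happens while the Avro schema is being converted on the driver, before any data is read. One thing that may be worth trying is to run the load on a thread with a larger stack. The sketch below is only an illustration under that assumption: BigStack and runWithStack are hypothetical names, not part of Spark or spark-avro, and the 512 MB figure is an arbitrary starting point. The JVM treats the requested stack size as a hint, and if the schema refers to itself recursively, no stack size will be enough.

```scala
object BigStack {
  // Hypothetical helper (not part of Spark or spark-avro): evaluates `body`
  // on a new thread created with the requested stack size, waits for it to
  // finish, and returns the result. Anything thrown inside `body`, including
  // StackOverflowError, is captured and rethrown on the calling thread.
  def runWithStack[T](stackBytes: Long)(body: => T): T = {
    var outcome: Either[Throwable, T] =
      Left(new IllegalStateException("worker thread did not run"))

    val worker = new Thread(
      null, // default thread group
      new Runnable {
        def run(): Unit = {
          outcome = try Right(body) catch { case t: Throwable => Left(t) }
        }
      },
      "big-stack-worker",
      stackBytes // requested stack size in bytes; the JVM may adjust or ignore it
    )

    worker.start()
    worker.join() // join() makes the worker's write to `outcome` visible here

    outcome match {
      case Right(value) => value
      case Left(error)  => throw error
    }
  }
}

// Possible use in spark-shell (assumes `spark` and the file from the question):
// val newDataDF = BigStack.runWithStack(512L * 1024 * 1024) {
//   spark.read.format("com.databricks.spark.avro").load("abc.avro")
// }
```

If the same error still appears with a much larger stack, that would point toward a self-referencing record type in the schema rather than plain nesting depth, which is a separate problem from file size or partitioning.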