07-20-2017 12:09 PM
When I try to run the example given in https://github.com/cloudera-labs/envelope/tree/master/examples/filesystem with the following command:
spark2-submit --packages com.databricks:spark-csv_2.10:1.5.0 envelope/target/envelope-0.4.0.jar envelope/examples/filesystem/filesystem.conf
it throws the following error:
17/07/20 13:49:24 INFO codegen.CodeGenerator: Code generated in 293.873193 ms
17/07/20 13:49:25 INFO execution.SparkSqlParser: Parsing command: fsInput
17/07/20 13:49:25 INFO execution.SparkSqlParser: Parsing command: SELECT foo FROM fsInput
17/07/20 13:49:25 INFO execution.SparkSqlParser: Parsing command: fsProcess
Exception in thread "main" java.util.concurrent.ExecutionException: java.lang.RuntimeException: Filesystem output does not support file format: avro
07-20-2017 12:32 PM - edited 07-20-2017 12:49 PM
Thanks for pointing that out.
Unfortunately Cloudera's Spark 2.1 does not contain the Avro integration that is in Cloudera's Spark 1.6, and so the filesystem example stopped working in the latest Envelope release (where we rebased on 2.1). We will get that example corrected.
In the meantime, you could run a similar example by changing the output's 'format' from 'avro' to 'parquet' in filesystem.conf.
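As a rough sketch of that change, assuming the output of the fsProcess step in filesystem.conf is laid out roughly as below (the step name comes from the log above; the nesting and the 'type' key are my assumption about the file's layout, so check them against your copy), only the 'format' value needs to change:

```
steps {
  fsProcess {
    output {
      type = filesystem   # assumed key, inferred from the "Filesystem output" error
      format = parquet    # was: avro
      # leave the path and any other output settings as they are in the example
    }
  }
}
```

After editing the file, the same spark2-submit command should get past the "does not support file format" error, and the output will be written as Parquet instead of Avro.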