

Spark-sql fails to use "SELECT" on Aliases on Parquet files (as defined in Avro schema)

New Contributor

We are using Spark SQL with the Parquet data format, and Avro as the schema format. We are trying to use "aliases" on field names and are running into issues when referencing the alias name in a SELECT.


Sample schema, where each field has both a name and an alias:


{ "namespace": "com.test.profile",
  "type": "record",
  "name": "profile",
  "fields": [
    {"name": "ID", "type": "string"},
    {"name": "F1", "type": ["null","int"], "default": null, "aliases": ["F1_ALIAS"]},
    {"name": "F2", "type": ["null","int"], "default": null, "aliases": ["F2_ALIAS"]}
  ]
}

Code for SELECT:


val profile = sqlContext.read.parquet("/user/test/parquet_files/*")
profile.registerTempTable("profile")
val features = sqlContext.sql("SELECT F1_ALIAS FROM profile")


It will throw the following exception:


org.apache.spark.sql.AnalysisException: cannot resolve '`F1_ALIAS`' given input columns: [ID, F1, F2]


Any suggestions for this use case? 
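For context, selecting the physical column name and renaming it in the query itself does work, which suggests Spark resolves column names from the Parquet footer schema rather than applying the Avro aliases. A minimal sketch of that workaround (assuming the `sqlContext` and registered `profile` table from the code above):

```scala
// Workaround sketch: project the physical column (F1) and rename it
// ourselves, since Spark only sees the footer columns [ID, F1, F2].
val features = sqlContext.sql("SELECT F1 AS F1_ALIAS FROM profile")

// Equivalent DataFrame API form:
// profile.select(profile("F1").as("F1_ALIAS"))
```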


On a side note, what characters are allowed in aliases? e.g. is "!" allowed?
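For reference while investigating this: the Avro specification restricts names to start with `[A-Za-z_]` and contain only `[A-Za-z0-9_]`, and aliases follow the same naming rules as names, so "!" would not be allowed. A quick self-contained check (the helper below is my own, not part of Avro):

```scala
// Per the Avro spec, a name must match [A-Za-z_][A-Za-z0-9_]* .
// Aliases are names, so the same rule applies to them.
val avroNamePattern = "^[A-Za-z_][A-Za-z0-9_]*$".r

def isValidAvroName(name: String): Boolean =
  avroNamePattern.pattern.matcher(name).matches()

// isValidAvroName("F1_ALIAS") is true; isValidAvroName("F1!") is false
```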


Thank you in advance!
