We are using Spark SQL with the Parquet data format, and Avro as the schema format. We are trying to use "aliases" on field names and are running into issues when referencing the alias name in a SELECT.
Sample schema, where each field has both a name and an alias:
{ "namespace": "com.test.profile", "type": "record", "name": "profile", "fields": [ {"name": "ID", "type": "string"}, {"name": “F1", "type": ["null","int"], "default": "null", "aliases": [“F1_ALIAS"]}, {"name": “F2", "type": ["null","int"], "default": "null", "aliases": [“F2_ALIAS"]} ] }
Code for SELECT:
val profile = sqlContext.read.parquet("/user/test/parquet_files/*")
profile.registerTempTable("profile")
val features = sqlContext.sql("SELECT F1_ALIAS from profile")
It will throw the following exception:
org.apache.spark.sql.AnalysisException: cannot resolve '`F1_ALIAS`' given input columns: [ID, F1, F2]
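For context, the fallback we are hoping to avoid is renaming each aliased column by hand after the read. A rough sketch only, using the same sqlContext and path as above; the "profile_renamed" temp table name is just illustrative:

// Manual workaround sketch: rename each column to its alias, then query the renamed table.
val profileDf = sqlContext.read.parquet("/user/test/parquet_files/*")
val renamed = profileDf
  .withColumnRenamed("F1", "F1_ALIAS")
  .withColumnRenamed("F2", "F2_ALIAS")
renamed.registerTempTable("profile_renamed")
val features = sqlContext.sql("SELECT F1_ALIAS from profile_renamed")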
Any suggestions for this use case?
On a side note, what characters are allowed in aliases? e.g. is "!" allowed?
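To make the side question concrete: one rough way we could check this ourselves is to parse a schema containing the candidate alias with the Avro Java library and see whether the parser accepts it (a sketch only; assumes the avro jar is on the classpath, and the single-field schema below is just illustrative):

// Parse an illustrative schema whose alias contains "!" and inspect the parsed aliases.
// If the alias is invalid, the parser should throw while parsing the schema string.
import org.apache.avro.Schema

val schemaJson =
  """{"namespace": "com.test.profile", "type": "record", "name": "profile",
    |  "fields": [
    |    {"name": "F1", "type": ["null", "int"], "default": null, "aliases": ["F1!"]}
    |  ]}""".stripMargin

val schema = new Schema.Parser().parse(schemaJson)
println(schema.getField("F1").aliases())   // which aliases survived parsing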
Thank you in advance!