10-05-2016
11:38 AM
We are using Spark SQL with the Parquet data format, and Avro as the schema format. We are trying to use "aliases" on field names and are running into issues when using the alias name in a SELECT. Sample schema, where each field has both a name and an alias:

{
  "namespace": "com.test.profile",
  "type": "record",
  "name": "profile",
  "fields": [
    {"name": "ID", "type": "string"},
    {"name": "F1", "type": ["null","int"], "default": "null", "aliases": ["F1_ALIAS"]},
    {"name": "F2", "type": ["null","int"], "default": "null", "aliases": ["F2_ALIAS"]}
  ]
}

Code for the SELECT:

val profile = sqlContext.read.parquet("/user/test/parquet_files/*")
profile.registerTempTable("profile")
val features = sqlContext.sql("SELECT F1_ALIAS from profile")

It throws the following exception:

org.apache.spark.sql.AnalysisException: cannot resolve '`F1_ALIAS`' given input columns: [ID, F1, F2]

Any suggestions for this use case? On a side note, what characters are allowed in aliases? For example, is "!" allowed? Thank you in advance!
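For reference, a minimal sketch of the kind of workaround we could fall back to, renaming the physical Parquet columns explicitly instead of relying on the Avro aliases (path and column names are just the placeholders from the example above, and it assumes the Spark 1.x SQLContext API shown in the snippet):

// Read the Parquet files; the physical column names are ID, F1, F2.
val profile = sqlContext.read.parquet("/user/test/parquet_files/*")

// Option 1: alias at query time with standard SQL "AS".
profile.registerTempTable("profile")
val features1 = sqlContext.sql("SELECT F1 AS F1_ALIAS FROM profile")

// Option 2: rename the DataFrame column before registering the table.
val renamed = profile.withColumnRenamed("F1", "F1_ALIAS")
renamed.registerTempTable("profile_renamed")
val features2 = sqlContext.sql("SELECT F1_ALIAS FROM profile_renamed")

Both of these hard-code the name mapping in our code, which is what we were hoping to avoid by using the aliases defined in the Avro schema.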
Labels:
- Apache Spark