Created 10-06-2016 01:35 AM
1. I'm using an ETL tool to connect to the Spark2 HiveThriftServer2 with the connection URL "jdbc:hive2://10.30.164.132:10000/nat".
2. I get the error below when the SQL contains the wildcard character "*":
LOAD DATA LOCAL INPATH '${SOURCE_DB_FILE}/V_SOURCE_BTS/${YYYYMMDD}/s0_V_SOURCE_BTS_*_part_*' INTO TABLE ${SCHEMA_BI}V_SOURCE_BTS;
Couldn't execute SQL: LOAD DATA LOCAL INPATH '/u02/CDR/HAITI/MakeFile/V_SOURCE_BTS/20160927/s0_V_SOURCE_BTS_*_part_*' INTO TABLE nat.V_SOURCE_BTS
2016/10/05 14:38:43 - V_SOURCE_BTS -
2016/10/05 14:38:43 - V_SOURCE_BTS - org.apache.spark.sql.AnalysisException: LOAD DATA input path does not exist: /u02/CDR/HAITI/MakeFile/V_SOURCE_BTS/20160927/s0_V_SOURCE_BTS_*_part_*;
3. The query runs without error if I remove the wildcards and name the file explicitly:
LOAD DATA LOCAL INPATH '/u02/CDR/HAITI/MakeFile/V_SOURCE_BTS/20160927/s0_V_SOURCE_BTS_20160927_part_0000000' INTO TABLE ${SCHEMA_BI}V_SOURCE_BTS;
4. The problem has occurred since Spark 2.0.0.
Created 10-06-2016 03:34 AM
I've reported this to the Spark community. It's a bug in Spark SQL:
https://issues.apache.org/jira/browse/SPARK-17796
Once the bug is fixed, the fix should be picked up in HDP 2.5 with Spark 2.0.0.
Created 10-06-2016 03:16 AM
According to the Hive documentation, "filepath can refer to a file (in which case Hive will move the file into the table) or it can be a directory (in which case Hive will move all the files within that directory into the table)."
I don't think wildcards are allowed in the path.
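If the path has to be a plain file or a directory, one possible workaround is to expand the wildcard yourself and issue one LOAD DATA statement per matching file. Below is a minimal Python sketch of that idea. The helper name build_load_statements is hypothetical, and it assumes the script runs on a machine that sees the same local filesystem as the Thrift server. It only builds the statements; sending them over JDBC is left to whatever client you already use.

```python
import glob


def build_load_statements(pattern, table):
    """Expand a local wildcard pattern into one LOAD DATA statement per file.

    Hypothetical helper for illustration; not part of Spark or Hive.
    """
    statements = []
    # sorted() keeps the load order deterministic across runs.
    for path in sorted(glob.glob(pattern)):
        # A single quote in the path would break the SQL string literal.
        if "'" in path:
            raise ValueError(f"unsupported character in path: {path}")
        statements.append(
            f"LOAD DATA LOCAL INPATH '{path}' INTO TABLE {table}"
        )
    return statements


if __name__ == "__main__":
    # Pattern and table name taken from the question above.
    pattern = (
        "/u02/CDR/HAITI/MakeFile/V_SOURCE_BTS/20160927/"
        "s0_V_SOURCE_BTS_*_part_*"
    )
    for stmt in build_load_statements(pattern, "nat.V_SOURCE_BTS"):
        print(stmt + ";")
```

If nothing matches the pattern, the function returns an empty list, so it may be worth checking for that case before running the job.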
Created 10-10-2016 02:23 AM