Hive table not accessible in Spark 2 sql in HDP

I am running the following job on HDP.

export SPARK_MAJOR_VERSION=2

spark-submit --class com.spark.sparkexamples.Audit --master yarn --deploy-mode cluster \
  --files /bigdata/datalake/app/config/metadata.csv \
  BRNSAUDIT_v4.jar dl_raw.ACC /bigdatahdfs/landing/AUDIT/BW/2017/02/27/ACC_hash_total_and_count_20170227.dat TH 20170227

It fails with the following error:

Table or view not found: `dl_raw`.`ACC`; line 1 pos 94;
'Aggregate [count(1) AS rec_cnt#58L, 'count('BRCH_NUM) AS hashcount#59, 'sum('ACC_NUM) AS hashsum#60]
+- 'Filter (('trim('country_code) = trim(TH)) && ('from_unixtime('unix_timestamp('substr('bus_date, 0, 11), MM/dd/yyyy), yyyyMMdd) = 20170227))
   +- 'UnresolvedRelation `dl_raw`.`ACC`

However, the table exists in Hive and is accessible from spark-shell.

High-level code:

val sparkSession = SparkSession.builder
  .appName("spark session example")
  .enableHiveSupport()
  .getOrCreate()

sparkSession.conf.set("spark.sql.crossJoin.enabled", "true")

val df_table_stats = sparkSession.sql("""select count(*) as rec_cnt,count(distinct BRCH_NUM) as hashcount, sum(ACC_NUM) as hashsum from dl_raw.ACC where trim(country_code) = trim('BW') and from_unixtime(unix_timestamp(substr(bus_date,0,11),'MM/dd/yyyy'),'yyyyMMdd')='20170227'""")

The same code ran fine on the CDH platform.

What is surprising is that the same table is accessible in spark-shell.

I tried passing hive-site.xml using the --files option, but it still does not work.

Re: Hive table not accessible in Spark 2 sql in HDP

I am also facing the same issue. https://issues.apache.org/jira/browse/SPARK-15345
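
One way to narrow this down is to check, from inside the submitted job, which catalog the session is actually using and which databases it can see. The following is only a minimal diagnostic sketch, not a fix: the object name CatalogCheck is hypothetical, and it assumes the job is built against Spark 2.x with Hive support on the classpath and submitted the same way as the failing job.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical diagnostic job: submit it with the same spark-submit options
// as the failing application and inspect the driver log.
object CatalogCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("catalog check")
      .enableHiveSupport()
      .getOrCreate()

    // "hive" means the Hive catalog is in use; "in-memory" means the session
    // is not using the Hive metastore at all.
    val impl = spark.sparkContext.getConf
      .get("spark.sql.catalogImplementation", "in-memory")
    println(s"spark.sql.catalogImplementation = $impl")

    // Databases visible to this session. If only "default" is listed, the job
    // is probably not reaching the same metastore that spark-shell uses.
    spark.catalog.listDatabases().show(false)

    // Tables visible in dl_raw. This throws an AnalysisException if the
    // database itself cannot be found.
    spark.catalog.listTables("dl_raw").show(false)

    spark.stop()
  }
}
```

If the output differs from what the same calls print in spark-shell, the difference is most likely in the configuration the job picks up in cluster mode rather than in the query itself.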
