
Tables are empty and not in the right warehouse...


Hi,

After a few days of battling with the new HDP 3.0 to get data from hiveserver2, I think I really need all the help I can get!
Here goes:
- I understand (now!) that in HDP 3.0 there are 2 'default' databases: 1 in Spark and 1 in Hive.
- I applied the "cp /etc/hive/conf/hive-site.xml /etc/spark2/conf" step so that I can write tables to the Hive warehouse from Spark (pyspark in Zeppelin, in fact).

Based on that, the following is happening:
When I create tables from Zeppelin pyspark, the tables are created in the Spark warehouse AND in the Hive warehouse, BUT with no data:

sql_create_table = "CREATE TABLE IF NOT EXISTS car_rental (banner STRING, store STRING, start_date DATE, return_date DATE, car_sipp STRING, car_type STRING) STORED AS ORC TBLPROPERTIES ('transactional' = 'true')"

table_create = hiveContext.sql(sql_create_table)
cars.write.mode('overwrite').format("orc").saveAsTable("car_rental")

With this code, I do find a table 'car_rental' in /warehouse/tablespace/managed/hive/ with all rows from the df 'cars', BUT no data is saved under the dir 'car_rental'.
However, the data is perfect in /apps/hive/warehouse.....

So now:
I'm perfectly content with the data in /apps/hive/warehouse, BUT:

1. When doing a basic 'select * from car_rental' in Data Analytics Studio, no rows are returned (it's probably searching in the default DB of /warehouse/tablespace/managed/hive/).

2. MUCH MORE IMPORTANTLY, the same thing happens when querying the data externally with hiveserver2 (the table is empty).

Please help.... I urgently need to be able to access the full tables saved from pyspark through hiveserver2!
