Not able to view inserted data using spark in hive tables

Contributor

I am using the sandbox image and trying to insert data into Hive using Spark.

These are the steps I followed:

1. Loaded a CSV file from HDFS and created a DataFrame.

2. Registered a temp table on the DataFrame.

3. Created a table in Hive using HiveContext.

4. Inserted data from the temp table into the newly created Hive table.
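
For anyone trying to reproduce this, steps 3 and 4 could be sketched roughly as below in PySpark 1.x style. This is only an illustration, not the poster's actual code: the table names (`staging_tmp`, `newtable`), the columns, and the helpers `build_statements` and `run_statements` are hypothetical, and the context is assumed to be a HiveContext-like object exposing `sql()`.

```python
def build_statements(temp_table, hive_table, columns):
    """Return the HiveQL for steps 3 and 4: create the table, then copy rows.

    `columns` is a list of (name, hive_type) pairs; all names are hypothetical.
    """
    col_defs = ", ".join("{} {}".format(name, typ) for name, typ in columns)
    col_names = ", ".join(name for name, _ in columns)
    create = "CREATE TABLE IF NOT EXISTS {} ({})".format(hive_table, col_defs)
    insert = "INSERT INTO TABLE {} SELECT {} FROM {}".format(
        hive_table, col_names, temp_table)
    return [create, insert]


def run_statements(hive_context, statements):
    """Execute each statement through the context's sql() method."""
    for stmt in statements:
        hive_context.sql(stmt)


# Usage against a real cluster (assumed setup, not executed here):
#   df.registerTempTable("staging_tmp")          # step 2
#   run_statements(hiveContext, build_statements(
#       "staging_tmp", "newtable", [("id", "INT"), ("name", "STRING")]))
#   hiveContext.sql("SELECT * FROM newtable").show()
```

Keeping the statements as plain strings makes it easy to print them and run the same HiveQL from the Hive shell when comparing what Spark and Hive each see.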

Issue: the code runs without any error, but when I try to view the inserted data in Hive, it shows no records.

However, when I execute the query (select * from newtable) using HiveContext, the resulting DataFrame contains data.

PS: I see this issue when my code runs on a Windows server and connects to the sandbox environment. When I run the same code on a MacBook with the sandbox image, the Hive table shows the data.

I am not able to figure out what the cause could be.

Please suggest.

Thanks in advance.

1 ACCEPTED SOLUTION

Master Guru

@payal patel Do you have any existing tables in Hive? Can you do a quick test and check whether you are able to query those tables? If so, can you try inserting into one of them?



I know the tutorial at http://hortonworks.com/hadoop-tutorial/hello-world-an-introduction-to-hadoop-hcatalog-hive-and-pig/#... does this slightly differently: the Spark code saves a DataFrame as an ORC file (step 4.5.2) and then a Hive LOAD command is run (step 4.5.3). Still, your INSERT-AS-SELECT model sounds like it should work. If you supplied a simple working example of what you have, someone could test it and try to resolve the issue for you.
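
The save-then-load variant described above might look roughly like this. It is a sketch under assumptions, not the tutorial's code: the path `/tmp/newtable_orc`, the table name `newtable`, and the helpers `load_statement` and `save_and_load` are placeholders, and the target Hive table is assumed to already exist and be stored as ORC.

```python
def load_statement(hdfs_path, hive_table):
    """Build the Hive LOAD command that moves files at hdfs_path into the table."""
    return "LOAD DATA INPATH '{}' INTO TABLE {}".format(hdfs_path, hive_table)


def save_and_load(df, hive_context, hdfs_path, hive_table):
    """Write the DataFrame as ORC files, then ask Hive to load them."""
    # DataFrameWriter API (df.write) is available from Spark 1.4 onwards.
    df.write.format("orc").save(hdfs_path)
    hive_context.sql(load_statement(hdfs_path, hive_table))


# Usage (assumed names, not executed here):
#   save_and_load(df, hiveContext, "/tmp/newtable_orc", "newtable")
```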

Contributor

Thanks. I do have Hive tables and am able to view their data.

I found two issues and resolved one of them:

- The hosts file had a different hostname mapped to the IP, and my code was connecting directly by IP address. I corrected the IP address in the hosts file.

- I removed the partitionBy statement from the create table statement.

Now I am looking into why partitionBy causes the issue.

Thanks a lot.

Master Guru

@payal patel Please add this to your script and try once again.

hiveContext.setConf("hive.exec.dynamic.partition", "true")
hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
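
These settings matter when the insert lets Hive derive partition values from the data (a dynamic-partition insert), since Hive's strict mode requires at least one static partition. A minimal sketch of how the settings and the insert could fit together is below; the table and column names and the helper `insert_partitioned` are hypothetical, and the context is assumed to be a HiveContext-like object with `setConf()` and `sql()`.

```python
DYNAMIC_PARTITION_CONF = {
    "hive.exec.dynamic.partition": "true",
    "hive.exec.dynamic.partition.mode": "nonstrict",
}


def insert_partitioned(hive_context, source, target, columns, partition_col):
    """Enable dynamic partitioning, then insert with the partition column last."""
    for key, value in sorted(DYNAMIC_PARTITION_CONF.items()):
        hive_context.setConf(key, value)
    # Hive expects dynamic partition columns at the end of the SELECT list.
    select_list = ", ".join(list(columns) + [partition_col])
    stmt = "INSERT INTO TABLE {} PARTITION ({}) SELECT {} FROM {}".format(
        target, partition_col, select_list, source)
    hive_context.sql(stmt)
    return stmt


# Usage (assumed names, not executed here):
#   insert_partitioned(hiveContext, "staging_tmp", "newtable",
#                      ["id", "name"], "country")
```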

Contributor

Thanks, it's working fine now. I am able to view the partitioned data in Hive.

Contributor

Thanks, I'll look into it.