Support Questions

Find answers, ask questions, and share your expertise

In HDP 3.0, can't create Hive table in Spark

avatar
Contributor

Hi, I'm just doing some testing on the newly released HDP 3.0, and the bundled example failed. I tested the same script on a previous HDP platform and it works fine. Can someone advise whether this is due to a new Hive feature or something I have done wrong?

./bin/spark-submit examples/src/main/python/sql/hive.py

Hive Session ID = bf71304b-3435-46d5-93a9-09ef752b6c22

AnalysisException                         Traceback (most recent call last)
/usr/hdp/3.0.0.0-1634/spark2/examples/src/main/python/sql/hive.py in <module>()
     44
     45     # spark is an existing SparkSession
     46     spark.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING) USING hive")
     47     spark.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
     48

/usr/hdp/3.0.0.0-1634/spark2/python/lib/pyspark.zip/pyspark/sql/session.py in sql(self, sqlQuery)
    714         [Row(f1=1, f2=u'row1'), Row(f1=2, f2=u'row2'), Row(f1=3, f2=u'row3')]
    715         """
    716         return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
    717
    718     @since(2.0)

/usr/hdp/3.0.0.0-1634/spark2/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1255         answer = self.gateway_client.send_command(command)
   1256         return_value = get_return_value(
-> 1257             answer, self.gateway_client, self.target_id, self.name)
   1258
   1259         for temp_arg in temp_args:

/usr/hdp/3.0.0.0-1634/spark2/python/lib/pyspark.zip/pyspark/sql/utils.py in deco(*a, **kw)
     67                 e.java_exception.getStackTrace()))
     68             if s.startswith('org.apache.spark.sql.AnalysisException: '):
     69                 raise AnalysisException(s.split(': ', 1)[1], stackTrace)
     70             if s.startswith('org.apache.spark.sql.catalyst.analysis'):
     71                 raise AnalysisException(s.split(': ', 1)[1], stackTrace)

AnalysisException: u'org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Table default.src failed strict managed table checks due to the following reason: 
	Table is marked as a managed table but is not transactional.);'

Much appreciated!

1 ACCEPTED SOLUTION

avatar
Contributor

Hi Aditya,

Thank you for the response.

The issue was that when using Spark to write to Hive, you now have to provide the table format, as below:

df.write.format("orc").mode("overwrite").saveAsTable("tt")  # this runs fine
df.write.mode("overwrite").saveAsTable("tt")                # this command fails

I didn't change anything on the Hive tab after HDP 3.0 was installed.
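For context, the working pattern end to end looks roughly like this. This is an illustrative sketch only, since it needs a live cluster with Hive support; the DataFrame contents and the table name tt are made up:

from pyspark.sql import SparkSession

# Assumes a cluster where Hive support is available.
spark = SparkSession.builder.enableHiveSupport().getOrCreate()

df = spark.createDataFrame([(1, "a"), (2, "b")], ["key", "value"])

# Per the finding above: in HDP 3.0, specifying the format explicitly
# works, while the default saveAsTable path fails the strict
# managed-table (transactional) check.
df.write.format("orc").mode("overwrite").saveAsTable("tt")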


5 REPLIES 5

avatar
Super Guru

@dalin qin,

In HDP 3.0, all managed tables must be transactional. Please enable ACID, make the table transactional, and try again.
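To make the difference concrete, here is a small plain-Python sketch of what the transactional DDL looks like. The helper name create_table_ddl is made up for illustration; only the generated statement matters:

```python
def create_table_ddl(name, columns, transactional=True):
    """Build a Hive CREATE TABLE statement. In HDP 3.0 a managed table
    must carry transactional=true to pass the strict managed-table check."""
    cols = ", ".join("{} {}".format(c, t) for c, t in columns)
    ddl = "CREATE TABLE IF NOT EXISTS {} ({}) STORED AS ORC".format(name, cols)
    if transactional:
        # This table property is what the strict check looks for.
        ddl += " TBLPROPERTIES ('transactional'='true')"
    return ddl

print(create_table_ddl("src", [("key", "INT"), ("value", "STRING")]))
# CREATE TABLE IF NOT EXISTS src (key INT, value STRING) STORED AS ORC TBLPROPERTIES ('transactional'='true')
```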

You can read more about ACID here:

https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/managing-hive/content/hive_acid_operations....


Please "Accept" the answer if this helps.


-Aditya


avatar
Contributor

df.write.format("orc").mode("overwrite").saveAsTable("database.table-name")

When I create a Hive table through Spark, I am able to query the table from Spark, but I have an issue while accessing the table data through Hive. I get the error below.

Error: java.io.IOException: java.lang.IllegalArgumentException: bucketId out of range: -1 (state=,code=0)

I am able to view table metadata.

avatar

Hi Sharma,

Any update on your issue above? I am also facing the same one:

Error: java.io.IOException: java.lang.IllegalArgumentException: bucketId out of range: -1 (state=,code=0)


avatar

Hi

I faced the same issue; after setting the following properties, it is working fine.

set hive.mapred.mode=nonstrict;
set hive.optimize.ppd=true;
set hive.optimize.index.filter=true;
set hive.tez.bucket.pruning=true;
set hive.explain.user=false;
set hive.fetch.task.conversion=none;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
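If you prefer not to type these per session, the same properties can also be passed when starting beeline with its --hiveconf option. This is an illustrative fragment, not a tested command; the host and port are placeholders for your cluster:

beeline -u "jdbc:hive2://<host>:10000/default" \
  --hiveconf hive.support.concurrency=true \
  --hiveconf hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager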