Member since
12-10-2017
02-23-2018
10:01 PM
"alisa houskova" - As of version 2.2.1, saveAsTable("") is not deprecated, as shown in the screenshot below. I am not sure which method you were referring to.
01-14-2018
10:48 PM
1 Kudo
Versions:
HDP-2.6.1
Hive 1.2.1000.2.6.1.0-129
Spark-2.1.1
Python 2.7.13

This issue occurs only on a transactional Hive table. In HDFS, the data file for a transactional Hive table is created under a delta directory, as shown below:
/user/acid_table/load_date=2018-01-14/delta_0018772_0018772_0000/bucket_00000

A NumberFormatException is thrown on the delta directory:
Caused by: java.util.concurrent.ExecutionException: java.lang.NumberFormatException: For input string: "0018773_0000"
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
.....
INFO PerfLogger: <PERFLOG method=OrcGetSplits from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
Traceback (most recent call last):
File "/home/../ex.py", line 24, in <module>
sc1.sql("select * from default.acid_table").toPandas()
File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 1585, in toPandas
File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 391, in collect
File "/usr/hdp/current/spark2-client/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
File "/usr/hdp/current/spark2-client/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o71.collectToPython.
: java.lang.RuntimeException: serious problem
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
Code:
from pyspark.sql import SparkSession

hiveContext = SparkSession.builder.enableHiveSupport().getOrCreate()
hiveContext.sql("select * from default.acid_table").toPandas()

Everything works fine when the '0000' suffix is removed from the delta directory. Please suggest a fix or workaround.
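One possible reading of the error, offered as an assumption and not confirmed anywhere in this thread: the ORC reader in use expects delta directory names with two numeric fields (delta_<min>_<max>), while the directory on HDFS carries a third '_0000' field. The sketch below is plain Python, not Hive or Spark code, and both function names are hypothetical. It only illustrates how a two-field parser would fail on the three-field name with a message shaped like the one above.

```python
import re

DIGITS = re.compile(r"[0-9]+\Z")
PREFIX = "delta_"


def parse_delta_legacy(dirname):
    """Hypothetical two-field parser: delta_<min>_<max>."""
    if not dirname.startswith(PREFIX):
        raise ValueError("not a delta directory: %s" % dirname)
    rest = dirname[len(PREFIX):]
    # Split only at the first underscore; everything after it is
    # treated as a single number, as a strict two-field parser would.
    min_txn, _, max_txn = rest.partition("_")
    for field in (min_txn, max_txn):
        # Strict digit check. Python's int() may accept underscores
        # between digits, so it is not used for validation here.
        if not DIGITS.match(field):
            raise ValueError('For input string: "%s"' % field)
    return int(min_txn), int(max_txn)


def parse_delta_with_suffix(dirname):
    """Hypothetical parser that also accepts delta_<min>_<max>_<suffix>."""
    if not dirname.startswith(PREFIX):
        raise ValueError("not a delta directory: %s" % dirname)
    parts = dirname[len(PREFIX):].split("_")
    if len(parts) not in (2, 3) or not all(DIGITS.match(p) for p in parts):
        raise ValueError("unexpected delta name: %s" % dirname)
    suffix = int(parts[2]) if len(parts) == 3 else None
    return int(parts[0]), int(parts[1]), suffix
```

With the directory name from this post, parse_delta_legacy("delta_0018772_0018772_0000") raises ValueError on the field "0018772_0000", while parse_delta_with_suffix returns the three fields separately. If this reading is right, it would also be consistent with the query working once the '0000' suffix is removed.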
Labels:
Apache Hive