Member since 01-27-2016 · 2 Posts · 2 Kudos Received · 0 Solutions
02-02-2016 08:54 PM
@Artem Ervits Yes, I am still having the issue, though I have moved on to other things. What is the correct response in that scenario?
01-27-2016 11:30 AM
2 Kudos
Using the sandbox, I saved a DataFrame as a Parquet-backed Hive table with:

df.write.format('parquet').mode('overwrite').saveAsTable(myfile)

followed by:

sqlContext.refreshTable(myfile)

When I attempt to query the table with SparkSQL or Hive, I get the error:

{"message":"H170 Unable to fetch results. java.io.IOException: java.io.IOException: hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/myfile/part-r-00000-5dc24bf0-23ef-4f3c-a1fc-42928761592d.gz.parquet not a SequenceFile [ERROR_STATUS]","status":500,"trace":"org.apache.ambari.view.hive.client.HiveErrorStatusException: H170 Unable to fetch results. java.io.IOException: java.io.IOException: ....

The issue started after I replaced the Parquet file underlying the original df and attempted to rebuild the table. The DataFrame itself is fine: df.head(10) shows the data. Things I have tried, none of which cleared the issue:

- Manually deleting the Parquet files and the Hive files under the warehouse; the error returns as soon as I resave the table.
- sqlContext.setConf("spark.sql.hive.convertMetastoreParquet", "false")
- os.environ["HADOOP_USER_NAME"] = 'hdfs'
- Unpersisting the DataFrame.
- Changing the permissions with os.system('hdfs dfs -chmod -R 777 /apps/hive/warehouse')

I have seen resolutions involving the steps above, but none have helped; I still cannot get back to accessing the data via Hive or SparkSQL.
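For reference, here is a minimal, self-contained sketch of the sequence described above, assuming a Spark 1.x HiveContext bound to sqlContext (as on the sandbox) and using placeholder names ('myfile' for the table, '/path/to/source.parquet' for the input); the DROP TABLE step is there so the metastore cannot retain a stale storage format for the table:

from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="parquet-table-repro")
sqlContext = HiveContext(sc)  # Hive-backed context, required for saveAsTable to hit the metastore

# Load the source data (placeholder path).
df = sqlContext.read.parquet("/path/to/source.parquet")

# Drop any stale table definition first, then recreate it from the DataFrame
# and refresh Spark's cached metadata for it.
sqlContext.sql("DROP TABLE IF EXISTS myfile")
df.write.format("parquet").mode("overwrite").saveAsTable("myfile")
sqlContext.refreshTable("myfile")

# Verify the table is queryable again through SparkSQL.
sqlContext.sql("SELECT * FROM myfile LIMIT 10").show()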
Labels:
- Apache Hive
- Apache Spark