11-07-2016 07:49 AM
Still have the problem on CDH-5.7.
11-02-2016 12:36 PM
Hi everybody, I'm seeing the same issue on CDH 5.5 (Spark 1.6.0) with my Spark Streaming job. Data is read from a Kafka broker and then inserted into a Hive table partitioned by year/month/day/hour. All the data is present in the table after the insertInto() call, but the 'hive-staging....' directory created during the batch is still there, and empty. Resources are allocated by YARN, and there are no errors about file creation/deletion in the executor logs. I have tried a lot of settings without any success (regarding log persistence etc.). The micro-batch runs every 10 seconds, so the job will produce a lot of useless empty directories.
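For anyone who wants to see how many of these leftovers are piling up, here is a rough sketch of a check that lists (and optionally removes) empty staging directories. It is not a fix for the underlying problem, and it makes several assumptions: it walks a local path with `pathlib` as a stand-in for HDFS, it matches any directory whose name contains `hive-staging`, and the function names `find_empty_staging_dirs` and `remove_dirs` as well as the sample table layout are made up for illustration.

```python
import tempfile
from pathlib import Path


def find_empty_staging_dirs(root, marker="hive-staging"):
    """Return, sorted, the directories under root whose name contains
    marker and which have no entries at all."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_dir() and marker in p.name and not any(p.iterdir())
    )


def remove_dirs(dirs):
    """Remove each directory; rmdir refuses to delete a non-empty one."""
    for d in dirs:
        d.rmdir()


if __name__ == "__main__":
    # Throwaway layout that mimics a partitioned table (hypothetical names).
    with tempfile.TemporaryDirectory() as tmp:
        table = Path(tmp) / "events"
        (table / "year=2016" / "month=11" / "day=02" / "hour=12").mkdir(parents=True)
        (table / ".hive-staging_a").mkdir()   # empty leftover
        (table / ".hive-staging_b").mkdir()   # still holds a file
        (table / ".hive-staging_b" / "part-00000").write_text("in flight")

        leftovers = find_empty_staging_dirs(table)
        print([p.name for p in leftovers])
        remove_dirs(leftovers)
```

Only empty directories are touched, so a staging directory that a running batch is still writing into is left alone as long as it already contains a file. On a real cluster the same walk would have to go through an HDFS client rather than the local filesystem.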