All of the answers here were for Hive, not Impala. Architecturally, Impala handles things very differently from Hive: it does not use YARN for its execution, nor does it store its spill/intermediate data on HDFS.

The local storage paths for YARN are used for job-intermediate data (not query-intermediate data, which is at a higher level). For example, they are used when a map phase sorts its data and sends it to the reduce phase, when the reducer merges and sorts the incoming map data, and so on. This storage is meant for a container's transient data that does not need to persist beyond the life of the job.

Query stage results and final results need to be persisted for a finite time, so they go on HDFS until they are automatically cleaned up by the query execution logic.

On the topic of replication: yes, the 'temporary' HDFS data from the inter-stage phases of queries may be replicated, but it is cleared once the query reaches any completion state. Final query results are also deleted after the results are extracted, or when the query/session is marked closed by the user/app issuing it.

FWIW, you could use more RAM to achieve a lower disk cost. Hive with MR is very disk-oriented because each stage is its own job and the jobs use local storage when running, but Hive with Spark may use much less disk space during a query due to better stage transitions (all within the same app, instead of separate apps). Likewise, Impala uses up RAM, but does not impact your local disk storage unless it finds inadequate memory to execute the query.

Hope this helps.
Hello @Nakul,

Please post the full stack trace so we can do a better job of understanding the context. This error indicates that your Hue code may have been upgraded but the Hue database was not upgraded to add the "is_managed" column. It is likely that the South migration is failing, and we should check to see why.

Try restarting the Hue service, then:
- Click the Instances sub-tab
- Click on the Hue Server link to view the Hue Server page
- Click on the "Log Files" drop-down and choose "stderr"

Review the bottom of the log for any stack traces that occur just after you see a line with the following:

lib/hue/build/env/bin/hue migrate --merge

If you do see any error messages or stack traces, share them with us.
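
If the stderr log is long, the review step above can be roughed out in a few lines of Python. This is only a sketch under assumptions: it assumes you have saved the stderr log to a local text file, that stack traces appear in the standard Python "Traceback (most recent call last):" form without a per-line prefix, and the function names (lines_after_migrate, find_tracebacks) are made up for illustration and are not part of Hue.

```python
# Marker line quoted in the post above; everything else here is illustrative.
MIGRATE_MARKER = "hue migrate --merge"


def lines_after_migrate(log_lines, marker=MIGRATE_MARKER):
    """Return the lines that follow the last line containing the marker."""
    last = None
    for i, line in enumerate(log_lines):
        if marker in line:
            last = i
    if last is None:
        return []  # the migrate command never shows up in this log
    return [line.rstrip("\n") for line in log_lines[last + 1:]]


def find_tracebacks(lines):
    """Collect standard Python traceback blocks from a list of lines.

    Assumes each traceback starts with the usual header, continues with
    indented frame lines, and ends at the first non-indented line
    (the exception message).
    """
    blocks, current = [], None
    for line in lines:
        if line.startswith("Traceback (most recent call last):"):
            current = [line]
        elif current is not None:
            current.append(line)
            if not line.startswith(" "):
                blocks.append("\n".join(current))
                current = None
    if current:
        blocks.append("\n".join(current))  # log ended mid-traceback
    return blocks


# Example usage (the file name is a placeholder for wherever you saved the log):
#   with open("hue_stderr.log") as f:
#       for tb in find_tracebacks(lines_after_migrate(f.readlines())):
#           print(tb, end="\n\n")
```

Whatever this prints is what would be worth pasting into your reply. If it prints nothing, check the log by hand, since the assumptions above may not hold for your log format.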