Member since: 08-16-2016
642 Posts
131 Kudos Received
68 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3976 | 10-13-2017 09:42 PM |
| | 7474 | 09-14-2017 11:15 AM |
| | 3798 | 09-13-2017 10:35 PM |
| | 6033 | 09-13-2017 10:25 PM |
| | 6601 | 09-13-2017 10:05 PM |
06-16-2017 11:09 AM
The semicolon is needed after each HQL statement, not at the end of the script name. Remove it and try again. Please share the contents of the script as well. If I recall correctly, if you don't use 'exit' in the HQL script it will dump you into the Hive shell. Without knowing the rest of the script, I'd say it is behaving as expected.
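As a hedged sketch (the script path and statements are made up), this is the shape that works for me: every statement inside the file ends with a semicolon, the command line carries none, and 'exit' closes the session at the end.

```python
# Hypothetical sketch: run an HQL script non-interactively with `hive -f`.
# Contents of /tmp/example.hql (every statement ends with a semicolon):
#   USE default;
#   SHOW TABLES;
#   exit;
# Note: no semicolon after the script name on the command line itself.
import subprocess

subprocess.run(["hive", "-f", "/tmp/example.hql"], check=True)
```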
06-16-2017 11:06 AM
Try taking off the './'. Distributing the file to the executors places it in the working directory of each one, so it can be referenced by its bare name. The './' notation means to run a file, not read a file.
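A hedged PySpark sketch of what I mean (the file and app names are hypothetical), assuming the file was shipped with spark-submit --files on YARN:

```python
# Hypothetical sketch: a file shipped via `spark-submit --files lookup.txt`
# is placed in each executor's working directory on YARN, so a task can open
# it by its bare name, without a './' prefix or an absolute path.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("files-demo").getOrCreate()

def read_first_line(_):
    # Runs on an executor; 'lookup.txt' resolves in its working directory.
    with open("lookup.txt") as f:
        return [f.readline().strip()]

print(spark.sparkContext.parallelize([0], 1).flatMap(read_first_line).collect())
```

Submitted as something like `spark-submit --master yarn --files lookup.txt files_demo.py`.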
06-16-2017 10:16 AM
A way to think about it is: "How will it know the location value?" It won't. You must tell it the value, per row, so it can partition dynamically as it loads the data. If the location value is the same for all of the data in the DF, then you may be better served by loading it statically: create the subfolder for the location value under the table's path and then write the DF out to that location.
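A rough PySpark sketch of the static route (the table path, partition column, and values here are assumptions based on this thread):

```python
# Hypothetical sketch: every row shares one 'location' value, so skip dynamic
# partitioning and write the DataFrame straight into that partition's
# subfolder under the table's path, then register the partition.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("static-partition")
         .enableHiveSupport()
         .getOrCreate())

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])

# Write into the partition directory directly...
df.write.mode("append").parquet("/warehouse/mytable/location=us_east")
# ...and tell the metastore that the partition now exists.
spark.sql("ALTER TABLE mytable ADD IF NOT EXISTS PARTITION (location='us_east')")
```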
06-16-2017 10:14 AM
Your case class must include it so that it maps to the DF correctly. The error is because it is now looking for the location column in the DF and it doesn't exist. Make the change to your class and DF; then it should be good.
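The same idea as a PySpark sketch (the thread uses a Scala case class; the explicit schema below is its rough analog, and all the names are hypothetical):

```python
# Hypothetical sketch: the schema the rows are mapped onto (the analog of the
# Scala case class) must declare the partition column, or Spark will fail to
# find it in the DF at write time.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

spark = SparkSession.builder.appName("schema-demo").getOrCreate()

schema = StructType([
    StructField("id", IntegerType()),
    StructField("val", StringType()),
    StructField("location", StringType()),  # the partition column must be here
])

df = spark.createDataFrame([(1, "a", "us_east"), (2, "b", "us_east")], schema)
df.printSchema()
```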
06-16-2017 10:12 AM
I don't know of anything that would allow it to read the configuration files from the other file. The difference between the mapred and mapreduce settings is the MR API. I think it is possible, then, that the MR app you have is using the older API, and maybe that is why the mapred settings were working before. You can check the configuration settings for each MR job through the RM UI; use that to verify the exact settings used in each run. On the resource usage, the number of maps is determined by the input. Are you positive that the same amount of data and blocks was used by the app in each run?
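For reference, a few of the old-API/new-API property pairs (a non-exhaustive sketch; the authoritative list is the deprecated-properties table for your Hadoop version):

```python
# A few old-API (mapred.*) property names and their new-API (mapreduce.*)
# equivalents. Jobs built on the old API honor the mapred.* keys, which would
# explain why those settings appeared to work before.
OLD_TO_NEW = {
    "mapred.map.tasks": "mapreduce.job.maps",
    "mapred.reduce.tasks": "mapreduce.job.reduces",
    "mapred.job.queue.name": "mapreduce.job.queuename",
}

for old, new in OLD_TO_NEW.items():
    print(f"{old:24} -> {new}")
```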
06-15-2017 02:16 PM
I would track down the logs for container container_e14_14XXXXXXXXXXX_XXXXX_01_000001. That should contain more details on the actual error.
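If the application has finished and log aggregation is enabled, the yarn logs CLI can pull them; a hedged sketch (the redacted ids from above are kept as placeholders):

```python
# Hypothetical sketch: fetch one container's logs with the `yarn logs` CLI.
# The application id is the container id minus the epoch ("e14") and the
# container suffix; the X placeholders mirror the redacted ids in this thread.
import subprocess

subprocess.run([
    "yarn", "logs",
    "-applicationId", "application_14XXXXXXXXXXX_XXXXX",
    "-containerId", "container_e14_14XXXXXXXXXXX_XXXXX_01_000001",
], check=True)
```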
06-15-2017 02:12 PM
Check the logs for the Jhist role. The stderr log should have the exception or error that caused it to fail to start. The issue with the jobs is that on the worker nodes the yarn.nodemanager.local-dirs directories (there can be more than one) do not have enough space. Check your config, and then check the space on those directories on the worker nodes.
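A quick sketch of the space check on one worker (the directory list is an assumption; take the real values from yarn.nodemanager.local-dirs in your config):

```python
# Hypothetical sketch: report free space for each yarn.nodemanager.local-dirs
# path on a worker node. The paths below are placeholders.
import shutil

local_dirs = ["/data/1/yarn/nm", "/data/2/yarn/nm"]

for d in local_dirs:
    usage = shutil.disk_usage(d)
    print(f"{d}: {usage.free / 1e9:.1f} GB free of {usage.total / 1e9:.1f} GB")
```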
06-15-2017 02:08 PM
My issue was that I didn't have that placement rule in place. I created the placement rule to allow it to be specified at runtime, and then my queries were assigned to the correct queue. Is that rule above (does it have a lower number than) the default pool rule? The other item to check is that the user you are running the queries as has access to submit to the pool.
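To illustrate why the order matters, a hedged sketch of the first-match evaluation (an illustration of the semantics only, not the scheduler's actual code):

```python
# Illustrative sketch: placement rules are evaluated top-down and the first
# match wins, so the "specified at runtime" rule must sit above the
# default-pool rule or requests never reach it.
rules = ["specified", "default"]  # rule 1, rule 2

def assign_pool(requested_pool):
    for rule in rules:
        if rule == "specified" and requested_pool:
            return requested_pool
        if rule == "default":
            return "root.default"

print(assign_pool("root.analysts"))  # -> root.analysts
print(assign_pool(None))             # -> root.default
```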
06-15-2017 01:32 PM
Make sure that you have the placement rule to allow pools to be specified at runtime. This got me one time: I kept getting the messages that it was set, but it would continue to run in the default queue. As for running it with -q, I haven't tried it, but I imagine it would be similar to Hive: impala-shell -i xxxx -q "set request_pool=new_pool; select ..."
06-15-2017 12:26 PM
1 Kudo
Try adding the saslQop config to the connection configuration. The actual value will need to match your cluster's Hive configuration: hive.connect('localhost', configuration={'hive.server2.thrift.sasl.qop': 'auth-conf'})
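Fleshed out as a hedged PyHive sketch (the host, port, and Kerberos details are assumptions for a cluster where the qop is auth-conf):

```python
# Hypothetical sketch using PyHive against a Kerberized HiveServer2 whose
# hive.server2.thrift.sasl.qop is set to auth-conf on the server side; the
# client-side value must match it.
from pyhive import hive

conn = hive.connect(
    host="localhost",
    port=10000,
    auth="KERBEROS",
    kerberos_service_name="hive",
    configuration={"hive.server2.thrift.sasl.qop": "auth-conf"},
)

cur = conn.cursor()
cur.execute("SELECT 1")
print(cur.fetchall())
```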