After upgrading to CDH 5.4, we are unable to run impala-shell from within a shell script.
Here is the script:
The Python egg cache directory is currently set to:
/tmp/impala-shell-python-egg-cache-myuser
Perhaps your account does not have write access to this directory? You can change the cache directory by setting the PYTHON_EGG_CACHE environment variable to point to an accessible directory.
Has the way the environment variable needs to be set changed?
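For reference, the change the error message suggests could be sketched roughly like this. It assumes `mktemp -d` gives a private directory that is writable by whichever account actually runs the script; `run_query` is a hypothetical helper name, not something from the original script.

```shell
# Assumption: mktemp -d creates a fresh directory owned by, and writable for,
# the account that executes this script.
PYTHON_EGG_CACHE="$(mktemp -d)"
export PYTHON_EGG_CACHE

# Hypothetical helper: hand a single query string to impala-shell, which
# inherits PYTHON_EGG_CACHE from the exported environment.
run_query() {
    impala-shell -q "$1"
}
```

A call such as `run_query 'SELECT 1'` would then unpack its eggs into the new directory instead of the one under /tmp named in the error. The directory can be removed once the script finishes.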
I can't see anything wrong with your query script right away, but here is a suggestion based on how I ended up coding similar tasks.
Rather than using the shell action, which is slow and consumes a slot because it runs as an MR task, I have all shell commands executed over ssh, with the following advantages:
- Much faster
- No MR slot used
- All user and environment hurdles can be solved and tested by simply ssh'ing to the account yourself and running your command
- In my case I use the Anaconda Python distribution extensively, and I can keep it sandboxed to the ssh user rather than depending on the yarn account
The drawbacks:
- Hue doesn't display the logs: they are saved in your user's directory
- No "load balancing": you have to name the server that runs the command
You have to pre-configure the certificates (SSH keys) on the machines, but it is simple after that.
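A minimal sketch of the ssh approach, assuming key-based login is already set up. `REMOTE_USER`, `REMOTE_HOST`, and `run_remote` are placeholder names for illustration, so substitute your own account and server.

```shell
# Placeholder values: the account and the server that should run the command.
REMOTE_USER="etl"
REMOTE_HOST="edge-node.example.com"

# Run a command line on the remote host under the remote user's own environment.
# BatchMode=yes makes ssh fail instead of prompting for a password, so a missing
# key setup shows up as an error rather than a hang.
run_remote() {
    ssh -o BatchMode=yes "${REMOTE_USER}@${REMOTE_HOST}" "$@"
}
```

For example, `run_remote 'impala-shell -q "SELECT 1"'` runs the query on the named server, and the same command line can be tested by hand by logging in to that account first.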