Issue running Impala Shell Commands with Oozie
Labels: Apache Impala, Apache Oozie, Kerberos
Created on 01-29-2018 06:47 AM - edited 09-16-2022 05:47 AM
Hi,
When I run Impala shell commands from an Oozie workflow, following the steps recommended here:
http://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Impala-schedule-with-oozie-tutorial/td-...
https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/How-to-Schedule-Impala-Jobs-with-Oozie...
I get the following error:
Traceback (most recent call last):
  File "/app/localstorage/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/bin/../lib/impala-shell/impala_shell.py", line 38, in <module>
    from impala_client import (ImpalaClient, DisconnectedException, QueryStateException,
  File "/app/localstorage/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/impala-shell/lib/impala_client.py", line 20, in <module>
    import sasl
  File "/app/localstorage/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/impala-shell/ext-py/sasl-0.1.1-py2.7-linux-x86_64.egg/sasl/__init__.py", line 1, in <module>
  File "/app/localstorage/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/impala-shell/ext-py/sasl-0.1.1-py2.7-linux-x86_64.egg/sasl/saslwrapper.py", line 7, in <module>
  File "/app/localstorage/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/impala-shell/ext-py/sasl-0.1.1-py2.7-linux-x86_64.egg/_saslwrapper.py", line 7, in <module>
  File "/app/localstorage/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/impala-shell/ext-py/sasl-0.1.1-py2.7-linux-x86_64.egg/_saslwrapper.py", line 6, in __bootstrap__
ImportError: /tmp/impala-shell-python-egg-cache-subuser/sasl-0.1.1-py2.7-linux-x86_64.egg-tmp/_saslwrapper.so: failed to map segment from shared object: Operation not permitted
We are using Kerberos, and this folder has 777 permissions, but the error is still thrown.
How do we resolve this?
Created 02-01-2018 07:35 AM
Problem Solved
The issue was that the Cloudera-provided script /bin/impala_shell on our nodes hardcodes PYTHON_EGG_CACHE, so it was always reset to a directory under /tmp, whatever we set beforehand.
Below is a snippet of the code in /bin/impala_shell:
# We should set the EGG_CACHE to a per-user temporary location.
# This follows what hue does.
PYTHON_EGG_CACHE=/tmp/impala-shell-python-egg-cache-${USER}
if [ ! -d ${PYTHON_EGG_CACHE} ]; then
  mkdir ${PYTHON_EGG_CACHE}
fi
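Additionally, /tmp was not mounted with exec on all nodes. On a noexec mount, the shared library _saslwrapper.so extracted into the egg cache cannot be mapped as executable, which produces the "Operation not permitted" error regardless of the folder's 777 permissions.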
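To check whether a node is affected, one option is to look at the mount options of the filesystem holding the egg cache. The sketch below is not from the original post: the function names `has_noexec` and `mount_options_for` are made up for illustration, and it assumes `findmnt` from util-linux is available on the node.

```shell
#!/bin/sh
# Return 0 if a comma-separated mount option string contains "noexec".
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the mount options of the filesystem that holds the given path.
# Assumption: findmnt (util-linux) is installed; prints nothing otherwise.
mount_options_for() {
  command -v findmnt >/dev/null 2>&1 || return 0
  findmnt -n -o OPTIONS --target "$1"
}

# Directory to check; defaults to /tmp, where impala_shell puts its egg cache.
check_dir="${1:-/tmp}"
opts="$(mount_options_for "$check_dir")"
if has_noexec "$opts"; then
  echo "$check_dir is on a noexec mount: $opts"
else
  echo "$check_dir is not on a noexec mount (or the options are unknown): $opts"
fi
```

Because Oozie can launch the shell action on any node, the check would need to run on every node that may execute the workflow, not just the one where you tested by hand.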
To work around this, we added the following code to the Impala shell script that we run with Oozie:
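export PYTHON_EGG_CACHE=/app/bds
export link_folder=/tmp/impala-shell-python-egg-cache-$(whoami)
# Replace the /tmp cache directory with a symlink into the shared folder.
if ! [ -L "$link_folder" ]
then
  rm -Rf "$link_folder"
  ln -sfn "${PYTHON_EGG_CACHE}${link_folder}" "${link_folder}"
fi
# Create the symlink target if it does not exist yet.
mkdir -p "${PYTHON_EGG_CACHE}${link_folder}"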
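This replaces the per-user egg cache directory under /tmp with a symlink to a directory under /app/bds, a shared folder that all nodes can access. The /bin/impala_shell script still sets PYTHON_EGG_CACHE to the /tmp path, but the extracted files now land on the shared folder.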
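The same steps can be wrapped in a small function, so they can be tried against a scratch directory before touching /tmp. This is only a sketch under assumptions: the name `redirect_egg_cache` and its two parameters are hypothetical, and unlike the snippet in the post it always refreshes the symlink, so a stale link pointing elsewhere gets corrected.

```shell
#!/bin/sh
# redirect_egg_cache SHARED_BASE LINK_PATH   (hypothetical helper)
# Makes LINK_PATH a symlink to SHARED_BASE/LINK_PATH, creating the target
# directory first so the link never dangles.
redirect_egg_cache() {
  shared_base="$1"
  link_path="$2"
  target="${shared_base}${link_path}"

  mkdir -p "$target" || return 1

  # A real directory (or file) left behind by an earlier run must go,
  # otherwise ln would create the link inside it.
  if [ -e "$link_path" ] && [ ! -L "$link_path" ]; then
    rm -rf "$link_path"
  fi

  # -n replaces an existing symlink instead of following it.
  ln -sfn "$target" "$link_path"
}
```

In the Oozie shell script it would be called as `redirect_egg_cache /app/bds "/tmp/impala-shell-python-egg-cache-$(whoami)"` before invoking impala-shell, where /app/bds is the shared folder from the post.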