Member since 09-03-2015
5 Posts
0 Kudos Received
0 Solutions
07-10-2017
01:59 AM
Thanks Harsh!! That worked very well. I installed YARN gateway roles on all my nodes and then set 'export HADOOP_CONF_DIR=/etc/hadoop/conf' in my shell action scripts. (I didn't set it through the shell action's env-var setting, to avoid killing the job.) Previously, HADOOP_CONF_DIR was pointing to '/run/cloudera-scm-agent/process/11552-yarn-NODEMANAGER'.
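For anyone landing here later, the top of such a shell action script might look roughly like the sketch below. The check_conf_dir helper is hypothetical and only there to fail loudly on a node without the gateway config. It assumes the deployed client configuration contains yarn-site.xml. The hive line is a placeholder for whatever command the action really runs.

```shell
#!/bin/sh
# Point Hadoop clients at the gateway-deployed client configuration
# instead of the NodeManager process directory inherited from the container.
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Hypothetical helper (not part of Hadoop): report whether a directory
# looks like a usable client config, assuming it should hold yarn-site.xml.
check_conf_dir() {
  if [ -f "$1/yarn-site.xml" ]; then
    echo "ok"
  else
    echo "missing"
  fi
}

echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR ($(check_conf_dir "$HADOOP_CONF_DIR"))"

# The real Hive or Sqoop command would follow here, for example:
# hive -e 'SELECT 1'
```

If the script prints "missing" on some node, the YARN gateway role has probably not been deployed there yet.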
07-09-2017
06:43 AM
I have a CDH 5.3.9 cluster. Earlier, we used MRv1 across our cluster for all services and clients. Owing to the nature of our applications, we have to invoke Sqoop and Hive commands within a shell action in Oozie. Under MRv1, this shell action ran properly in a distributed way.

Recently, we moved from MRv1 to YARN. Everything runs smoothly in YARN containers, except that Hive and Sqoop commands within the Oozie shell action run in 'Local Mapred' mode. They produce correct results, but they run in local mode. When I log into my datanodes (where my shell actions would run) and manually invoke the Sqoop and Hive commands (tried as various users: yarn, mapred, hdfs), I can see a proper tracking URL for the job being submitted to YARN (i.e. in non-local mode).

I know I am missing some configuration that was previously not needed by MRv1. Can someone please help me set up my shell actions?

Some more details: I have HA set up on HDFS as well as on MRv2. Oozie and Hive are correctly able to use YARN and submit jobs to it. My shell actions themselves run via YARN. The only problem is the sqoop/hive commands within the shell action in Oozie.
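One way to see which mode a client will pick inside the shell action is to print the framework configured in the active config directory. The sketch below is only a rough diagnostic: framework_name is a hypothetical helper, it assumes the usual one-tag-per-line layout of mapred-site.xml, and it falls back to "local" because that is Hadoop's default when mapreduce.framework.name is not set.

```shell
#!/bin/sh
# Hypothetical helper: print the value of mapreduce.framework.name found in
# CONF_DIR/mapred-site.xml, or "local" (Hadoop's default) when it is absent.
framework_name() {
  conf_file="$1/mapred-site.xml"
  value=""
  if [ -f "$conf_file" ]; then
    # Look only between the property's <name> line and its closing tag.
    value=$(sed -n '/<name>mapreduce.framework.name<\/name>/,/<\/property>/ s:.*<value>\(.*\)</value>.*:\1:p' "$conf_file" | head -n 1)
  fi
  echo "${value:-local}"
}

# Inside the shell action, this shows what the Hive/Sqoop client would see.
echo "conf dir:  ${HADOOP_CONF_DIR:-unset}"
echo "framework: $(framework_name "${HADOOP_CONF_DIR:-/etc/hadoop/conf}")"
```

Running the same two echo lines from an interactive login on the node and from within the Oozie action should make any difference in configuration visible.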
10-12-2016
10:45 AM
Actually, that is exactly the missing piece I am trying to figure out. I am aware that the Impala shell runs as whatever user I log in as, which is the hdfs user in my case. However, that is not sufficient for Impala to access a jar in HDFS under a directory with 700 permissions owned by hdfs, whereas the Hive client shell, which similarly runs as the hdfs user in my case, is able to access it. So I am assuming the cause is that the impalad daemon runs as the impala user? Authorization is not what I am looking for.
10-11-2016
04:06 AM
I need some clarity on my chosen solution. I have a CDH 5.3.9 cluster. After assigning roles, I wanted to add some custom UDFs to Hive and Impala. The UDF .jar is placed in /user/hdfs in HDFS. /user/hdfs has 700 permissions for hdfs:supergroup.

For Hive (NOT HiveServer2), version 0.13:
1. Log in to the Hive CLI as the hdfs user.
2. Execute the create function statement.
It works. It can access the .jar and create the function, and I can test the UDF etc.

For Impala, version 2.1.7:
1. Log in to the Impala CLI as the hdfs user.
2. Execute the create function statement.
It doesn't work, since Impala doesn't have permission to access /user/hdfs.

It does work if I add the impala user to supergroup in Linux (since impala then belongs to the HDFS superuser group), OR if I give execute permission to other users on /user/hdfs.

If I do a ps aux to see how the CLI is handled in the Hive and Impala cases, I can see it running as the hdfs user (since I logged in as hdfs), so I assumed it should have access to /user/hdfs for Impala as well. But it looks like that is not sufficient for Impala, yet somehow works for Hive.

Is it because for Hive I am using a plain client, which has access to /user/hdfs since the login user is hdfs, while Impala has to go through impalad, which runs as the impala user and doesn't have access to /user/hdfs? Can someone please clarify what is going on here?
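The behaviour described above is consistent with the HDFS access check being made against the impala user rather than the logged-in hdfs user. The sketch below is a simplified model of that check, not HDFS code: can_traverse is a hypothetical helper, it ignores ACLs and the NameNode's own superuser, and it assumes the superuser group is named supergroup (the default for dfs.permissions.superusergroup).

```shell
#!/bin/sh
# Hypothetical, simplified model of an HDFS directory traverse check.
# Usage: can_traverse MODE OWNER GROUP USER "USER_GROUPS" [SUPERGROUP]
can_traverse() (
  mode=$((0$1)); owner=$2; group=$3; user=$4; user_groups=$5
  supergroup=${6:-supergroup}
  in_group=no; in_super=no
  for g in $user_groups; do
    [ "$g" = "$group" ] && in_group=yes
    [ "$g" = "$supergroup" ] && in_super=yes
  done
  if [ "$in_super" = yes ]; then
    bits=7                        # superuser-group members bypass the check
  elif [ "$user" = "$owner" ]; then
    bits=$(( (mode >> 6) & 7 ))   # owner permission bits
  elif [ "$in_group" = yes ]; then
    bits=$(( (mode >> 3) & 7 ))   # group permission bits
  else
    bits=$(( mode & 7 ))          # "other" permission bits
  fi
  if [ $(( bits & 1 )) -ne 0 ]; then echo allowed; else echo denied; fi
)

can_traverse 700 hdfs supergroup hdfs   "hdfs"               # Hive CLI as hdfs
can_traverse 700 hdfs supergroup impala "impala"             # impalad as impala
can_traverse 700 hdfs supergroup impala "impala supergroup"  # after adding to supergroup
can_traverse 701 hdfs supergroup impala "impala"             # after chmod o+x

# The two workarounds from the post, as real commands (run with care):
# usermod -a -G supergroup impala
# hdfs dfs -chmod o+x /user/hdfs
```

Of the two workarounds, granting execute to others on /user/hdfs is the narrower change, since membership in the superuser group bypasses HDFS permission checks everywhere.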