Member since
06-26-2015
31
Posts
6
Kudos Received
4
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 10621 | 05-13-2016 02:45 AM
 | 3311 | 05-06-2016 04:30 PM
 | 1797 | 12-01-2015 02:35 PM
 | 21448 | 09-22-2015 09:29 AM
05-16-2016
09:17 AM
I had the authorized_proxy_user_config=hue=* setting in CM, but for some reason it wasn't being propagated to the impalad configuration. After reading the post you provided, instead of manually adding it to the "advanced snippet", I enabled Sentry authorization on Impala. Now the setting appears on the impalad and impersonation works fine. Thank you for your help, Romain. Ben
05-16-2016
07:22 AM
Ahh, there it is. Thank you very much 🙂
05-13-2016
02:45 AM
2 Kudos
Hi, either in your Hue Oozie workflow editor UI (workflow settings -> Hadoop Properties) or in your workflow.xml:

<workflow-app name="Workflow name" xmlns="uri:oozie:workflow:0.5">
    <global>
        <configuration>
            <property>
                <name>oozie.launcher.yarn.app.mapreduce.am.env</name>
                <value>SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark/</value>
            </property>
        </configuration>
    </global>
    .....
05-12-2016
05:34 PM
For me, I had to add the following Oozie workflow configuration:
oozie.launcher.yarn.app.mapreduce.am.env: SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
Yes, I know, they could have done a better job than this.
05-12-2016
03:09 PM
For me, it was because some of the cloudera-scm-agent processes were not running (for some reason). I started the agents and then ZooKeeper, and everything came back to normal.
05-10-2016
08:01 AM
I added this configuration to the Hue Server:
[impala]
impersonation_enabled=True
Now I get this error:
User 'hue/master1@MYREALM.COM' is not authorized to delegate to 'ben'. User delegation is disabled.
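To summarize the fix that eventually worked (see the 05-16-2016 reply above in this thread): both the Hue side and the Impala side need configuration. A sketch, with values taken from this thread (realm and principal names are from my setup; note that CM only propagated the impalad flag for me once Sentry authorization was enabled on Impala):

```ini
; Hue side (hue.ini)
[impala]
impersonation_enabled=True
```

```
# Impala side: this flag must appear on the impalad command line
--authorized_proxy_user_config=hue=*
```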
05-09-2016
11:04 AM
Yes, you are right, a cron-based schedule does not relate to data-in or data-out. Then how can I use traditional dataset-based or interval-based scheduling on Hue 3.9 (or CDH 5)? Thanks, Ben
05-06-2016
04:34 PM
Of course, if you do this, anyone can change resource pool settings using the Cloudera Manager REST API or the yarn admin command. When you get the update error, you can check which specific user performed the command in the Cloudera Manager server log, but I didn't bother to check it.
05-06-2016
04:30 PM
On Cloudera Manager 5.7 I was seeing the same problem, but luckily I fixed it by adding this to "YARN Service Advanced Configuration Snippet (Safety Valve) for hadoop-policy.xml":
<property>
    <name>security.resourcemanager-administration.protocol.acl</name>
    <value>*</value>
</property>
If you found this helpful, please buy me a beer 😉
05-06-2016
03:28 PM
Hello. Why does Hue run my Impala query as 'hue/master1@MY-REALM' instead of as my username 'ben'? I get this error (even though I'm logged in as the 'ben' user):
Your query has the following error(s): Request from user 'hue/master1@MY-REALM' with requested pool 'it' denied access to assigned pool 'root.it'
Prior to Cloudera 5.7, I think Cloudera had the Llama service and ran Impala queries as the 'llama' user. Now with Cloudera 5.7 I can have Impala without Llama, with its own dynamic resource management, but the problem is that Hue runs queries as the hue/master1... user instead of my username. A similar thing happens with Hive: Hive runs queries as the 'hive' user instead of my username. I find it pretty annoying. Does anyone have a better idea for this? Ben
Labels:
- Apache Hive
- Apache Impala
- Cloudera Hue
- Kerberos
05-04-2016
05:29 PM
I have to ask this question. A few years ago, before the redesign of the Hue interface, I was able to configure dataset, data-in, data-out, and input-events on Oozie coordinators from Hue. Lately, with the redesign, I don't see them anymore; it looks like they have been replaced with a cron field. Why?? Or am I missing something? I'm using Cloudera 5.7 with Hue 3.9. Ben
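For reference, the dataset/input-event style of scheduling I mean looks like this in raw coordinator XML (a sketch only; the app name, paths, and dates are placeholders):

```xml
<coordinator-app name="my-coord" frequency="${coord:days(1)}"
                 start="2016-05-01T00:00Z" end="2016-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
    <datasets>
        <!-- A daily dataset materialized under a date-partitioned HDFS path -->
        <dataset name="input" frequency="${coord:days(1)}"
                 initial-instance="2016-05-01T00:00Z" timezone="UTC">
            <uri-template>hdfs:///data/logs/${YEAR}/${MONTH}/${DAY}</uri-template>
        </dataset>
    </datasets>
    <input-events>
        <!-- The coordinator action waits until today's instance exists -->
        <data-in name="event" dataset="input">
            <instance>${coord:current(0)}</instance>
        </data-in>
    </input-events>
    <action>
        <workflow>
            <app-path>hdfs:///apps/my-workflow</app-path>
        </workflow>
    </action>
</coordinator-app>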
Labels:
- Apache Oozie
- Cloudera Hue
04-26-2016
09:13 AM
What's the error message after "Defective token detected"? Also make sure all of your Cloudera hosts have a correct /etc/krb5.conf file defined.
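A minimal /etc/krb5.conf sketch of the pieces that usually matter here (the realm and hostnames are placeholders; substitute your own):

```
[libdefaults]
    default_realm = MYREALM.COM

[realms]
    MYREALM.COM = {
        kdc = kdc.example.com
        admin_server = kdc.example.com
    }

[domain_realm]
    .example.com = MYREALM.COM
    example.com = MYREALM.COM
```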
12-01-2015
02:35 PM
Got it fixed by adding
proxy_set_header Host $http_host;
proxy_set_header X-Forwarded-For $remote_addr;
to location / { }
Another workaround is to upgrade to Cloudera 5.5 (Hue 3.9).
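For context: the redirect to http://hue/... happens because proxy_pass sends the upstream block's name as the Host header unless you preserve the client's. A sketch of where those directives sit in the nginx config (upstream name and ports follow the Cloudera doc's example; hostnames are placeholders):

```nginx
upstream hue {
    ip_hash;
    server hueserver1:8888;
    server hueserver2:8888;
}

server {
    listen 8888;
    location / {
        proxy_pass http://hue;
        # Without these, Hue builds redirects against the upstream
        # name "hue" instead of the hostname the client requested.
        proxy_set_header Host $http_host;
        proxy_set_header X-Forwarded-For $remote_addr;
    }
}
```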
11-05-2015
03:58 PM
Hi.
I setup Hue load-balancing following instructions on
http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_hag_hue_config.html
So I have nginx running and everything seems to be working fine.
However, when I open a file from the File Browser (http://hueserver:8888/filebrowser/view/filepath),
the page gets redirected to http://hue/filebrowser/view/filepath and shows a "page not found" error.
I think the unexpected "hue" hostname comes from the upstream name in the nginx configuration.
Does anyone have any idea how to fix this?
Thanks
Ben
Tags: hue
Labels:
- Cloudera Hue
- Cloudera Manager
10-07-2015
02:40 PM
Hi!
I'm using CDH 5.4.0 with a Kerberos-secured cluster.
When I run "Refresh Cluster" from Cloudera Manager, I get this message:
Failed to update refreshable configuration files in the cluster
This is the stderr:
+ chown -R : /var/run/cloudera-scm-agent/process/1056-hdfs-DATANODE-refresh
+ acquire_kerberos_tgt hdfs.keytab
+ '[' -z hdfs.keytab ']'
+ '[' -n '' ']'
+ '[' validate-writable-empty-dirs = refresh-datanode ']'
+ '[' file-operation = refresh-datanode ']'
+ '[' bootstrap = refresh-datanode ']'
+ '[' failover = refresh-datanode ']'
+ '[' transition-to-active = refresh-datanode ']'
+ '[' initializeSharedEdits = refresh-datanode ']'
+ '[' initialize-znode = refresh-datanode ']'
+ '[' format-namenode = refresh-datanode ']'
+ '[' monitor-decommission = refresh-datanode ']'
+ '[' jnSyncWait = refresh-datanode ']'
+ '[' nnRpcWait = refresh-datanode ']'
+ '[' -safemode = '' -a get = '' ']'
+ '[' monitor-upgrade = refresh-datanode ']'
+ '[' finalize-upgrade = refresh-datanode ']'
+ '[' rolling-upgrade-prepare = refresh-datanode ']'
+ '[' rolling-upgrade-finalize = refresh-datanode ']'
+ '[' nnDnLiveWait = refresh-datanode ']'
+ '[' refresh-datanode = refresh-datanode ']'
+ '[' 3 -lt 3 ']'
+ DN_ADDR=bda1node02.company.com:50020
+ /opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop-hdfs/bin/hdfs --config /var/run/cloudera-scm-agent/process/1056-hdfs-DATANODE-refresh dfsadmin -reconfig datanode bda1node02.company.com:50020 start
15/10/07 16:27:43 WARN security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
15/10/07 16:27:43 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
15/10/07 16:27:43 WARN security.UserGroupInformation: PriviledgedActionException as:hdfs (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
reconfig: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "bda1node02.company.com/192.168.8.2"; destination host is: "bda1node02.company.com":50020;
+ RET=255
+ '[' 255 -ne 0 ']'
+ echo 'Unable to start reconfigure task on DataNode bda1node02.company.com:50020.'
+ exit 255
Any thoughts?
Thank you
Ben
Labels:
- Cloudera Manager
- Kerberos
09-22-2015
09:29 AM
2 Kudos
To answer my own question: I needed to run this in a Windows cmd prompt:
ksetup /addkdc <REALM> <KDC hostname>
ksetup /addhosttorealmmap <httpFS hostname> <REALM>
and set the SPNEGO settings in the browser.
09-21-2015
02:11 PM
Hi, I got this error while adding a new host to the cluster. We had to restart the process a few times because of a misconfigured hostname and TLS setting on the CM server. Now, after the Host Installation process finishes, CM directs me to the parcel distribution process. However, CM runs the parcel distribution process forever. When I check cloudera-scm-server.log, it is spitting out the following message indefinitely:
2015-09-21 16:05:34,254 WARN 734582441@scm-web-60:com.cloudera.parcel.ClusterParcelStatus: Parcel not distributed but have active state ACTIVATING
I think CM updated the status of the new host to ACTIVATING at some point during a previously failed installation attempt, and now it's confused. How can we revert the host status and continue with the parcel distribution process? Thanks, Ben
09-21-2015
11:16 AM
Hi Harsh, I was thinking of configuring a separate NIS server for user/group mapping, but you came in at the right time to save me 😄 NSCD in fact resolved the issue. NSCD was turned off in chkconfig by default; we switched it on, and now there is no more heavy CPU load on the DC servers. Now I wonder why hadoop.security.groups.cache.secs didn't have an effect from the beginning. We haven't specifically configured it on our system, but isn't it set to 300 seconds by default? Harsh, I know you helped me a couple of times in the past; thank you very much for your support! Ben
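For reference, the property in question lives in core-site.xml, and 300 seconds is indeed the documented default for the Hadoop group-lookup cache:

```xml
<property>
    <name>hadoop.security.groups.cache.secs</name>
    <value>300</value>
</property>
```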
09-18-2015
03:16 PM
Hi, I have set up our Cloudera cluster with Kerberos + AD. Users authenticate through MIT Kerberos to AD, and user/group info is read from AD through ShellBasedUnixGroupsMapping. If we run an MR job with 300 simultaneous mappers (7,000 mappers total), our AD domain controller hits 100% CPU load throughout the job execution. I executed the MR job using Hive after appropriately authenticating with the Kerberos server (kinit). Why would an MR job cause such CPU load? My initial thought was that MR authenticates and gets user/group info once when the job starts or the Hive shell starts; apparently my job is constantly overloading AD resources. Does the job cause a CPU spike each time a mapper gets created, or each time a mapper accesses HDFS? And what would be the best way to resolve this issue? Best regards, Ben
Labels:
- Apache Hive
- Apache YARN
- HDFS
- Kerberos
07-15-2015
02:48 PM
1 Kudo
Hello !
After some work I finished setting up Cloudera + MIT Kerberos + Windows AD.
From a Linux machine, I'm able to run "kinit ben@WIN-REALM" and then access Hadoop or visit the NameNode web admin. Of course, I did configure SPNEGO in the web browser.
However, after logging in to my Windows machine, which authenticates through Windows AD, I can't access the NameNode web admin at http://namenode:50070.
I tried running kinit from CMD, but nothing changes.
This is what I get when I visit the NameNode web admin:
HTTP ERROR 403
Problem accessing /index.html. Reason:
GSSException: Defective token detected (Mechanism level: GSSHeader did not find the right tag)
What extra configuration do I need on Windows to access the Hadoop web admin page?
Thank you!
Ben
Powered by Jetty://
Labels:
- Cloudera Manager
- Kerberos
07-09-2015
07:46 AM
Tara! Thank you very much for your help. Now I understand that the job runs as the hive user but goes to the designated queue. And after following your steps it worked 🙂 Initially I had changed the Placement Rules on the resource pools and did not have the "specified" pool as the first rule. Do I need to replace the local /etc/hive/fsxml/fair-scheduler.xml every time I make changes to the Dynamic Resource Pools? I'm using a CM cluster. Best, Ben
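For anyone else tuning this: the "specified pool first" ordering corresponds to something like the following in fair-scheduler.xml. This is a sketch of Fair Scheduler placement-rule syntax, not our exact file; the queue names are placeholders:

```xml
<queuePlacementPolicy>
    <!-- Honor the queue the job explicitly requests, if any -->
    <rule name="specified" create="false"/>
    <!-- Otherwise fall back to the user's primary-group queue -->
    <rule name="primaryGroup" create="false"/>
    <!-- Final catch-all -->
    <rule name="default" queue="root.default"/>
</queuePlacementPolicy>
```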
07-06-2015
04:04 PM
I tested with both hive and beeline, and running from the command line works as intended: jobs get assigned to the correct user/group queues. Can you explain why it's OK for Hive jobs to be submitted as the 'hive' user? We have four different teams using Cloudera, and it gets difficult to manage resources if all Hive jobs go to the "root.hive" queue. And since the "root.hive" queue has limited resources allocated, most Hive jobs will fail. This is our job history:
application_1436195699910_0031 | hive | INSERT INTO TABLE ...(Stage-1) | MAPREDUCE | root.hive | Mon Jul 6 15:44:38 -0500 2015 | Mon Jul 6 15:45:11 -0500 2015 | FINISHED | SUCCEEDED
application_1436195699910_0030 | ben | oozie:launcher:T=hive2:W=JobName:A=hive2-6df2:ID=0000004-150706101622653-oozie-oozi-W | MAPREDUCE | root.infra | Mon Jul 6 15:44:22 -0500 2015 | Mon Jul 6 15:45:21 -0500 2015 | FINISHED | SUCCEEDED
Other workflow actions such as Sqoop/Pig run in the correct user/group queues. I think this is a problem with our cluster configuration, but please guide us in the right direction 🙂 Thank you for your help. Ben
07-06-2015
06:38 AM
What's the issue-tracking URL for the 5.2.1 release? I can't find it on Google 😞
07-06-2015
06:17 AM
Sorry for the late response, Darren. I'm using CDH 5.4.1. This doesn't happen from the command line: if I'm authenticated as ben in the shell environment, the job gets submitted as ben. In the Hue + Oozie environment, when I submit a workflow job, the Oozie job launcher gets submitted as the authenticated user ben, but the actual Hive job gets submitted as the hive user. Thank you. Ben
06-26-2015
03:35 PM
Hi, I have enabled Sentry to work with HiveServer2 with Kerberos authentication; therefore impersonation on HiveServer2 is turned off. Now all queries run as 'hive' from the Hue Hive UI and from the Oozie Hive action. How does resource management (YARN resource pools) work in this case? I want jobs to go into the right pool, but now all Hive jobs are going into the root.hive pool. The same thing happens with Impala when using Llama: all Impala jobs go into the root.llama pool. Thank you. Ben
Labels: