Member since
03-21-2016
18
Posts
6
Kudos Received
3
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 6386 | 05-26-2017 04:45 PM |
| | 301 | 05-24-2017 10:25 PM |
| | 865 | 09-26-2016 12:44 AM |
07-20-2017
12:23 AM
Strangely, when I modified tez.am.resource.memory.mb under Tez yesterday, the change did not take effect. It does seem to be taking effect now. I appreciate your input.
07-18-2017
10:00 PM
I am trying to track down a strange problem with Hive on Tez, using the sandbox to keep things simple. When I run the Hive CLI, I see a container occupied in YARN by the application for the session, and it is using 500 MB of memory. I am trying to figure out where this number comes from. HADOOP_HEAPSIZE is set to 250 MB. The minimum container size in YARN (yarn.scheduler.minimum-allocation-mb) is 250. The Tez container size in hive-site (hive.tez.container.size) is set to 250. Even after changing hive.tez.container.size, I don't see any change in the memory used by the single YARN application container. I've also tried changing tez.am.resource.memory.mb and tez.dag.am.resource.memory.mb, but nothing worked. Am I missing something? Is there some calculation involving HADOOP_HEAPSIZE that affects this number? I would like to bring down the memory held by idle Hive CLI sessions, since they tie up resources on a small cluster.
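One thing that may be worth checking here, offered as a sketch rather than a diagnosis: with the Capacity Scheduler and its default resource calculator, YARN rounds each memory request up to a multiple of yarn.scheduler.minimum-allocation-mb. The Python below only mimics that arithmetic. The function name normalize_request_mb and the sample request sizes are made up for illustration, and the maximum-allocation cap is left out.

```python
def normalize_request_mb(requested_mb: int, minimum_allocation_mb: int) -> int:
    """Round a memory request up to the next multiple of the minimum allocation.

    Hypothetical helper that imitates the rounding only; it does not talk to YARN.
    """
    if minimum_allocation_mb <= 0:
        raise ValueError("minimum_allocation_mb must be positive")
    # A request is never granted less than one minimum-sized allocation.
    requested_mb = max(requested_mb, minimum_allocation_mb)
    # Integer ceiling division avoids floating-point rounding surprises.
    multiples = (requested_mb + minimum_allocation_mb - 1) // minimum_allocation_mb
    return multiples * minimum_allocation_mb


if __name__ == "__main__":
    # Sample request sizes are illustrative, not values read from a cluster.
    for requested in (250, 256, 500):
        print(requested, "->", normalize_request_mb(requested, 250))
```

Under that assumption, with a minimum of 250, an application master request even slightly above 250 MB (256, for instance) would be granted 500 MB, so it may help to compare the effective AM memory setting against the scheduler minimum.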
05-26-2017
04:56 PM
I believe single quotes should work. Try --conf 'some.config' --conf 'other.config'.
05-26-2017
04:53 PM
@Nikita Kiselev Yes, it is possible to sync the sAMAccountName for the user from AD/LDAP. In the Ranger configuration, make sure that ranger.usersync.ldap.user.nameattribute is set to sAMAccountName instead of CN. If it works, please upvote the answer.
05-26-2017
04:45 PM
4 Kudos
@elliot gimple The correct way to pass multiple configuration options is to specify each one with its own --conf flag. The following should work for your example:
spark-submit --conf spark.hadoop.parquet.enable.summary-metadata=false --conf spark.yarn.maxAppAttempts=1
As always, if you like the answer, please upvote it.
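If the options live in a script rather than on the command line, the same one-flag-per-option rule can be sketched in Python. This is only an illustration: the helper name build_spark_submit_args is made up, and it just assembles the argument list without launching anything.

```python
import shlex


def build_spark_submit_args(conf, app=None):
    """Return an argv list with one --conf flag per key=value pair.

    Hypothetical helper for illustration; it does not run spark-submit.
    """
    args = ["spark-submit"]
    for key, value in conf.items():
        # Each option gets its own --conf flag, as in the post above.
        args += ["--conf", f"{key}={value}"]
    if app is not None:
        args.append(app)
    return args


if __name__ == "__main__":
    conf = {
        "spark.hadoop.parquet.enable.summary-metadata": "false",
        "spark.yarn.maxAppAttempts": "1",
    }
    # shlex.join (Python 3.8+) renders the list as a shell-safe command line.
    print(shlex.join(build_spark_submit_args(conf)))
```

Keeping the arguments as a list also sidesteps most quoting questions if the list is later handed to subprocess.run.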
05-24-2017
10:25 PM
You can use any editor of your choice. Zeppelin is not an option. You have to write the scripts/programs and store them in the requested location.
05-24-2017
10:22 PM
1 Kudo
@Vishal Prakash Shah The Hive Metastore database in PostgreSQL uses upper-case object names. In PostgreSQL, you have to double-quote such upper-case objects to access them. So in the example you provided, you will have to change the query to look something like this:
SELECT * FROM "TBLS";
HTH
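The reason is that PostgreSQL folds unquoted identifiers to lower case, so an unquoted TBLS is looked up as tbls. When the metastore queries are generated from a script, the quoting can be sketched as below. The helper names quote_ident and select_all are made up for this illustration, and the sketch only builds the SQL text without connecting to a database.

```python
def quote_ident(name: str) -> str:
    """Wrap a PostgreSQL identifier in double quotes, doubling embedded quotes.

    Hypothetical helper for illustration only.
    """
    return '"' + name.replace('"', '""') + '"'


def select_all(table: str) -> str:
    """Build a SELECT * statement against a case-sensitive table name."""
    return f"SELECT * FROM {quote_ident(table)};"


if __name__ == "__main__":
    print(select_all("TBLS"))
```

Column names in the metastore schema need the same treatment wherever they are upper case.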
03-23-2017
07:44 PM
That worked like a charm Vineet. Appreciate that tip. It worked for me as well.
03-23-2017
07:00 PM
I have a kerberized cluster which is running Ranger KMS. I am trying to run HAWQ on the cluster, but I am running into issues when HAWQ segments request containers through the YARN Resource Manager. The secure HAWQ installation uses the user "postgres" to request containers. The following is the error message reported by the YARN Resource Manager:
2017-03-23 10:56:30,816 INFO hdfs.DFSClient (DFSClient.java:getDelegationToken(1043)) - Created HDFS_DELEGATION_TOKEN token 20049 for postgres on 192.168.59.104:8020
2017-03-23 10:56:30,889 WARN security.DelegationTokenRenewer (DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(895)) - Unable to add the application to the delegation token renewer.
java.io.IOException: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:1032)
at org.apache.hadoop.crypto.key.KeyProviderDelegationTokenExtension.addDelegationTokens(KeyProviderDelegationTokenExtension.java:110)
at org.apache.hadoop.hdfs.DistributedFileSystem.addDelegationTokens(DistributedFileSystem.java:2298)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$2.run(DelegationTokenRenewer.java:685)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$2.run(DelegationTokenRenewer.java:680)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.obtainSystemTokensForUser(DelegationTokenRenewer.java:679)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.requestNewHdfsDelegationToken(DelegationTokenRenewer.java:643)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:488)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$800(DelegationTokenRenewer.java:77)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:891)
at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:868)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1742)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.addDelegationTokens(KMSClientProvider.java:1014)
... 16 more
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Authentication failed, URL: http://hdp-hdb-200.gagan.com:9292/kms/v1/?op=GETDELEGATIONTOKEN&renewer=rm%2Fhdp-hdb-200.gagan.com%40gagan.com&doAs=postgres&user.name=yarn, status: 403, message: Forbidden
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.extractToken(AuthenticatedURL.java:278)
at org.apache.hadoop.security.authentication.client.PseudoAuthenticator.authenticate(PseudoAuthenticator.java:77)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:132)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:212)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:132)
at org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:216)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.doDelegationTokenOperation(DelegationTokenAuthenticator.java:298)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.getDelegationToken(DelegationTokenAuthenticator.java:170)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.getDelegationToken(DelegationTokenAuthenticatedURL.java:371)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider$4.run(KMSClientProvider.java:1019)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider$4.run(KMSClientProvider.java:1014)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
... 17 more
I see the following in the Ranger KMS log (kms.log):
2017-03-23 11:02:00,734 DEBUG LimitLatch - Counting up[http-bio-9292-Acceptor-0] latch=7
2017-03-23 11:02:00,738 DEBUG CoyoteAdapter - The variable [uriBC] has value [/kms/v1/]
2017-03-23 11:02:00,738 DEBUG CoyoteAdapter - The variable [semicolon] has value [-1]
2017-03-23 11:02:00,738 DEBUG CoyoteAdapter - The variable [enc] has value [ISO-8859-1]
2017-03-23 11:02:00,738 DEBUG AuthenticatorBase - Security checking request OPTIONS /kms/v1/
2017-03-23 11:02:00,738 DEBUG RealmBase - No applicable constraints defined
2017-03-23 11:02:00,738 DEBUG AuthenticatorBase - Not subject to any constraint
2017-03-23 11:02:00,738 TRACE StandardWrapper - Returning non-STM instance
2017-03-23 11:02:00,739 DEBUG Http11Protocol - Socket: [org.apache.tomcat.util.net.SocketWrapper@24800623:Socket[addr=/192.168.59.104,port=58547,localport=9292]], Status in: [OPEN_READ], State out: [OPEN]
2017-03-23 11:02:00,758 DEBUG Http11Processor - Error parsing HTTP request header
java.io.EOFException: Unexpected EOF read on the socket
at org.apache.coyote.http11.Http11Processor.setRequestLineReadTimeout(Http11Processor.java:169)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:990)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:625)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:318)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:745)
2017-03-23 11:02:00,758 DEBUG Http11Protocol - Socket: [org.apache.tomcat.util.net.SocketWrapper@24800623:Socket[addr=/192.168.59.104,port=58547,localport=9292]], Status in: [OPEN_READ], State out: [CLOSED]
2017-03-23 11:02:00,758 TRACE JIoEndpoint - Closing socket:org.apache.tomcat.util.net.SocketWrapper@24800623:Socket[addr=/192.168.59.104,port=58547,localport=9292]
The following is from the Ranger KMS access log:
192.168.59.104 - - [23/Mar/2017:11:02:00 -0700] "OPTIONS /kms/v1/?op=GETDELEGATIONTOKEN&renewer=rm%2Fhdp-hdb-200.gagan.com%40gagan.com&doAs=postgres HTTP/1.1" 401 997
192.168.59.104 - - [23/Mar/2017:11:02:00 -0700] "OPTIONS /kms/v1/?op=GETDELEGATIONTOKEN&renewer=rm%2Fhdp-hdb-200.gagan.com%40gagan.com&doAs=postgres HTTP/1.1" 403 258
192.168.59.104 - - [23/Mar/2017:11:02:00 -0700] "OPTIONS /kms/v1/?op=GETDELEGATIONTOKEN&renewer=rm%2Fhdp-hdb-200.gagan.com%40gagan.com&doAs=postgres&user.name=yarn HTTP/1.1" 401 997
192.168.59.104 - - [23/Mar/2017:11:02:00 -0700] "OPTIONS /kms/v1/?op=GETDELEGATIONTOKEN&renewer=rm%2Fhdp-hdb-200.gagan.com%40gagan.com&doAs=postgres&user.name=yarn HTTP/1.1" 403 258
The following is from the Ranger KMS audit log (kms-audit.log):
2017-03-23 11:02:00,738 UNAUTHENTICATED RemoteHost:192.168.59.104 Method:OPTIONS URL:http://hdp-hdb-200.gagan.com:9292/kms/v1/?op=GETDELEGATIONTOKEN&renewer=rm%2Fhdp-hdb-200.gagan.com%40gagan.com&doAs=postgres ErrorMsg:'Authentication required'
2017-03-23 11:02:00,786 UNAUTHENTICATED RemoteHost:192.168.59.104 Method:OPTIONS URL:http://hdp-hdb-200.gagan.com:9292/kms/v1/?op=GETDELEGATIONTOKEN&renewer=rm%2Fhdp-hdb-200.gagan.com%40gagan.com&doAs=postgres&user.name=yarn ErrorMsg:'Authentication required'
I have added the following proxyuser configuration in Ranger KMS as well:
hadoop.kms.proxyuser.postgres.users=*
hadoop.kms.proxyuser.postgres.hosts=*
hadoop.kms.proxyuser.yarn.users=*
hadoop.kms.proxyuser.yarn.hosts=*
The core-site.xml has the required proxyuser configuration as well:
hadoop.proxyuser.postgres.groups=*
hadoop.proxyuser.postgres.hosts=*
hadoop.proxyuser.yarn.groups=*
hadoop.proxyuser.yarn.hosts=*
But nothing seems to be working in this case.
02-17-2017
05:16 PM
Hi Piyush, additionally, you can try the following command:
hawq stop segment -M fast
If that does not work, stopping with the "immediate" option should kill the process for you:
hawq stop segment -M immediate
If you do not provide a mode with -M, the stop/shutdown uses "smart" mode, which does not kill anything while there is an active connection or query.
02-17-2017
03:00 PM
Can you try to manually run the following commands on the HAWQ Standby Master node? Please run them as the user 'gpadmin':
source /usr/local/hawq/greenplum_path.sh
hawq init standby -a
09-26-2016
12:44 AM
This turned out to be caused by something else in HAWQ 2.0.0. Because libhdfs3 and libyarn use the same Kerberos keyfile, the application failed whenever there was no activity through libyarn for a long time, which in turn meant that the login() function wasn't called. The problem is documented in https://issues.apache.org/jira/browse/HAWQ-940 and has been identified and addressed. Also, there was nothing in the Resource Manager logs except for the container release logs.
09-21-2016
05:25 PM
1 Kudo
Hi Community, I am facing a strange issue with a secure Hadoop setup. The challenge is the Kerberos ticket_lifetime, which is 10 hours. This differs from the usual default of 24h. Please note that the ticket_lifetime limit cannot be changed right now. The job runs on YARN, and every 10 hours YARN removes the application. I believe this is related to the delegation token not getting renewed in a timely manner, but I don't see any reliable logs to confirm it. My questions are: Does this look related to the Kerberos ticket lifetime? Is there a delegation token renewal for YARN, or is it just for HDFS? Should I try to modify the value of dfs.namenode.delegation.token.renew-interval? Right now this value is set to 24h (86400000).
09-20-2016
03:51 AM
@Predrag Minovic You need to do a couple of things. First, grant the repo user permissions in the Ranger policy, which you already did. Second, make sure that the repo user has its HDFS user directory set up with the right permissions and ownership.
06-30-2016
08:53 AM
Out of curiosity, are the nodes (that you are trying to add) able to communicate back with the Ambari server? Have you verified the DNS resolution (with the DNS name found in /etc/ambari-agent/conf/ambari-agent.ini file)? Are the required ports allowed on the Ambari server side (8440)?
06-30-2016
08:47 AM
Do you have any NFS share mounted on the node? I believe the port might be occupied by an RPC bind service. If there is an NFS share mounted on the node, try unmounting it and then starting the NameNode.
06-30-2016
08:42 AM
I think this should work for you if you are using PySpark, assuming hive_context is your HiveContext instance:
VAL1 = 'SOME_STRING'
df = hive_context.sql("SELECT * FROM src WHERE col1 = '%s'" % VAL1)
03-22-2016
12:59 AM
I would recommend changing the "desired_state" value in the "servicecomponentdesiredstate" table of the Ambari database. You can update the rows where service_name = 'RANGER_KMS'.