Member since: 09-22-2016
Posts: 33
Kudos Received: 3
Solutions: 3
My Accepted Solutions
Title | Views | Posted
--- | --- | ---
 | 3774 | 04-19-2017 12:19 PM
 | 281 | 02-22-2017 05:37 PM
 | 4188 | 02-21-2017 02:25 PM
10-24-2017
08:52 PM
I have an issue in our environment with AD groups via usersync: we are planning to sync Ranger users with AD. Here is the problem:

AD group name: cfyG_GG-HDP_HadoopAdmins
SSSD-mapped group on the Linux machine: hadoopadmin

This command yields:

$ hdfs groups hdpadmin
hdpadmin : hdpadmin hadoopadmin hadoopdev hadoopusers

Now the problem is that I can save the AD group in lowercase in Ranger as cfyg_gg-hdp_hadoopadmins, but if I use this group to grant permissions it won't work, since the Linux group name is hadoopadmin, as mapped in SSSD. How can I overcome this issue? Any help is appreciated. Suri
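One approach worth considering (a sketch, not a confirmed fix for this environment) is to point Ranger usersync at the local OS instead of AD, so Ranger picks up the same group names that SSSD exposes to Linux and HDFS (hadoopadmin). The relevant ranger-ugsync-site.xml properties would look roughly like this; the sync interval is an example value:

<!-- ranger-ugsync-site.xml: read users/groups from the OS (i.e. via SSSD/NSS) -->
<property>
  <name>ranger.usersync.source.impl.class</name>
  <value>org.apache.ranger.unixusersync.process.UnixUserGroupBuilder</value>
</property>
<!-- example: re-sync every 60 seconds -->
<property>
  <name>ranger.usersync.sleeptimeinmillisbetweensynccycle</name>
  <value>60000</value>
</property>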
10-24-2017
08:44 PM
I have a similar issue in our environment: we are planning to sync Ranger users with AD. Here is the problem:

AD group name: cfyG_GG-HDP_HadoopAdmins
SSSD-mapped group on the Linux machine: hadoopadmin

This command yields:

$ hdfs groups hdpadmin
hdpadmin : hdpadmin hadoopadmin hadoopdev hadoopusers

Now the problem is that I can save the AD group in lowercase in Ranger as cfyg_gg-hdp_hadoopadmins, but if I use this group to grant permissions it won't work, since the Linux group name is hadoopadmin, as mapped in SSSD. How can I overcome this issue? Any help is appreciated. Suri
08-31-2017
09:23 PM
Hi @Wynner, thanks for the reply. Yes, I have set nifi.security.user.login.identity.provider to ldap-provider. My login-identity-providers.xml is:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<loginIdentityProviders>
<provider>
<identifier>ldap-provider</identifier>
<class>org.apache.nifi.ldap.LdapProvider</class>
<property name="Authentication Strategy">SIMPLE</property>
<property name="Manager DN">CN=Administrator,CN=Users,DC=LABHADOOP,DC=COMPANY,DC=COM</property>
<property name="Manager Password">COMPANY2017</property>
<property name="Referral Strategy">FOLLOW</property>
<property name="Connect Timeout">10 secs</property>
<property name="Read Timeout">10 secs</property>
<property name="Url">ldap://xx.xx.xx.xx:389</property>
<property name="User Search Base">CN=Users,DC=LABHADOOP,DC=COMPANY,DC=COM</property>
<property name="User Search Filter">sAMAccountName={0}</property>
<property name="Identity Strategy">USE_USERNAME</property>
<property name="Authentication Expiration">12 hours</property>
</provider>
</loginIdentityProviders>

But for some reason it is not prompting for the login page, and there are no errors in the logs. Thanks, Suri
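One thing worth checking: NiFi only presents the login page when the UI runs over HTTPS; if only the HTTP port is configured, it falls back to anonymous access. A minimal sketch of the nifi.properties entries involved; the host, port, keystore paths, and passwords below are placeholders, not values from this environment:

# nifi.properties (sketch)
nifi.web.http.port=
nifi.web.https.host=0.0.0.0
nifi.web.https.port=9443
nifi.security.keystore=/opt/nifi/conf/keystore.jks
nifi.security.keystoreType=JKS
nifi.security.keystorePasswd=changeit
nifi.security.truststore=/opt/nifi/conf/truststore.jks
nifi.security.truststoreType=JKS
nifi.security.truststorePasswd=changeit
nifi.security.user.login.identity.provider=ldap-provider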
08-31-2017
09:10 PM
I have set up NiFi to use LDAP (AD) with no configuration issues. But it is not loading the login page; instead it logs in anonymously, as it did before LDAP. I did not see any issues in the log. Can someone help me fix it? Thanks, Suri
06-14-2017
11:03 AM
Does Hadoop (CDH and Kafka) support IPv6? Thanks, Suri
04-19-2017
12:19 PM
This command needs to be run as the kafka user.
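For reference, a minimal sketch of running the Sentry shell as the kafka service user; the keytab path and principal are placeholders for whatever this environment uses:

# get a Kerberos ticket for the kafka principal, then run the shell as kafka
sudo -u kafka kinit -kt /path/to/kafka.keytab kafka/host.example.com@EXAMPLE.COM
sudo -u kafka kafka-sentry -lr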
04-18-2017
11:45 AM
I am trying to integrate Kafka (2.1.x) with Sentry (CDH 5.0.1). When I run the "kafka-sentry -lr" command, I get the following errors. Any idea what could be wrong here? Note: we have enabled SSL and Kerberos, and both are working fine.

#kafka-sentry -lr
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-2.1.1-1.2.1.1.p0.18/lib/kafka/libs/slf4j-log4j12-1.7.21.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-2.1.1-1.2.1.1.p0.18/lib/kafka/libs/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/04/18 13:33:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/04/18 13:33:41 WARN security.UserGroupInformation: PriviledgedActionException as:user@RALM.COM (auth:KERBEROS) cause:org.apache.thrift.transport.TTransportException: Peer indicated failure: Problem with callback handler
17/04/18 13:33:41 ERROR tools.SentryShellKafka: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711)
at org.apache.sentry.provider.db.generic.service.thrift.SentryGenericServiceClientDefaultImpl$UgiSaslClientTransport.open(SentryGenericServiceClientDefaultImpl.java:99)
at org.apache.sentry.provider.db.generic.service.thrift.SentryGenericServiceClientDefaultImpl.<init>(SentryGenericServiceClientDefaultImpl.java:155)
at org.apache.sentry.provider.db.generic.service.thrift.SentryGenericServiceClientFactory.create(SentryGenericServiceClientFactory.java:31)
at org.apache.sentry.provider.db.generic.tools.SentryShellKafka.run(SentryShellKafka.java:51)
at org.apache.sentry.provider.db.tools.SentryShellCommon.executeShell(SentryShellCommon.java:241)
at org.apache.sentry.provider.db.generic.tools.SentryShellKafka.main(SentryShellKafka.java:96)
Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: Problem with callback handler
at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:199)
at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:307)
at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at org.apache.sentry.provider.db.generic.service.thrift.SentryGenericServiceClientDefaultImpl$UgiSaslClientTransport.baseOpen(SentryGenericServiceClientDefaultImpl.java:115)
at org.apache.sentry.provider.db.generic.service.thrift.SentryGenericServiceClientDefaultImpl$UgiSaslClientTransport.access$000(SentryGenericServiceClientDefaultImpl.java:71)
at org.apache.sentry.provider.db.generic.service.thrift.SentryGenericServiceClientDefaultImpl$UgiSaslClientTransport$1.run(SentryGenericServiceClientDefaultImpl.java:101)
at org.apache.sentry.provider.db.generic.service.thrift.SentryGenericServiceClientDefaultImpl$UgiSaslClientTransport$1.run(SentryGenericServiceClientDefaultImpl.java:99)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 6 more
The operation failed. Message: Peer indicated failure: Problem with callback handler
02-22-2017
05:37 PM
You can achieve this by setting an appropriate value for yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds in yarn-site.xml. YARN will then aggregate the logs for running jobs too. See https://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml Suri
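A sketch of the corresponding yarn-site.xml entry; the one-hour interval is only an example (the default of -1 disables rolling aggregation, and very small values may be raised to a minimum by the NodeManager):

<!-- yarn-site.xml: upload logs of still-running applications every hour -->
<property>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>3600</value>
</property>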
02-22-2017
05:35 PM
I would like to index and search YARN aggregated application logs using Solr. Since the aggregated files are stored in TFile format, Solr is not able to read them for indexing. Is there a way to index these YARN aggregated log files? Any help would be appreciated. Suri
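One workaround is to let the yarn CLI decode the TFile format into plain text, which an indexer can then consume; a sketch, with a placeholder application ID and output path:

# dump the aggregated logs of one application as plain text
yarn logs -applicationId application_1487000000000_0001 > /tmp/app.log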
02-22-2017
09:29 AM
I would like to index and search YARN aggregated application logs using Solr (Cloudera Search). Since the aggregated files are stored in TFile format, Solr is not able to read them for indexing. Is there a way to index these YARN aggregated log files? Any help would be appreciated. Suri
02-22-2017
07:22 AM
I have Spark Streaming jobs running on the cluster. When I want to see the container logs, they are too slow to load in the ResourceManager WebUI. Is there an optimum file size to configure, and is there a better way to look at the streaming job logs than the ResourceManager WebUI? Suri
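One way to keep individual container logs small enough to load quickly is Spark's executor log rolling. A sketch of the relevant spark-defaults.conf properties; the 128 MB size and the retention count are example values only:

# roll executor logs by size and keep a bounded number of files per executor
spark.executor.logs.rolling.strategy size
spark.executor.logs.rolling.maxSize 134217728
spark.executor.logs.rolling.maxRetainedFiles 5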
02-21-2017
04:35 PM
Hi, we need to find a way to maintain and search logs for long-running Spark streaming jobs on YARN. We have log aggregation disabled in our cluster. We are thinking about Solr/Elasticsearch, and maybe Flume or Kafka, to read the Spark job logs. Any suggestions on how to implement search on these logs and manage them easily? Thanks, Suri
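To make the proposed pipeline concrete, below is a hypothetical Flume agent that tails container log files and ships lines to a Kafka topic, from which an indexer could feed Solr or Elasticsearch. The log path, topic name, and broker list are assumptions, not values from this cluster:

# flume.conf (sketch)
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# TAILDIR tails files matching a filename regex and remembers its position
a1.sources.r1.type = TAILDIR
a1.sources.r1.filegroups = f1
a1.sources.r1.filegroups.f1 = /var/log/spark-streaming/.*\.log
a1.sources.r1.positionFile = /var/lib/flume/taildir_position.json
a1.sources.r1.channels = c1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = spark-container-logs
a1.sinks.k1.kafka.bootstrap.servers = broker1:9092,broker2:9092
a1.sinks.k1.channel = c1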
02-21-2017
02:25 PM
You can achieve this by setting an appropriate value for yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds in yarn-site.xml. YARN will then aggregate the logs for running jobs too. See https://hadoop.apache.org/docs/r2.6.0/hadoop-yarn/hadoop-yarn-common/yarn-default.xml Suri
02-21-2017
02:17 PM
Thank you, I will try it out.
02-21-2017
01:53 PM
@mbigelow but other sources say: "set yarn.log-aggregation.retain-check-interval-seconds to specify how often the log retention check should be run. By default, it is one-tenth of the log retention time." What I understood from this was that it only checks for retention and may not aggregate the logs on that interval. Did I understand that correctly? Suri
02-21-2017
01:49 PM
The documentation for YARN log aggregation says that logs are aggregated after an application completes. Does this rule out YARN log aggregation for Spark streaming jobs, given that streaming jobs run for a much longer duration and potentially never terminate? I want to get the Spark Streaming logs into HDFS before the job completes, since streaming jobs run forever. Is there a good way to get Spark log data into HDFS? Suri
02-21-2017
01:46 PM
Thanks, @mbigelow. So, if I set yarn.log-aggregation.retain-check-interval-seconds to 60 seconds, will it send the logs to HDFS (every 60 seconds) even when the job has not finished? (Since streaming jobs run forever.) Suri
02-21-2017
01:15 PM
The documentation for YARN log aggregation says that logs are aggregated after an application completes. Streaming jobs run for a much longer duration and potentially never terminate. I want to get the logs into HDFS for my streaming jobs before the application completes or terminates. What are the better ways to do this, since log aggregation only happens after the jobs are completed? Suri
02-21-2017
08:33 AM
We want to search for key phrases, and at the same time we want developers to be able to look into the raw logs for troubleshooting and to set alerts for specific errors.
02-21-2017
07:26 AM
@mbigelow You are right. We turned it off because of the long-running jobs. Do you know any other ways to implement log search, other than Solr/Elasticsearch? Suri
02-20-2017
04:53 PM
Hi, we need to find a way to maintain and search logs for long-running Spark streaming jobs on YARN. We have log aggregation disabled in our cluster. We are thinking about Solr/Elasticsearch, and maybe Flume or Kafka, to read the Spark job logs. Any suggestions on how to implement search on these logs and manage them easily? Thanks, Suri
12-17-2016
11:10 AM
1 Kudo
Hi Damion, can you tell us how you were able to solve this issue? Thanks, Suri
11-22-2016
05:38 PM
1 Kudo
As of now, HDFS does not create home directories automatically with AD integration. One way to create home directories automatically is with Hue: it has an option to create the home directory when setting up users to use Hue. If the home directory is already present, Hue will skip it. Thanks, Suri
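If Hue is not an option, a minimal sketch of creating a home directory by hand as the HDFS superuser; the username jdoe is a placeholder:

# create the user's HDFS home directory and hand over ownership
sudo -u hdfs hdfs dfs -mkdir -p /user/jdoe
sudo -u hdfs hdfs dfs -chown jdoe:jdoe /user/jdoe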
11-14-2016
03:14 AM
Thank you, srai.
11-09-2016
07:41 PM
What are the different measures and best practices for capacity planning my Hadoop cluster? We are planning to have large amounts of data coming in, so we would like to follow best practices in capacity planning and hardware to grow the cluster. Please advise on this. Thanks, Suri
11-09-2016
12:37 PM
Why do we need to configure encrypted client/server communication using TLS/SSL for HiveServer2? What are the use cases for this scenario? Also, please explain how to configure it without any issues. I already found some information here: http://www.cloudera.com/documentation/enterprise/latest/topics/sg_hive_encryption.html#concept_tp1_whc_dr Thanks, Suri
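For context, the core of the server-side setup in that documentation is a handful of hive-site.xml properties plus a keystore; a sketch only, with placeholder path and password:

<!-- hive-site.xml: enable TLS on the HiveServer2 endpoint -->
<property>
  <name>hive.server2.use.SSL</name>
  <value>true</value>
</property>
<property>
  <name>hive.server2.keystore.path</name>
  <value>/opt/hive/conf/hs2-keystore.jks</value>
</property>
<property>
  <name>hive.server2.keystore.password</name>
  <value>changeit</value>
</property>

Clients then connect with an ssl=true JDBC URL, e.g. jdbc:hive2://host:10000/default;ssl=true;sslTrustStore=/path/to/truststore.jks;trustStorePassword=changeit (paths again placeholders).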
11-09-2016
11:37 AM
I am not sure the board I selected for this topic is right, but my question is regarding capacity planning for clusters. What are the different measures and best practices for capacity planning my Hadoop cluster? We are planning to have large amounts of data coming in, so we would like to follow best practices in capacity planning and hardware to grow the cluster. Please advise on this. Thanks, Suri
10-02-2016
03:33 PM
Timothy, thank you for your response. But I am also looking for the best ways to replicate HDFS using DistCp. Suri
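For reference, a minimal DistCp invocation between two clusters; the NameNode hosts and paths are placeholders:

# -update copies only new/changed files; -delete removes files from the
# target that no longer exist on the source
hadoop distcp -update -delete hdfs://nn1:8020/data hdfs://nn2:8020/data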