Member since: 07-17-2017
Posts: 23
Kudos Received: 1
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 319 | 04-30-2020 12:06 PM
 | 436 | 11-06-2019 12:32 PM
 | 531 | 08-22-2019 09:13 AM
 | 334 | 08-22-2019 08:39 AM
10-08-2020
06:59 AM
Sofiane, I apologize, but sadly I don't have notes from that point anymore. If I come across anything, I'll definitely update this page. Sorry!
04-30-2020
12:06 PM
So after a few days of struggling, I was able to launch a trial CDP-DC cluster on my server and explore it myself. I installed the runtime along with most of the CDF parcels. It looks as though the two services mentioned above (Zeppelin and Druid) are not currently available on CDP-DC. If anyone has any information on how we could add these services, or any timelines for when they will be added, I'd love to know, especially for Druid!
04-24-2020
08:35 AM
I'm trying to discern what services will be available to us when we upgrade from CDH 5.16.2 to CDP-DC with CDF. From what I can tell, we should have access to any components that are part of CDF, as well as anything in the Cloudera Runtime. The thing is, I'm seeing some discrepancies.
In the maven artifacts for Cloudera Runtime 7.0.3, it mentions a lot of services, including Druid and Zeppelin: https://docs.cloudera.com/cdpdc/7.0/release-guide/topics/cdpdc-runtime-maven-703.html
But in the component versions list of Cloudera Runtime 7.0.3, neither Druid nor Zeppelin is included: https://docs.cloudera.com/runtime/7.0.3/release-notes/topics/rt-runtime-component-versions.html
Additionally, in the documentation for CDF, I don't see Druid listed, even though in the past HDF has had Druid included:
https://www.cloudera.com/content/dam/www/marketing/images/diagrams/cdf-diagram.png
Can someone verify for me whether Druid and/or Zeppelin is available for use by users of CDP-DC with CDF? Additionally, how should I understand the discrepancies between the first two links above?
Thanks!
11-21-2019
10:38 AM
I have an HDP 2.6.1 cluster where we've had yarn.log-aggregation.retain-seconds set to 30 days for a while, and everything was working properly. Four days ago we changed the property to 15 days and restarted the services. The check interval is set to the default, so we expected that within 1.5 days we'd see the logs older than 15 days deleted. For some reason, we are still seeing 30 days of logs kept. The other properties all seem to be set properly.
The only unusual setting I can find is that we are using LogAggregationIndexedFileController as our primary file controller class, with LogAggregationTFileController still available as the second in the list. I found YARN-8279 (https://issues.apache.org/jira/browse/YARN-8279), which seems somewhat related, except that logs are still being put into the right suffix folder, and logs older than 30 days are still being deleted; the cutoff just doesn't seem to have updated to 15 days.
I've looked in the logs for the Resource Manager, Timeline Server, and one of the Name Nodes, and nothing that would explain this has popped up. Any ideas where to go to figure out what is happening? Additionally, can someone confirm in which process the deletion service actually runs? Is it the Resource Manager, the Timeline Server, or something else?
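For clarity, this is roughly the kind of check I'm using to confirm 30 days of logs are still present (the remote app-log path and suffix below are placeholders; they depend on yarn.nodemanager.remote-app-log-dir and the file controller's suffix):
# list the aggregated app-log directories and show the oldest ones still present
hdfs dfs -ls /app-logs/*/logs-ifile/ 2>/dev/null | sort -k6,7 | head -20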
11-06-2019
12:32 PM
After looking into this some more, we found the error trace below the first time a paragraph was run after the interpreter was restarted. This didn't show up originally, since the earlier log only covered running a paragraph, not necessarily one run right after an interpreter restart. As you can see, at the end there is an exception about a class not being accessible. Once we made sure the WANdisco class was accessible to the interpreter on the classpath, everything started to work properly.
2019-11-06 10:24:48,850 ERROR [pool-2-thread-2] PhoenixInterpreter:108 - Cannot open connection
java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:386)
at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:288)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.access$300(ConnectionQueryServicesImpl.java:171)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1881)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1860)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1860)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:162)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:131)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:133)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at org.apache.zeppelin.phoenix.PhoenixInterpreter.open(PhoenixInterpreter.java:99)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:493)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:410)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:319)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
at org.apache.phoenix.query.HConnectionFactory$HConnectionFactoryImpl.createConnection(HConnectionFactory.java:47)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:286)
... 22 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
... 27 more
Caused by: java.lang.NoClassDefFoundError: com/wandisco/shadow/com/google/protobuf/InvalidProtocolBufferException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1844)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1809)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1903)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2573)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2586)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2625)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2607)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
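In case it helps anyone else, here is a rough way to confirm that the shaded WANdisco class named in the trace is visible to the interpreter (the jar directory below is just a placeholder for wherever your Phoenix interpreter loads its jars from):
# scan the interpreter's jars for the shaded protobuf class named in the trace
for j in /path/to/zeppelin/interpreter/phoenix/*.jar; do
  unzip -l "$j" 2>/dev/null | grep -q 'com/wandisco/shadow/com/google/protobuf/InvalidProtocolBufferException' && echo "found in $j"
done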
11-05-2019
09:33 AM
I've verified that I can access Phoenix through sqlline.py and psql.py using the configuration in /etc/ams-hbase/conf, and run queries as the activity-explorer user that I'm trying to run through Zeppelin. One thing of note with all this: we've changed the ZNode parent from ams-hbase-secure1 to ams-hbase-secure2. I've verified that the value in /etc/ams-hbase/conf/hbase-site.xml holds the new value, but the value in /etc/ams-metrics-collector/conf/hbase-site.xml is the old value and hasn't been updated recently. activity-env.sh points to /etc/ams-hbase/conf, so I believe this shouldn't be an issue, but it was a bit confusing when I first came across it.
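For reference, this is roughly the comparison I did between the two config directories (zookeeper.znode.parent is the standard HBase property that holds the ZNode parent):
# compare the ZNode parent each config directory points at
grep -A1 'zookeeper.znode.parent' /etc/ams-hbase/conf/hbase-site.xml
grep -A1 'zookeeper.znode.parent' /etc/ams-metrics-collector/conf/hbase-site.xml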
11-05-2019
08:42 AM
I just noticed that the SmartSense Activity Explorer Zeppelin notebooks have been failing to run on a production HDP 2.6.1 cluster. I'm not sure how long the issue has been occurring, since the dashboards haven't been used much until now. Whenever we try to run the paragraphs, we immediately get an error about being unable to establish a connection; no other information is given. We are able to connect to Phoenix through psql.py, so we know Phoenix itself is working properly, just not the dashboard. We've tried restarting the Activity Explorer, which hasn't fixed the issue. Has anyone seen this issue? Any ideas? I'm including the logs we are seeing below.
==> activity-explorer.log <==
2019-11-05 10:34:42,555 INFO [qtp1209702763-1653] NotebookServer:711 - New operation from 10.142.131.4 : 62057 : admin : GET_NOTE : 2BPD7951H
2019-11-05 10:34:42,558 WARN [qtp1209702763-1653] VFSNotebookRepo:292 - Get Note revisions feature isn't supported in class org.apache.zeppelin.notebook.repo.VFSNotebookRepo
2019-11-05 10:34:45,886 INFO [pool-2-thread-31] SchedulerFactory:131 - Job paragraph_1490380022011_880344082 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpretershared_session255064451
2019-11-05 10:34:45,887 INFO [pool-2-thread-31] Paragraph:366 - run paragraph 20160728-152731_1797959357 using null org.apache.zeppelin.interpreter.LazyOpenInterpreter@4a66e7be
==> zeppelin-interpreter-phoenix-phoenix--<HOSTNAME> <==
2019-11-05 10:34:45,889 INFO [pool-2-thread-4] SchedulerFactory:131 - Job remoteInterpretJob_1572971685889 started by scheduler org.apache.zeppelin.phoenix.PhoenixInterpreter717591913
2019-11-05 10:34:45,889 INFO [pool-2-thread-4] PhoenixInterpreter:192 - Run SQL command 'SELECT file_size_category as "Size category",
total_files as "Total files",
avg_file_size as "Avg file size"
FROM (
SELECT CASE WHEN file_size_range_end <= 10000 THEN 'Tiny (0-10K)'
WHEN file_size_range_end <= 1000000 THEN 'Mini (10K-1M)'
WHEN file_size_range_end <= 30000000 THEN 'Small (1M-30M)'
WHEN file_size_range_end <= 128000000 THEN 'Medium (30M-128M)'
ELSE 'Large (128M+)'
END as file_size_category,
sum(file_count) as total_files,
(sum(total_size) / sum(file_count)) as avg_file_size
FROM ACTIVITY.HDFS_USER_FILE_SUMMARY
WHERE analysis_date in ( SELECT MAX(analysis_date)
FROM ACTIVITY.HDFS_USER_FILE_SUMMARY)
GROUP BY file_size_category
)'
2019-11-05 10:34:45,890 INFO [pool-2-thread-4] SchedulerFactory:137 - Job remoteInterpretJob_1572971685889 finished by scheduler org.apache.zeppelin.phoenix.PhoenixInterpreter717591913
==> activity-explorer.log <==
2019-11-05 10:34:45,891 WARN [pool-2-thread-31] NotebookServer:2067 - Job 20160728-152731_1797959357 is finished, status: ERROR, exception: null, result: %text ERROR 103 (08004): Unable to establish connection.
2019-11-05 10:34:45,909 INFO [pool-2-thread-31] SchedulerFactory:137 - Job paragraph_1490380022011_880344082 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpretershared_session255064451
08-22-2019
09:16 AM
So the exception you have listed indicates that the JDBC driver was installed properly, since the call stack includes code in the MySQL driver. I would check that the hostname and port you supplied are correct, and that you can access that node from the node you are working on. The commands below would be a good starting point.
ping msl-dpe-perf80-100g.msl.lab
telnet msl-dpe-perf80-100g.msl.lab 3306
08-22-2019
09:13 AM
So EOL just means that Python 2 will no longer be supported; it doesn't mean that anything using Python 2 will stop working. Any applications that currently use Python 2.7 will still work, but any bugs in Python 2.7 won't be fixed. Additionally, pip will not work with Python 2.7 after version 19.1. As long as the version of pip you have installed is below that, you'll still be able to use Python 2.7 for a while, until the repository itself is taken down or reconfigured in a way that breaks old pip. See https://stackoverflow.com/questions/54915381/will-pip-work-for-python-2-7-after-its-end-of-life-on-1st-jan-2020 for more details on that. I am not sure of the plans for HDP/CDH/CDP support of Python 2.7 going forward, so that's a question for the dev team directly.
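If it helps, a quick way to check where a node stands is something like the following (this assumes python2/pip2 are the commands on your machines; the version pin is just an illustration of keeping pip at a release that still handles 2.7):
# show the interpreter and pip versions currently in use
python2 -V
pip2 --version
# optionally pin pip to an older release so it keeps working with 2.7
pip2 install --upgrade 'pip<19.2'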
08-22-2019
08:39 AM
1. I'm not sure I understand what you mean by communicate. When SSSD is first started, it will sync all of the users and groups in AD to the local node, so any existing users will be able to log in and will have the correct groups ready for them (assuming the configuration is set up properly).
2. Rolling back SSSD is possible but troublesome. It would consist of stopping the service and uninstalling it from the node. I'm not sure whether the synced users and groups would still be on the node; if they are, you would need to remove those as well. There may be some other pieces left around, but none that I would expect to cause any differences, unless you were to try to install SSSD again.
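For the first point, a quick sanity check on a node would be something like the following (the user and group names are placeholders for real AD entries):
# confirm SSSD is resolving AD identities on the node
id some_ad_user
getent passwd some_ad_user
getent group some_ad_group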
06-11-2019
06:25 AM
I am currently implementing an HDP 3.1/HDF 3.3 cluster, secured using an MIT KDC and an OpenLDAP server. At one point I had the ability to access NiFi through the Knox proxy, but after adding encryption everywhere, I no longer can. I can log into NiFi with my LDAP credentials just fine when I access NiFi directly. Whenever I go through Knox, however, I am first presented to NiFi as anonymous (which is rejected by Ranger), and then, once I log in, it shows some Kerberos output that seems to indicate I was successful, but then it shows the following screen in my browser. The same thing happens whether I use the topology that authenticates against LDAP or the topology that uses anonymous authentication for services such as Ambari or Atlas that do their own authentication. When I first access the NiFi page through Knox (at which point it takes me to the login page), I see this, even if I'm already logged in to Knox.
2019-06-11 01:35:32,400 DEBUG [NiFi Web Server-223] o.a.n.w.s.NiFiAuthenticationFilter Checking secure context token: null
2019-06-11 01:35:32,401 DEBUG [NiFi Web Server-223] o.a.n.w.s.x509.X509CertificateExtractor No client certificate found in request.
2019-06-11 01:35:32,401 DEBUG [NiFi Web Server-223] o.a.n.w.s.NiFiAuthenticationFilter Checking secure context token: null
2019-06-11 01:35:32,401 DEBUG [NiFi Web Server-223] o.a.n.w.s.NiFiAuthenticationFilter Checking secure context token: null
2019-06-11 01:35:32,401 DEBUG [NiFi Web Server-223] o.a.n.w.s.NiFiAuthenticationFilter Checking secure context token: null
2019-06-11 01:35:32,401 DEBUG [NiFi Web Server-223] o.a.n.w.s.a.NiFiAnonymousUserFilter Populated SecurityContextHolder with anonymous token: 'anonymous'
2019-06-11 01:35:32,402 INFO [NiFi Web Server-223] o.a.n.w.a.c.AccessDeniedExceptionMapper identity[anonymous], groups[none] does not have permission to access the requested resource. Unable to view the user interface. Returning Unauthorized response.
2019-06-11 01:35:32,403 DEBUG [NiFi Web Server-223] o.a.n.w.a.c.AccessDeniedExceptionMapper org.apache.nifi.authorization.AccessDeniedException: Unable to view the user interface.
at org.apache.nifi.authorization.resource.Authorizable.authorize(Authorizable.java:285)
at org.apache.nifi.authorization.resource.Authorizable.authorize(Authorizable.java:298)
at org.apache.nifi.web.api.FlowResource.lambda$authorizeFlow$0(FlowResource.java:226)
at org.apache.nifi.web.StandardNiFiServiceFacade.authorizeAccess(StandardNiFiServiceFacade.java:374)
...
This is the only thing of interest that comes out in the NiFi logs when I try to log in to NiFi directly through Knox:
2019-06-11 01:34:07,093 DEBUG [NiFi Web Server-21] o.a.n.w.s.x509.X509CertificateExtractor No client certificate found in request.
Any ideas what the issue is, or where I need to look to solve this? Neither the Knox logs nor the NiFi logs seem to indicate why the login portion doesn't work properly.
05-31-2019
02:39 PM
I have an HDP 3.1 cluster on EC2 instances in AWS in which I am trying to use proxying to access various UIs. I've verified that WebHDFS works properly. Oozie, however, just shows the top of the page and then quits; see below for an example. Accessing the web UI directly works correctly. In the Oozie server logs, I found the following output.
java.lang.IllegalArgumentException: proxyUser cannot be null, If you're attempting to use user-impersonation via a proxy user, please make sure that oozie.service.ProxyUserService.proxyuser.#USER#.hosts and oozie.service.ProxyUserService.proxyuser.#USER#.groups are configured correctly
at org.apache.oozie.util.ParamChecker.notEmpty(ParamChecker.java:87)
at org.apache.oozie.service.ProxyUserService.validate(ProxyUserService.java:132)
at org.apache.oozie.servlet.JsonRestServlet.getUser(JsonRestServlet.java:567)
at org.apache.oozie.servlet.JsonRestServlet.service(JsonRestServlet.java:296)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.oozie.servlet.AuthFilter$2.doFilter(AuthFilter.java:171)
at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644)
at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
at org.apache.oozie.servlet.AuthFilter.doFilter(AuthFilter.java:176)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.oozie.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.oozie.servlet.OozieXFrameOptionsFilter.doFilter(OozieXFrameOptionsFilter.java:48)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.oozie.servlet.OozieCSRFFilter.doFilter(OozieCSRFFilter.java:62)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:234)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:610)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:503)
at java.lang.Thread.run(Thread.java:748)
The thing is, the following is set in my oozie-site.xml file, and I've set similar things in core-site.xml:
oozie.service.ProxyUserService.proxyuser.knox.groups=*
oozie.service.ProxyUserService.proxyuser.knox.hosts=*
All of the other UIs other than WebHDFS and Oozie just show a blank screen with no source at all. I haven't found any relevant logs for those either. Ranger is not showing any denials. Any ideas on what would be causing this, and how I can get the Oozie UI to show up properly?
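For reference, the "similar things" in core-site.xml are the standard Hadoop proxyuser properties, which in my case look roughly like this:
hadoop.proxyuser.knox.hosts=*
hadoop.proxyuser.knox.groups=*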
11-28-2017
02:51 PM
I am trying to automate the configuration of client Windows 7-10 PCs that need to connect to Impala through ODBC. I am trying to use odbcconf, but I'm running into an issue that appears to be specific to the Impala driver. When I run the following command, an ODBC configuration window pops up with no configuration in it at all. When I run it for a SQL Server DSN, however, it returns immediately, and I can see the DSN in ODBC Administrator. The only difference is the driver. See below for the command and a screenshot of what I see.
odbcconf CONFIGSYSDSN "Cloudera ODBC Driver for Impala" "DSN=Testing|Server=server"
Any ideas on what I need to do in order to allow for the creation of an Impala DSN using odbcconf?
07-27-2017
08:39 AM
OK, finally got everything working. As for the last error I had been seeing: I had thought for sure my Kerberos credentials were still showing up in klist, but this morning when I kinited again everything worked fine, so that must have been the issue. I then got an error on the consumer side, which I soon realized was because, with the new bootstrap-servers parameter, you need to use the same port as the producer (9093 in my case), not the ZooKeeper port. Once I updated this, everything worked properly.
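For anyone else who hits this, the working consumer invocation ends up looking roughly like the following (the broker host, topic, and config file names are placeholders):
# point the consumer at the broker's SASL_SSL port, not at ZooKeeper
kafka-console-consumer --bootstrap-server broker1.example.com:9093 --topic test --consumer.config client.properties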
07-27-2017
08:29 AM
After researching this a bit, I tried a few more things, none of which changed the error:
- I moved to using the keytab for the kafka user (https://community.cloudera.com/t5/Data-Ingestion-Integration/Unable-to-connect-to-kerberized-Kafka-2-1-0-10-from-Spark-2-1/m-p/56026)
- I placed the jaas.conf file in /tmp on every node in the cluster and pointed KAFKA_OPTS at that location, as sketched below (https://stackoverflow.com/questions/43190784/spark-streaming-kafka-kerberos)
- I exported KAFKA_CLIENT_KERBEROS_PARAMS to be the same as KAFKA_OPTS (https://community.hortonworks.com/content/supportkb/49422/running-kafka-client-bin-scripts-in-secure-envrion.html)
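For the second item, the environment setup was along these lines (the path is a placeholder for wherever the file actually lives):
# point the Kafka client tools at the JAAS config before running them
export KAFKA_OPTS="-Djava.security.auth.login.config=/tmp/jaas.conf"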
07-26-2017
08:06 PM
I realized that the client.properties file was using SASL_PLAINTEXT, not SASL_SSL, so I updated it appropriately. I'm hitting a new error now, on both the producer and the consumer. The following error comes up, and then the program quits. I've verified that jaas.conf is in KAFKA_OPTS properly.
org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:718)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:597)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:579)
at kafka.consumer.NewShinyConsumer.<init>(BaseConsumer.scala:53)
at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:69)
at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
Caused by: org.apache.kafka.common.KafkaException: javax.security.auth.login.LoginException: Could not login: the client is being asked for a password, but the Kafka client code does not currently support obtaining a password from the user. not available to garner authentication information from the user
at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:93)
at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:109)
at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:55)
at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:84)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:657)
... 6 more
Caused by: javax.security.auth.login.LoginException: Could not login: the client is being asked for a password, but the Kafka client code does not currently support obtaining a password from the user. not available to garner authentication information from the user
at com.sun.security.auth.module.Krb5LoginModule.promptForPass(Krb5LoginModule.java:899)
at com.sun.security.auth.module.Krb5LoginModule.attemptAuthentication(Krb5LoginModule.java:719)
at com.sun.security.auth.module.Krb5LoginModule.login(Krb5LoginModule.java:584)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at javax.security.auth.login.LoginContext.invoke(LoginContext.java:762)
at javax.security.auth.login.LoginContext.access$000(LoginContext.java:203)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:690)
at javax.security.auth.login.LoginContext$4.run(LoginContext.java:688)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:687)
at javax.security.auth.login.LoginContext.login(LoginContext.java:595)
at org.apache.kafka.common.security.authenticator.AbstractLogin.login(AbstractLogin.java:55)
at org.apache.kafka.common.security.kerberos.KerberosLogin.login(KerberosLogin.java:100)
at org.apache.kafka.common.security.authenticator.LoginManager.<init>(LoginManager.java:52)
at org.apache.kafka.common.security.authenticator.LoginManager.acquireLoginManager(LoginManager.java:81)
at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:85)
Also, I ran the command you mentioned above, and everything looks right: the SSL handshake read 3151 bytes and wrote 499 bytes using TLS v1.2. If you need more information from it, let me know.
EDIT: I realized that the properties file actually was wrong. I'm updating the post with the relevant information because of this.
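For reference, the security-related lines in client.properties after correcting it are roughly the following (the truststore path and password are placeholders, and this is just the standard set of Kafka client security properties rather than my exact file):
security.protocol=SASL_SSL
sasl.kerberos.service.name=kafka
ssl.truststore.location=/path/to/truststore.jks
ssl.truststore.password=changeit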
07-26-2017
02:06 PM
OK, so it looks like that took care of one problem, but there's still a problem with the consumer. Following your instructions, I found that the Kafka broker was operating on port 9093, not 9092. Fixing that on the producer then caused the same EOF error to come up as I am seeing on the consumer.
17/07/26 16:04:25 DEBUG authenticator.SaslClientAuthenticator: Set SASL client state to SEND_HANDSHAKE_REQUEST
17/07/26 16:04:25 DEBUG authenticator.SaslClientAuthenticator: Creating SaslClient: client=svcnonprodhadoop@<DOMAIN>;service=kafka;serviceHostname=svd0hdatn01;mechs=[GSSAPI]
17/07/26 16:04:25 DEBUG network.Selector: Created socket with SO_RCVBUF = 32768, SO_SNDBUF = 102400, SO_TIMEOUT = 0 to node -1
17/07/26 16:04:25 DEBUG authenticator.SaslClientAuthenticator: Set SASL client state to RECEIVE_HANDSHAKE_RESPONSE
17/07/26 16:04:25 DEBUG clients.NetworkClient: Completed connection to node -1. Fetching API versions.
17/07/26 16:04:25 DEBUG network.Selector: Connection with svd0hdatn01/10.96.88.42 disconnected
java.io.EOFException
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:99)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.receiveResponseOrToken(SaslClientAuthenticator.java:242)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.authenticate(SaslClientAuthenticator.java:166)
at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:71)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:350)
at org.apache.kafka.common.network.Selector.poll(Selector.java:303)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:370)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:225)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:126)
at java.lang.Thread.run(Thread.java:745)
17/07/26 16:04:25 DEBUG clients.NetworkClient: Node -1 disconnected.
17/07/26 16:04:25 DEBUG clients.NetworkClient: Give up sending metadata request since no node is available
17/07/26 16:04:25 DEBUG clients.NetworkClient: Initialize connection to node -1 for sending metadata request
17/07/26 16:04:25 DEBUG clients.NetworkClient: Initiating connection to node -1 at svd0hdatn01:9093.
07-26-2017
12:50 PM
Ah, OK, I apologize, I didn't realize the logs were controlled separately. When I enabled that, both the consumer and the producer come back with errors constantly. The consumer shows the following stack trace repeatedly from the moment it is started until I close it:
17/07/26 14:44:40 DEBUG authenticator.SaslClientAuthenticator: Set SASL client state to SEND_HANDSHAKE_REQUEST
17/07/26 14:44:40 DEBUG authenticator.SaslClientAuthenticator: Creating SaslClient: client=svcnonprodhadoop@<DOMAIN>;service=kafka;serviceHostname=svd0hdatn01.<DOMAIN>;mechs=[GSSAPI]
17/07/26 14:44:40 DEBUG network.Selector: Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 124928, SO_TIMEOUT = 0 to node -1
17/07/26 14:44:40 DEBUG authenticator.SaslClientAuthenticator: Set SASL client state to RECEIVE_HANDSHAKE_RESPONSE
17/07/26 14:44:40 DEBUG clients.NetworkClient: Completed connection to node -1. Fetching API versions.
17/07/26 14:44:40 DEBUG network.Selector: Connection with svd0hdatn01.<DOMAIN>/10.96.88.42 disconnected
java.io.EOFException
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.receiveResponseOrToken(SaslClientAuthenticator.java:242)
at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.authenticate(SaslClientAuthenticator.java:166)
at org.apache.kafka.common.network.KafkaChannel.prepare(KafkaChannel.java:71)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:350)
at org.apache.kafka.common.network.Selector.poll(Selector.java:303)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:370)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:226)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:203)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:138)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:219)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:196)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:281)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:996)
at kafka.consumer.NewShinyConsumer.<init>(BaseConsumer.scala:55)
at kafka.tools.ConsoleConsumer$.run(ConsoleConsumer.scala:69)
at kafka.tools.ConsoleConsumer$.main(ConsoleConsumer.scala:50)
at kafka.tools.ConsoleConsumer.main(ConsoleConsumer.scala)
17/07/26 14:44:40 DEBUG clients.NetworkClient: Node -1 disconnected.
17/07/26 14:44:40 DEBUG clients.NetworkClient: Give up sending metadata request since no node is available
17/07/26 14:44:40 DEBUG clients.NetworkClient: Initialize connection to node -1 for sending metadata request
17/07/26 14:44:40 DEBUG clients.NetworkClient: Initiating connection to node -1 at svd0hdatn01.<DOMAIN>:2181.
The producer shows the following log output as soon as any input is given to put into the topic:
17/07/26 14:45:43 DEBUG authenticator.SaslClientAuthenticator: Set SASL client state to SEND_HANDSHAKE_REQUEST
17/07/26 14:45:43 DEBUG authenticator.SaslClientAuthenticator: Creating SaslClient: client=svcnonprodhadoop@<DOMAIN>;service=kafka;serviceHostname=svd0hdatn01.<DOMAIN>;mechs=[GSSAPI]
17/07/26 14:45:43 DEBUG network.Selector: Connection with svd0hdatn01.<DOMAIN>/<IP_ADDRESS> disconnected
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:81)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:335)
at org.apache.kafka.common.network.Selector.poll(Selector.java:303)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:370)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:225)
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:126)
at java.lang.Thread.run(Thread.java:745)
17/07/26 14:45:43 DEBUG clients.NetworkClient: Node -1 disconnected.
17/07/26 14:45:43 DEBUG clients.NetworkClient: Give up sending metadata request since no node is available
17/07/26 14:45:44 DEBUG clients.NetworkClient: Initialize connection to node -1 for sending metadata request
17/07/26 14:45:44 DEBUG clients.NetworkClient: Initiating connection to node -1 at svd0hdatn01.<DOMAIN>:9092.
07-26-2017
12:26 PM
I've set the Kafka Broker Logging Threshold to DEBUG, and am seeing DEBUG statements in the Kafka Broker logs. It obviously puts out a lot of information, but I haven't come across anything that looked to be interesting or useful. This cluster does not have a gateway instance at all.
07-26-2017
11:44 AM
I recently installed Kafka onto an already secured cluster. I've configured Kafka to use Kerberos and SSL and set the protocol to SASL_SSL, roughly following the documentation here (I used certificates that were already created): https://www.cloudera.com/documentation/kafka/latest/topics/kafka_security.html
When I bring up kafka-console-consumer, a few minor log messages come up, and then it sits waiting for messages correctly. When I bring up kafka-console-producer, the same happens. I am pointing both to the same node, which is both a Kafka broker and a ZooKeeper node, with port 9092 for the producer and port 2181 for the consumer. If I type something into the console for the producer, however, nothing happens for a while, and then I get the following error:
17/07/26 13:11:20 ERROR internals.ErrorLoggingCallback: Error when sending message to topic test with key: null, value: 5 bytes with error:
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
The Kafka logs in that timeframe don't seem to have any errors or warnings. The ZooKeeper logs are also clean except for one warning that shows up only in the log of the ZooKeeper node I am pointing the consumer to:
2017-07-26 13:10:17,379 WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of session 0x0 due to java.io.EOFException
Any ideas on what would cause this behavior, or how to further debug what the issue is?
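For context, the invocations I'm using are roughly the following (host, topic, and config file names are placeholders; the consumer is the older ZooKeeper-based form):
kafka-console-producer --broker-list broker1.example.com:9092 --topic test --producer.config client.properties
kafka-console-consumer --zookeeper broker1.example.com:2181 --topic test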
07-17-2017
02:11 PM
1 Kudo
Found the issue. In /var/log/*kafka*, there was a log file that had an exception logged in it that I've included below. After I went back and enabled Kerberos authorization for Kafka (I had not enabled it, thinking I could do that later), everything started successfully. Thanks for pointing me to that log file!
java.lang.RuntimeException: Unable to create KafkaAuthBinding: keytabFile required because kerberos is enabled
at org.apache.sentry.kafka.binding.KafkaAuthBindingSingleton.configure(KafkaAuthBindingSingleton.java:67)
at org.apache.sentry.kafka.authorizer.SentryKafkaAuthorizer.configure(SentryKafkaAuthorizer.java:120)
at kafka.server.KafkaServer$$anonfun$startup$4.apply(KafkaServer.scala:236)
at kafka.server.KafkaServer$$anonfun$startup$4.apply(KafkaServer.scala:234)
at scala.Option.map(Option.scala:146)
at kafka.server.KafkaServer.startup(KafkaServer.scala:234)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:39)
at kafka.Kafka$.main(Kafka.scala:67)
at com.cloudera.kafka.wrap.Kafka$.main(Kafka.scala:76)
at com.cloudera.kafka.wrap.Kafka.main(Kafka.scala)
Caused by: java.lang.IllegalArgumentException: keytabFile required because kerberos is enabled
at org.apache.sentry.kafka.binding.KafkaAuthBinding.initKerberos(KafkaAuthBinding.java:561)
at org.apache.sentry.kafka.binding.KafkaAuthBinding.createAuthProvider(KafkaAuthBinding.java:142)
at org.apache.sentry.kafka.binding.KafkaAuthBinding.<init>(KafkaAuthBinding.java:97)
at org.apache.sentry.kafka.binding.KafkaAuthBindingSingleton.configure(KafkaAuthBindingSingleton.java:63)
... 9 more
07-17-2017
02:07 PM
I've not seen anything else in either stdout or stderr. I'm attaching the entirety of stdout here for posterity. The stderr can be found at the link below:
https://pastebin.com/CguhWiqh
Mon Jul 17 14:32:38 CDT 2017
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Using -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/kafka_kafka-KAFKA_BROKER-c1b2741bf1f3e232c6c549ec045bcb79_pid41162.hprof -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh as CSD_JAVA_OPTS
Using /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER as conf dir
Using scripts/control.sh as process script
Date: Mon Jul 17 14:32:38 CDT 2017
Host: svd0hdatn03.XXXX.corp
Pwd: /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER
CONF_DIR: /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER
KAFKA_HOME: /opt/cloudera/parcels/KAFKA-2.2.0-1.2.2.0.p0.68/lib/kafka
Zookeeper Quorum: svd0hdatn01.XXXX.corp:2181,svd0hmstn01.XXXX.corp:2181,svd0hmstn02.XXXX.corp:2181
Zookeeper Chroot:
PORT: 9092
JMX_PORT: 9393
SSL_PORT: 9093
ENABLE_MONITORING: true
METRIC_REPORTERS: nl.techop.kafka.KafkaHttpMetricsReporter
BROKER_HEAP_SIZE: 1024
BROKER_JAVA_OPTS: -server -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark -XX:+DisableExplicitGC -Djava.awt.headless=true
BROKER_SSL_ENABLED: false
KERBEROS_AUTH_ENABLED: false
KAFKA_PRINCIPAL: kafka/svd0hdatn03.XXXX.corp@XXXX.CORP
SECURITY_INTER_BROKER_PROTOCOL: INFERRED
AUTHENTICATE_ZOOKEEPER_CONNECTION: true
SUPER_USERS: kafka
Kafka version found: 0.10.2-kafka2.2.0
ZK_PRINCIPAL_NAME: zookeeper
Final Zookeeper Quorum is svd0hdatn01.XXXX.corp:2181,svd0hmstn01.XXXX.corp:2181,svd0hmstn02.XXXX.corp:2181
security.inter.broker.protocol inferred as PLAINTEXT
LISTENERS=listeners=PLAINTEXT://svd0hdatn03.XXXX.corp:9092,
Mon Jul 17 14:32:43 CDT 2017
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Using -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/kafka_kafka-KAFKA_BROKER-c1b2741bf1f3e232c6c549ec045bcb79_pid41853.hprof -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh as CSD_JAVA_OPTS
Using /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER as conf dir
Using scripts/control.sh as process script
Date: Mon Jul 17 14:32:43 CDT 2017
Host: svd0hdatn03.XXXX.corp
Pwd: /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER
CONF_DIR: /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER
KAFKA_HOME: /opt/cloudera/parcels/KAFKA-2.2.0-1.2.2.0.p0.68/lib/kafka
Zookeeper Quorum: svd0hdatn01.XXXX.corp:2181,svd0hmstn01.XXXX.corp:2181,svd0hmstn02.XXXX.corp:2181
Zookeeper Chroot:
PORT: 9092
JMX_PORT: 9393
SSL_PORT: 9093
ENABLE_MONITORING: true
METRIC_REPORTERS: nl.techop.kafka.KafkaHttpMetricsReporter
BROKER_HEAP_SIZE: 1024
BROKER_JAVA_OPTS: -server -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark -XX:+DisableExplicitGC -Djava.awt.headless=true
BROKER_SSL_ENABLED: false
KERBEROS_AUTH_ENABLED: false
KAFKA_PRINCIPAL: kafka/svd0hdatn03.XXXX.corp@XXXX.CORP
SECURITY_INTER_BROKER_PROTOCOL: INFERRED
AUTHENTICATE_ZOOKEEPER_CONNECTION: true
SUPER_USERS: kafka
Kafka version found: 0.10.2-kafka2.2.0
ZK_PRINCIPAL_NAME: zookeeper
Final Zookeeper Quorum is svd0hdatn01.XXXX.corp:2181,svd0hmstn01.XXXX.corp:2181,svd0hmstn02.XXXX.corp:2181
security.inter.broker.protocol inferred as PLAINTEXT
LISTENERS=listeners=PLAINTEXT://svd0hdatn03.XXXX.corp:9092,
Mon Jul 17 14:32:49 CDT 2017
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Using -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/kafka_kafka-KAFKA_BROKER-c1b2741bf1f3e232c6c549ec045bcb79_pid42550.hprof -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh as CSD_JAVA_OPTS
Using /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER as conf dir
Using scripts/control.sh as process script
Date: Mon Jul 17 14:32:49 CDT 2017
Host: svd0hdatn03.XXXX.corp
Pwd: /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER
CONF_DIR: /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER
KAFKA_HOME: /opt/cloudera/parcels/KAFKA-2.2.0-1.2.2.0.p0.68/lib/kafka
Zookeeper Quorum: svd0hdatn01.XXXX.corp:2181,svd0hmstn01.XXXX.corp:2181,svd0hmstn02.XXXX.corp:2181
Zookeeper Chroot:
PORT: 9092
JMX_PORT: 9393
SSL_PORT: 9093
ENABLE_MONITORING: true
METRIC_REPORTERS: nl.techop.kafka.KafkaHttpMetricsReporter
BROKER_HEAP_SIZE: 1024
BROKER_JAVA_OPTS: -server -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark -XX:+DisableExplicitGC -Djava.awt.headless=true
BROKER_SSL_ENABLED: false
KERBEROS_AUTH_ENABLED: false
KAFKA_PRINCIPAL: kafka/svd0hdatn03.XXXX.corp@XXXX.CORP
SECURITY_INTER_BROKER_PROTOCOL: INFERRED
AUTHENTICATE_ZOOKEEPER_CONNECTION: true
SUPER_USERS: kafka
Kafka version found: 0.10.2-kafka2.2.0
ZK_PRINCIPAL_NAME: zookeeper
Final Zookeeper Quorum is svd0hdatn01.XXXX.corp:2181,svd0hmstn01.XXXX.corp:2181,svd0hmstn02.XXXX.corp:2181
security.inter.broker.protocol inferred as PLAINTEXT
LISTENERS=listeners=PLAINTEXT://svd0hdatn03.XXXX.corp:9092,
Mon Jul 17 14:32:57 CDT 2017
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Using -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/kafka_kafka-KAFKA_BROKER-c1b2741bf1f3e232c6c549ec045bcb79_pid43296.hprof -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh as CSD_JAVA_OPTS
Using /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER as conf dir
Using scripts/control.sh as process script
Date: Mon Jul 17 14:32:57 CDT 2017
Host: svd0hdatn03.XXXX.corp
Pwd: /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER
CONF_DIR: /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER
KAFKA_HOME: /opt/cloudera/parcels/KAFKA-2.2.0-1.2.2.0.p0.68/lib/kafka
Zookeeper Quorum: svd0hdatn01.XXXX.corp:2181,svd0hmstn01.XXXX.corp:2181,svd0hmstn02.XXXX.corp:2181
Zookeeper Chroot:
PORT: 9092
JMX_PORT: 9393
SSL_PORT: 9093
ENABLE_MONITORING: true
METRIC_REPORTERS: nl.techop.kafka.KafkaHttpMetricsReporter
BROKER_HEAP_SIZE: 1024
BROKER_JAVA_OPTS: -server -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSScavengeBeforeRemark -XX:+DisableExplicitGC -Djava.awt.headless=true
BROKER_SSL_ENABLED: false
KERBEROS_AUTH_ENABLED: false
KAFKA_PRINCIPAL: kafka/svd0hdatn03.XXXX.corp@XXXX.CORP
SECURITY_INTER_BROKER_PROTOCOL: INFERRED
AUTHENTICATE_ZOOKEEPER_CONNECTION: true
SUPER_USERS: kafka
Kafka version found: 0.10.2-kafka2.2.0
ZK_PRINCIPAL_NAME: zookeeper
Final Zookeeper Quorum is svd0hdatn01.XXXX.corp:2181,svd0hmstn01.XXXX.corp:2181,svd0hmstn02.XXXX.corp:2181
security.inter.broker.protocol inferred as PLAINTEXT
LISTENERS=listeners=PLAINTEXT://svd0hdatn03.XXXX.corp:9092,
07-17-2017
12:55 PM
I have a kerberized CDH 5.9 cluster which has been running fine for a month now since securing it. I installed Kafka through parcels and then went to add the service in CM, and I've been running into an error while starting the brokers. All three broker nodes are complaining because they cannot access /var/run/cloudera-scm-agent/process/*-kafka-KAFKA_BROKER/supervisor.conf, but the start script seems to create a new folder with supervisor.conf in it every time. The file is set to 600 permissions with the owner as root, so it's no wonder permission is denied, but I don't understand how this is supposed to work, or what I'm missing that is causing it not to work in this case. I'm including a snippet of the stderr log from one of the brokers below (the whole log was too long to fit in this post). Note that the permissions error comes up four times in each broker's log.
+ for i in '`seq 1 $COUNT`'
+ SCRIPT=/opt/cloudera/parcels/KAFKA-2.2.0-1.2.2.0.p0.68/meta/kafka_env.sh
+ PARCEL_DIRNAME=KAFKA-2.2.0-1.2.2.0.p0.68
+ . /opt/cloudera/parcels/KAFKA-2.2.0-1.2.2.0.p0.68/meta/kafka_env.sh
++ KAFKA_DIRNAME=KAFKA-2.2.0-1.2.2.0.p0.68
++ export KAFKA_HOME=/opt/cloudera/parcels/KAFKA-2.2.0-1.2.2.0.p0.68/lib/kafka
++ KAFKA_HOME=/opt/cloudera/parcels/KAFKA-2.2.0-1.2.2.0.p0.68/lib/kafka
+ echo 'Using /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER as conf dir'
+ echo 'Using scripts/control.sh as process script'
+ replace_conf_dir
+ find /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER -type f '!' -path '/var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER/logs/*' '!' -name '*.log' '!' -name '*.keytab' '!' -name '*jceks' -exec perl -pi -e 's#{{CMF_CONF_DIR}}#/var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER#g' '{}' ';'
Can't open /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER/supervisor.conf: Permission denied.
+ make_scripts_executable
+ find /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER -regex '.*\.\(py\|sh\)$' -exec chmod u+x '{}' ';'
+ export COMMON_SCRIPT=/usr/lib64/cmf/service/common/cloudera-config.sh
+ COMMON_SCRIPT=/usr/lib64/cmf/service/common/cloudera-config.sh
+ chmod u+x /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER/scripts/control.sh
+ exec /var/run/cloudera-scm-agent/process/1464-kafka-KAFKA_BROKER/scripts/control.sh start
+ DEFAULT_KAFKA_HOME=/usr/lib/kafka
+ KAFKA_HOME=/opt/cloudera/parcels/KAFKA-2.2.0-1.2.2.0.p0.68/lib/kafka
+ MIN_KAFKA_MAJOR_VERSION_WITH_SSL=2