Impala Catalog Server down after upgrading from 5.11.0 to 5.11.1

Expert Contributor

We upgraded to 5.11.1 and can no longer run any Impala queries.

 

Error: 

Query: show databases

ERROR: AnalysisException: This Impala daemon is not ready to accept user requests. Status: Waiting for catalog update from the StateStore.

 

Statestore logs: 

 

I0706 12:54:32.296458 28189 authentication.cc:427] Successfully authenticated principal impala/cba24uu.abc.cdb.com@ABC.CDB.COM on an internal connection

I0706 12:54:32.296932 28401 statestore.cc:381] Registering: catalog-server@cba24uu:26000

I0706 12:54:32.297024 28401 statestore.cc:404] Subscriber 'catalog-server@cba24uu:26000' registered (registration id: 16404957b6105e9d:7340f75c059dbe95)

I0706 12:54:32.310817 28156 status.cc:114] Couldn't open transport for cba24uu:23020 (authorize: cannot authorize peer)

    @           0x8394e9  (unknown)

    @           0xdac876  (unknown)

    @           0xdacb92  (unknown)

    @           0xa505ab  (unknown)

    @           0xa50b83  (unknown)

    @           0xb36d62  (unknown)

    @           0xb39c4e  (unknown)

    @           0xb400b6  (unknown)

    @           0xbdcd09  (unknown)

    @           0xbdd6e4  (unknown)

    @           0xe2717a  (unknown)

    @     0x2b5ed7b36aa1  start_thread

    @     0x2b5ed7e3493d  clone

I0706 12:54:32.310847 28156 thrift-client.cc:67] Unable to connect to cba24uu:23020

I0706 12:54:32.310878 28156 statestore.cc:696] Unable to send heartbeat message to subscriber catalog-server@dig24au:26000, received error: Couldn't open transport for cba24uu:23020 (authorize: cannot authorize peer)

I0706 12:54:32.316840 28144 status.cc:114] Couldn't open transport for cba24uu:23020 (authorize: cannot authorize peer)

 

If I telnet to the host and port, the connection succeeds.

 

Catalog server logs:

 

I0707 09:37:11.706931 17577 thrift-server.cc:391] Command '/var/run/cloudera-scm-agent/process/2951-impala-CATALOGSERVER/altscript.sh sec-0-ssl_private_key_password_cmd' executed successfully, .PEM password retrieved
I0707 09:37:11.713904 17577 thrift-server.cc:449] ThriftServer 'StatestoreSubscriber' started on port: 23020s
I0707 09:37:11.714009 17577 statestore-subscriber.cc:203] Registering with statestore
I0707 09:37:11.801826 17577 statestore-subscriber.cc:169] Subscriber registration ID: 664bb584455ec4bf:a5fd7f54e1e7009f
I0707 09:37:11.801847 17577 statestore-subscriber.cc:207] statestore registration successful
I0707 09:37:11.803041 17577 catalogd-main.cc:91] Enabling SSL for CatalogService
I0707 09:37:11.830278 18039 thrift-util.cc:111] TAcceptQueueServer: Caught TException: No more data to read.
I0707 09:37:11.830605 17997 HdfsTable.java:1105] Fetched partition metadata from the Metastore: mssql_polybase.sample_data
I0707 09:37:11.833709 18039 thrift-util.cc:111] TAcceptQueueServer: Caught TException: No more data to read.
I0707 09:37:12.144228 17577 thrift-server.cc:391] Command '/var/run/cloudera-scm-agent/process/2951-impala-CATALOGSERVER/altscript.sh sec-0-ssl_private_key_password_cmd' executed successfully, .PEM password retrieved

I0707 09:37:12.151124 17577 thrift-server.cc:449] ThriftServer 'CatalogService' started on port: 26000s
I0707 09:37:12.151144 17577 catalogd-main.cc:96] CatalogService started on port: 26000
I0707 09:37:12.232126 17997 TableLoader.java:97] Loaded metadata for: mssql_polybase.sample_data
I0707 09:37:12.846177 18039 thrift-util.cc:111] TAcceptQueueServer: Caught TException: No more data to read.
I0707 09:37:13.858829 18039 thrift-util.cc:111] TAcceptQueueServer: Caught TException: No more data to read.
I0707 09:37:14.869678 18039 thrift-util.cc:111] TAcceptQueueServer: Caught TException: No more data to read.

1 ACCEPTED SOLUTION

Expert Contributor

This issue was resolved after adding the hostname flag and restarting the cluster.

Thank you, guys.


16 REPLIES

New Contributor

We are facing the same issue. Any help?

 

Expert Contributor

We have traced this issue to the Impala certificate and are now looking into it.

 

1. Check the certificate:

openssl s_client -connect $hostname:$port -CAfile /abc/hadoop/cloudera-certs/impala-SAN.pem

 

2. Run hostname -f (this must return the FQDN).
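
The two checks above can be combined into a rough sketch that tests whether the FQDN is listed in the certificate's Subject Alternative Names. The helper name san_contains is made up for illustration, it only does exact matching (no wildcard handling), and the commented usage assumes the server presents its certificate on $hostname:$port as in step 1.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if host name $2 appears as an exact DNS entry
# in the SAN text $1 (as printed by "openssl x509 -noout -text").
san_contains() {
    printf '%s\n' "$1" \
        | tr ',' '\n' \
        | sed 's/^[[:space:]]*//' \
        | grep -Fqx "DNS:$2"
}

# Example usage (needs network access to the Impala role, so left commented):
# san_text="$(openssl s_client -connect "$hostname:$port" </dev/null 2>/dev/null \
#     | openssl x509 -noout -text | grep 'DNS:')"
# if san_contains "$san_text" "$(hostname -f)"; then
#     echo "certificate covers the FQDN"
# else
#     echo "certificate does NOT list the FQDN"
# fi
```

If the certificate lists only the FQDN, a daemon that identifies itself by its short name would not pass this check, which is consistent with the "cannot authorize peer" error in the statestore log.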

Champion
I think a new "feature" was added in this latest release. We hit the same issue. We have SSL enabled for Impala with Kerberos. SSL worked for other services like the UIs and Impalad subscribing to the statestore, but catalogd continued to fail to subscribe to the statestore with the same error.

Cloudera Support kindly pointed out that it was communicating using only the short hostname rather than the FQDN. They provided the following information. We applied the change and Impala is operational again.

Cloudera Manager > Impala > Configuration

For Catalog > Catalog Server Command Line Argument Advanced Configuration Snippet (Safety Valve)

For StateStore > Statestore Command Line Argument Advanced Configuration Snippet (Safety Valve)

[configure]
--hostname=hostname.example.com
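
For anyone applying this on several hosts, here is a minimal sketch that builds the safety-valve line from a host name and refuses anything that does not look fully qualified. The function name hostname_flag is hypothetical, and the "contains a dot" test is only a crude assumption about what counts as an FQDN.

```shell
#!/bin/sh
# Hypothetical helper: print the --hostname line for the given name,
# or fail if the name contains no dot (i.e. looks like a short name).
hostname_flag() {
    case "$1" in
        *.*) printf '%s\n' "--hostname=$1" ;;
        *)   echo "not a fully qualified name: $1" >&2
             return 1 ;;
    esac
}

# Example usage on the Catalog Server or StateStore host (left commented):
# hostname_flag "$(hostname -f)"
```

The printed line is what goes into the Catalog Server and StateStore safety valves; each role should get the FQDN of the host it runs on.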

Champion
I ran this test and was able to connect to both the statestore and catalogd over SSL, but only because I was using the FQDN (hostname -f). The issue is that catalogd and the statestore use the short name for the statestore subscription after the upgrade. Either a bug was introduced, or this was the intended behavior all along and it was "fixed"; in both cases you now need this configuration setting to get SSL for Impala to work.

Cloudera please fix the code or update the Impala SSL docs to reflect the need for this setting.
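
To reproduce the difference between the two names, a sketch along these lines could probe the same port once with the FQDN and once with the short name. short_name and probe_tls are hypothetical helpers; the port 23020 and the CA file path are taken from the logs and commands earlier in this thread and will differ on other clusters.

```shell
#!/bin/sh
# Derive the short host name from an FQDN (everything before the first dot).
short_name() {
    printf '%s\n' "${1%%.*}"
}

# Hypothetical helper: attempt a TLS handshake with $1:$2 and verify that the
# presented certificate is valid for name $1, using CA bundle $3.
probe_tls() {
    openssl s_client -connect "$1:$2" -CAfile "$3" \
        -verify_hostname "$1" -verify_return_error \
        </dev/null >/dev/null 2>&1
}

# Example usage (needs network access, so left commented):
# fqdn="$(hostname -f)"
# for name in "$fqdn" "$(short_name "$fqdn")"; do
#     if probe_tls "$name" 23020 /abc/hadoop/cloudera-certs/impala-SAN.pem; then
#         echo "$name: certificate accepted"
#     else
#         echo "$name: certificate rejected or connection failed"
#     fi
# done
```

If the FQDN is accepted and the short name is rejected, that matches the behavior described above and suggests the --hostname setting is needed.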

Super Collaborator

@mbigelow - Thank you for keeping the JIRA updated - I'm glad you found the solution through support. It looks like you are hitting a bug in CM and we are working on fixing it. I will reach out to our documentation team to point out this issue in the docs and the release notes of 5.11.1. I'm sorry for the troubles this has caused you.

Super Collaborator

After more investigation I found that this is already documented as a Known Issue in CM: Known Issues and Workarounds in Cloudera Manager 5

 

For Impala I opened IMPALA-5631 to explain the problem and possible solutions in the docs.

Champion
@Lars Volker Thanks for adding this bit of info. I was looking at IMPALA-5631 as a suspect but never thought to look at CM.

Lesson learned: pay as much attention to CM release notes as I do to CDH release notes.

Champion

@desind Could you share the details of your operating system, such as its name, version, and kernel version?

I'm curious to know.

Expert Contributor

I looked at https://www.cloudera.com/documentation/enterprise/release-notes/topics/cm_rn_known_issues.html#conce... and it says:

 

To workaround this issue, upgrade to one of the following versions of Cloudera Manager before upgrading CDH:

  • 5.10.2
  • 5.8.6

It does not mention 5.11.1. So does this issue also surface when using CM 5.11.1?