Member since: 07-17-2019
Posts: 738
Kudos Received: 433
Solutions: 111
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1971 | 08-06-2019 07:09 PM |
| | 2268 | 07-19-2019 01:57 PM |
| | 2996 | 02-25-2019 04:47 PM |
| | 3505 | 10-11-2018 02:47 PM |
| | 1032 | 09-26-2018 02:49 PM |
01-10-2017
07:45 PM
1 Kudo
When executing Step 3 of the Ambari installation wizard, "Confirm Hosts", Ambari will (by default) SSH to each node and start an instance of the Ambari Agent process. In some cases, it is possible that the local RPM database is corrupted and this registration process will fail. The error message in Ambari would look something like:

```
INFO:root:Executing parallel bootstrap
ERROR:root:ERROR: Bootstrap of host myhost.mydomain fails because previous action finished with non-zero exit code (1)
ERROR MESSAGE: tcgetattr: Invalid argument
Connection to myhost.mydomain closed.
STDOUT: Error: database disk image is malformed
Error: database disk image is malformed
Desired version (2.5.0.0) of ambari-agent package is not available.
tcgetattr: Invalid argument
Connection to myhost.mydomain closed.
```

In this case, the local RPM database is malformed, and all attempts to alter the installed packages on the system will fail until the database is rebuilt. This can be done by running the following commands as root on the host reporting the error:

```
[root@myhost ~]# mv /var/lib/rpm/__db* /tmp
[root@myhost ~]# rpm --rebuilddb
```

Then, click the "Retry Failed Hosts" button in Ambari and the registration should succeed.
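The move-aside step can be sketched as a small shell function. This is only an illustration of the glob behavior of the `mv` command above; the `/tmp/fake-rpmdb` and `/tmp/fake-backup` paths are scratch stand-ins for `/var/lib/rpm` and `/tmp`, so it can be exercised without touching a real RPM database.

```shell
# Sketch: back up the Berkeley DB environment files (__db*) that rpm
# will recreate on the next rebuild.
# Usage: backup_db_files <rpm-db-dir> <backup-dir>
backup_db_files() {
    for f in "$1"/__db*; do
        [ -e "$f" ] || continue   # nothing to do if no __db files exist
        mv "$f" "$2"/
    done
}

# Demonstrate on a scratch directory standing in for /var/lib/rpm.
mkdir -p /tmp/fake-rpmdb /tmp/fake-backup
touch /tmp/fake-rpmdb/__db.001 /tmp/fake-rpmdb/__db.002 /tmp/fake-rpmdb/Packages
backup_db_files /tmp/fake-rpmdb /tmp/fake-backup
ls /tmp/fake-rpmdb    # prints: Packages
```

On a real host, the moved files come from `/var/lib/rpm`, and the step after the move is `rpm --rebuilddb` as shown above.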
01-05-2017
05:25 PM
It would appear that your DataNode is failing, which is the cause of the other services failing. It also appears that you have not changed the default hdfs-site.xml configuration that controls where DataNodes store their data on the local filesystem. It is not uncommon for operating systems to wipe the /tmp directory on boot. Perhaps you have experienced this and need to re-format your HDFS? Change dfs.datanode.data.dir, dfs.namenode.name.dir, and dfs.namenode.checkpoint.dir, then format HDFS:

```
$ hdfs namenode -format
```

Beware: formatting HDFS is a destructive operation. Do not perform it unless all of the data in HDFS is stored elsewhere or can be regenerated.
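As a sketch, the relevant hdfs-site.xml entries might look like the following. The `/data/hadoop/...` paths are hypothetical examples, not defaults; point them at durable storage that exists on your hosts.

```xml
<!-- hdfs-site.xml: move HDFS storage off /tmp (example paths; adjust to your hosts) -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/data/hadoop/hdfs/data</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/data/hadoop/hdfs/namenode</value>
</property>
<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>/data/hadoop/hdfs/namesecondary</value>
</property>
```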
01-03-2017
08:52 PM
So I guess you didn't have hostname resolution set up correctly as you said below? 🙂 But in general, yes, all host advertisements done by HBase are done using hostnames and not IP addresses. This is essentially a prerequisite to get Kerberos authentication working.
01-03-2017
04:12 PM
Did you inspect the extra logging at the client side? It looks like you have only copied the HBase master server logs. Also, what ports did you verify via telnet?
01-03-2017
02:20 AM
Please re-read the description on HBASE-14729. There were no code changes made by that JIRA issue -- it was closed as a duplicate of https://issues.apache.org/jira/browse/HBASE-14223, which is still outstanding.
12-24-2016
05:37 PM
1 Kudo
I would guess the problem does not lie between your client and ZooKeeper, but between your client and HBase. Remember that one use of ZooKeeper is for discovering HBase servers. I would verify that the service ports for HBase (e.g. 16000, 16020) are bound to an external network interface (*not* lo or 127.0.0.1) using netstat, and that you can connect to these ports remotely using telnet as you did. 16000 is the RPC port for the Master and 16020 is the RPC port for the RegionServer. Another option to get more debug information is to increase the log verbosity to DEBUG via log4j in your client for the org.apache.hadoop.hbase and org.apache.phoenix packages. This should give you more information about what actions the client is taking and why they are failing.
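To make the netstat check concrete, here is a sketch that flags ports bound only to the loopback interface. The sample listing is inlined so the logic is self-contained; on a real host you would feed it live `netstat -ltn` output instead.

```shell
# Sketch: flag HBase service ports that are bound to loopback only.
# Usage: check_bind <port> <netstat-ltn-output>
check_bind() {
    port="$1"; listing="$2"
    if printf '%s\n' "$listing" | grep -Eq "[^0-9]0\.0\.0\.0:$port |:::$port "; then
        echo "port $port: OK (bound to all interfaces)"
    elif printf '%s\n' "$listing" | grep -Eq "127\.0\.0\.1:$port "; then
        echo "port $port: LOOPBACK-ONLY (remote clients cannot connect)"
    else
        echo "port $port: not listening"
    fi
}

# Sample `netstat -ltn` lines; on a real host use: netstat -ltn
sample='tcp 0 0 127.0.0.1:16000 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:16020 0.0.0.0:* LISTEN'
check_bind 16000 "$sample"   # prints: port 16000: LOOPBACK-ONLY (remote clients cannot connect)
check_bind 16020 "$sample"   # prints: port 16020: OK (bound to all interfaces)
```

A port that shows as LOOPBACK-ONLY will pass a telnet test from the host itself but refuse connections from your remote client, which matches the symptoms described.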
12-20-2016
06:14 PM
Hah, yes, it seems like you have a port conflict problem.
You could use a tool like netstat to find what process has already bound the port 60020, e.g. `sudo netstat -nape | fgrep 60020`. You can find the pid of the process which has that port bound. Once you identify the other process, you can determine if there is a port conflict which needs to be changed via configuration.
One important note is that 60020 is in the ephemeral port range, which means there may be transient sockets binding that port. If you do not see any service bound on that port now, this is likely what happened, and you can try simply restarting the AMS. This is the reason the HBase default ports moved from 600xx to 160xx in recent versions.
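To check whether 60020 actually falls inside your host's ephemeral range, a sketch like this can help. The kernel path shown in the comment is Linux-specific, and the sample range used below is a common default, not necessarily your host's actual value.

```shell
# Sketch: report whether a port lies inside a Linux ephemeral port range.
# Usage: in_ephemeral_range <port> <range-string>, where the range string has
# the form of /proc/sys/net/ipv4/ip_local_port_range ("low high").
in_ephemeral_range() {
    port="$1"
    set -- $2          # split the range string into low and high
    low="$1"; high="$2"
    if [ "$port" -ge "$low" ] && [ "$port" -le "$high" ]; then
        echo "port $port is in the ephemeral range ($low-$high)"
    else
        echo "port $port is outside the ephemeral range ($low-$high)"
    fi
}

# On a real host: in_ephemeral_range 60020 "$(cat /proc/sys/net/ipv4/ip_local_port_range)"
in_ephemeral_range 60020 "32768 60999"   # prints: port 60020 is in the ephemeral range (32768-60999)
```

If the port is inside the range, any outgoing connection on the host can transiently grab it, which is exactly the collision scenario described above.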
12-20-2016
04:50 PM
@ARUN Can you please share the error that you see? I assume this is from the AMS log files.
12-19-2016
02:50 PM
Yes, the default is "true". That's why I stated my reply in the way I did. As long as you are not setting the property to false, the (default) value would be true, and thus the table would not be disabled in the process.
12-17-2016
08:05 PM
No, you are incorrect, Sami. HBase knows the set of column families. It does *not* track the set of qualifiers.