Explorer
Posts: 13
Registered: ‎07-08-2015

Cloudera Live: Exercise 2 Error

Hi, 

While trying to execute the Cloudera Live: Exercise 2 example SQL query in Impala using Hue, I am getting the error below.

 

Your query has the following error(s):

AnalysisException: Failed to load metadata for table: default.order_items CAUSED BY: TableLoadingException: Problem reading Avro schema at: hdfs://208.113.123.213/user/examples/sqoop_import_order_items.avsc CAUSED BY: ConnectException: Call From g5157-cldramaster-01/10.98.189.5 to 208.113.123.213:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused CAUSED BY: ConnectException: Connection refuse
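For reference, the query I was running is the exercise's "most popular product categories" example - roughly along these lines (paraphrased from the tutorial, so it may not match my session exactly):

-- Exercise 2: most popular product categories
select c.category_name, count(order_item_quantity) as count
from order_items oi
inner join products p on oi.order_item_product_id = p.product_id
inner join categories c on c.category_id = p.product_category_id
group by c.category_name
order by count desc
limit 10;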

 

Please help in resolving this issue. Thanks!

Cloudera Employee
Posts: 435
Registered: ‎07-12-2013

Re: Cloudera Live: Exercise 2 Error

Looks like a failure to connect to the HDFS NameNode. Can you log back into Cloudera Manager and see if the HDFS service is healthy? I suspect it is not, but we'll need to look in CM and see exactly what's wrong...

Explorer
Posts: 13
Registered: ‎07-08-2015

Re: Cloudera Live: Exercise 2 Error

The HDFS service is healthy. Not sure what went wrong.

Explorer
Posts: 13
Registered: ‎07-08-2015

Re: Cloudera Live: Exercise 2 Error

The only health issue I see is with HBase. Not sure if it is relevant in any way here.

 

[Attachment: Screen Shot 2015-09-16 at 10.02.50 PM.png]

Cloudera Employee
Posts: 435
Registered: ‎07-12-2013

Re: Cloudera Live: Exercise 2 Error

HBase is not required for this tutorial, so it's not started by default.
You can ignore it in this context.

This is going to be a bit tougher to debug if CM says all is well. The information shown, like the IP addresses and hostnames, all seems correct given what you posted for step 1. And you shouldn't be hitting any firewall-related issues (the network ACL on the VPC should be the only firewall in play, and the traffic between machines shouldn't be passing through it).

If you're still seeing this error when running that query (and do check that - maybe it was just a transient issue that CM already corrected), can you run `sudo lsof -i | grep hdfs | grep LISTEN`? (If lsof isn't installed, you can install it with `sudo yum install -y lsof`.) That should list any ports opened by processes running as the hdfs user. I'd like to confirm that port 8020 (which might be represented as "intu-ec-svcdisc" or something else, because of the mapping done via the /etc/services file) is indeed open on all interfaces.

I also notice that it's showing the private IP as the source of the connection, and the public IP as the target. I don't recall if that's normal - I'll need to dig into that. Regardless, it shouldn't be triggering firewall issues...

Explorer
Posts: 13
Registered: ‎07-08-2015

Re: Cloudera Live: Exercise 2 Error

[ Edited ]

This error doesn't seem to be transient. I've tried it many times since noon and I'm repeatedly getting the same error message.


Also, I tried the `sudo lsof -i | grep hdfs | grep LISTEN` command and port 8020 is listening. Below is my output.

[root@g5157-cldramaster-01 ~]# sudo lsof -i | grep hdfs | grep LISTEN
java 33597 hdfs 149u IPv4 182417 0t0 TCP g5157-cldramaster-01:50090 (LISTEN)
java 33653 hdfs 139u IPv4 182410 0t0 TCP g5157-cldramaster-01:50070 (LISTEN)
java 33653 hdfs 156u IPv4 182592 0t0 TCP g5157-cldramaster-01:oa-system (LISTEN)
java 33653 hdfs 166u IPv4 182596 0t0 TCP g5157-cldramaster-01:intu-ec-svcdisc (LISTEN)
[root@g5157-cldramaster-01 ~]#

I didn't get exactly what you meant by the private IP being the source and the public IP being the target. Can you please elaborate?

Cloudera Employee
Posts: 435
Registered: ‎07-12-2013

Re: Cloudera Live: Exercise 2 Error

[ Edited ]

I misspoke. I thought this was on AWS, but I see now it's on GoGrid. I think the issue is that HDFS is listening on "g5157-cldramaster-01". If you ping that hostname from your machines, does it resolve to a 208.* IP address or a 10.* IP address? The SQL tables are set up using the 208.* (public) IP address, but I think g5157-cldramaster-01 is bound to the 10.* (private) IP address.

I think the tutorial should have used the hostname in the CREATE TABLE statements. I'll need to look into why it didn't.
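If you want to confirm what the table metadata currently points at, you can check the stored definition from the query editor. For example, for the table named in your error message (a read-only check, so it's safe to run):

SHOW CREATE TABLE order_items;

The avro.schema.url property in the output should show the 208.* address rather than the hostname, which is what I'd expect given the error.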

 

For now, what I would recommend is dropping the tables from the Impala app using the SQL statements below (this should leave the data files Sqoop created in place and just make Hive / Impala forget about the tables as previously created). Then rerun the CREATE TABLE statements, but using the hostname g5157-cldramaster-01 instead of the IP address to refer to the Avro schema files - there's a sketch of the corrected statement for one table after the DROP statements.

DROP TABLE categories;
DROP TABLE customers;
DROP TABLE departments;
DROP TABLE orders;
DROP TABLE order_items;
DROP TABLE products;
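
As an illustration, if your Exercise 1 statement for order_items matched the tutorial's form, the rerun version would look roughly like this (keep whatever LOCATION path your original statements used - the one below is just the tutorial's default; the change that matters is the host in avro.schema.url):

-- Sketch for one table; repeat for the other tables, changing the table name and the two paths.
CREATE EXTERNAL TABLE order_items
STORED AS AVRO
LOCATION 'hdfs:///user/hive/warehouse/order_items'
TBLPROPERTIES ('avro.schema.url'='hdfs://g5157-cldramaster-01/user/examples/sqoop_import_order_items.avsc');

If the Impala editor complains about missing column definitions for an Avro table, run the CREATE statements from the Hive editor instead (the AvroSerDe will pick the columns up from the schema file), and then run `invalidate metadata;` in Impala before re-running the Exercise 2 query.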

That should get the tables set up the way they should be - but they shouldn't have been wrong in the first place. I'll look into what might've happened...