Created on 01-03-2015 08:47 AM - edited 09-16-2022 02:17 AM
Hello,
I started the GoGrid cluster tutorial.
First, the MySQL grants did not work for the Sqoop tutorial; the host specification for user 'retail_dba'@'%' did not seem to take effect, so I explicitly added the IP addresses of the agents. Sqoop then worked fine and the data loaded into HDFS.
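For reference, this is the sort of grant I mean (a sketch only; the agent IP, database name, and password are illustrative placeholders to adjust for your own setup):
-- run in the MySQL console on the master, once per agent IP
GRANT ALL PRIVILEGES ON retail_db.* TO 'retail_dba'@'10.105.115.3' IDENTIFIED BY 'cloudera';
FLUSH PRIVILEGES;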
Then I loaded the Avro metadata into Hive, which seemed to work fine.
When running the Impala tutorial, the tables show up after the "invalidate metadata" command, but any query returns exceptions:
AnalysisException: Failed to load metadata for table: default.categories
CAUSED BY: TableLoadingException: Problem reading Avro schema at: hdfs://216......... /user/examples/sqoop_import_categories.avsc
CAUSED BY: ConnectException Call From f6129-cldramaster-01/10.115........ to 216..........:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more detail see: ......... [etc etc etc]
I tried running impala-shell, but it produced the same error:
Caused by: com.cloudera.impala.catalog.TableLoadingException: Problem reading Avro schema at: hdfs://216.121.116.82/user/examples/sqoop_import_categories.avsc CAUSED BY: ConnectException: Call From f6129-cldramaster-01/10.105.115.2 to 216.121.116.82:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused CAUSED BY: ConnectException: Connection refused
Please help and point me in the right direction, or suggest a solution to the above problem!
(I'm curious whether the DNS services are working correctly.)
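A quick way to check resolution from a node, assuming standard Linux tooling (substitute your own master hostname):
getent hosts f6129-cldramaster-01
If that returns the public 216.* address rather than a private one, it would be consistent with the connection refusals above.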
Created 01-06-2015 06:53 AM
Thanks for letting us know about this - this is an error in a recent update to the tutorial. Those commands should be using the hostname rather than the IP address, so I'd suggest trying 'f6129-cldramaster-01' for the NameNode instead of 216.121.116.82. We'll change the future MySQL setup to allow access via IP address as well, but it seems you found a workaround for that. The reason for the second failure is that the command is trying to use the public interface instead of the private interface.
GoGrid machines typically have two network interfaces: one that is publicly accessible, and one that is private (but has higher performance). 216.* points to one of the public IP addresses, but some of the services that are not intended to be accessed directly listen only on the private interface (Hue and CM listen on the public interfaces as well). So again, using the hostname there should work, or the IP address of the internal interface (eth1). We'll get the tutorial content and MySQL config updated shortly. Please post back if you run into additional issues with the current tutorial and I'll try to provide workarounds.
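If the Hive tables already exist, one way to repoint the schema URL without dropping them might be an ALTER statement like this (a sketch using this thread's hostname; ALTER TABLE ... SET TBLPROPERTIES is standard HiveQL):
ALTER TABLE categories SET TBLPROPERTIES ('avro.schema.url'='hdfs://f6129-cldramaster-01/user/examples/sqoop_import_categories.avsc');
-- repeat for customers, departments, orders, order_items, and products
followed by another "invalidate metadata" in impala-shell so Impala picks up the change.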
Created 01-15-2015 12:31 PM
I'm getting this same error using the hostname instead of the IP address. As I'm new to this, I'm stuck at the moment trying to get through the basic tutorial.
Your query has the following error(s):
AnalysisException: Failed to load metadata for table: default.order_items
CAUSED BY: TableLoadingException: Problem reading Avro schema at: hdfs://quickstart.cloudera/user/examples/sqoop_import_order_items.avsc
CAUSED BY: FileNotFoundException: File does not exist: /user/examples/sqoop_import_order_items.avsc
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1878)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1819)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1771)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:527)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:85)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:356)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
CAUSED BY: RemoteException: File does not exist: /user/examples/sqoop_import_order_items.avsc
[same stack trace as above]
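Since this one is a FileNotFoundException rather than a ConnectException, it may be worth confirming the schema files were actually uploaded before recreating any tables, e.g.:
hadoop fs -ls /user/examples/
If the .avsc files are not listed, re-running the tutorial step that copies them into HDFS should fix it.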
Created 03-24-2015 02:47 PM
Sean,
This might be a dumb question, but how do you access Hue from the hostname? When I try entering it into a browser, such as:
http://g2316-cldramaster-01:8888/
It doesn't find the server. I'm assuming my DNS doesn't know how to resolve the name.
Any assistance would be very much appreciated.
Tom
Created 03-24-2015 03:46 PM
I figured that one out - I just had to add the host to the hosts file in Windows. However, I'm still unable to run any queries in Impala...
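For anyone else stuck on that step: on Windows the file is C:\Windows\System32\drivers\etc\hosts, and the entry pairs the cluster's public IP with its hostname, something like (the address and hostname here are illustrative; use the ones from your own invitation email):
216.121.116.82  g2316-cldramaster-01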
Created 03-25-2015 08:17 AM
Thanks - got that!
For all who have a similar issue, please do the following:
DROP TABLE IF EXISTS categories;
DROP TABLE IF EXISTS customers;
DROP TABLE IF EXISTS departments;
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS order_items;
DROP TABLE IF EXISTS products;
...then exit your Hive shell and run 'hostname -i'.
In the CREATE TABLE statements, replace the IP address with the one returned by that command.
Everything should be fine after that.
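For example, using addresses that appeared earlier in this thread (yours will differ):
hostname -i
# returns the master's private address, e.g. 10.105.115.2
so the table property becomes:
'avro.schema.url'='hdfs://10.105.115.2/user/examples/sqoop_import_categories.avsc'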
Created 05-12-2015 06:28 AM
Hi Sean,
I don't quite understand what you are suggesting. If I use the hostname or the internal IP, of course I can't access the server from outside (since both are known only inside the subnet). Or do you suggest accessing Hue via the internal IP through some sort of SSH tunnel?
Thanks in advance for your answer.
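In case a tunnel is the route you want, a minimal sketch, assuming you have SSH access to the master's public IP (the user and address are placeholders):
ssh -L 8888:localhost:8888 root@216.121.116.82
then browse to http://localhost:8888/ to reach Hue through the tunnel.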
Created on 05-28-2015 08:26 AM - edited 05-28-2015 08:27 AM
The original error has nothing to do with connecting to Hue. It was an error in the interaction between Impala and the other daemons it needs to complete the query. Just use the link in the email or on the guidance page to connect to Hue; the public IP should work just fine. If it doesn't, you are experiencing a separate issue and would need to provide more information.
Created 05-31-2015 07:58 AM
Hi,
I've solved the issue just by executing the following commands in the Hive console:
DROP TABLE IF EXISTS categories;
DROP TABLE IF EXISTS customers;
DROP TABLE IF EXISTS departments;
DROP TABLE IF EXISTS orders;
DROP TABLE IF EXISTS order_items;
DROP TABLE IF EXISTS products;
CREATE EXTERNAL TABLE categories
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///user/hive/warehouse/categories'
TBLPROPERTIES ('avro.schema.url'='hdfs://if3f8-cldramaster-01/user/examples/sqoop_import_categories.avsc');
CREATE EXTERNAL TABLE customers
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///user/hive/warehouse/customers'
TBLPROPERTIES ('avro.schema.url'='hdfs://if3f8-cldramaster-01/user/examples/sqoop_import_customers.avsc');
CREATE EXTERNAL TABLE departments
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///user/hive/warehouse/departments'
TBLPROPERTIES ('avro.schema.url'='hdfs://if3f8-cldramaster-01/user/examples/sqoop_import_departments.avsc');
CREATE EXTERNAL TABLE orders
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///user/hive/warehouse/orders'
TBLPROPERTIES ('avro.schema.url'='hdfs://if3f8-cldramaster-01/user/examples/sqoop_import_orders.avsc');
CREATE EXTERNAL TABLE order_items
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///user/hive/warehouse/order_items'
TBLPROPERTIES ('avro.schema.url'='hdfs://if3f8-cldramaster-01/user/examples/sqoop_import_order_items.avsc');
CREATE EXTERNAL TABLE products
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION 'hdfs:///user/hive/warehouse/products'
TBLPROPERTIES ('avro.schema.url'='hdfs://if3f8-cldramaster-01/user/examples/sqoop_import_products.avsc');
Thanks for all the suggestions from the community!
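One follow-up worth noting, implied earlier in the thread: after recreating the tables in Hive, Impala's catalog needs a refresh before the tables are queryable again, e.g. in impala-shell:
invalidate metadata;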