Hive aggregate query failing for External table
Labels: Apache Hive
Created ‎02-17-2016 12:35 PM
I have a Hadoop cluster (HDP 2.2) set up in a Eucalyptus environment. I created an external table in Hive (0.14) using the query below:
CREATE EXTERNAL TABLE tempbatting (col_value STRING) LOCATION 's3n://hive-bucket/';
I'm using a custom S3 location, so I set the following jets3t properties in the Hive configuration directory:
set s3service.https-only = true;
set s3service.s3-endpoint = s3-customlocation.net;
set s3service.s3-endpoint-http-port = 80;
set s3service.s3-endpoint-https-port = 443;
set s3service.disable-dns-buckets = true;
set s3service.enable-storage-classes = false;
Simple SELECT queries on the table run successfully, but aggregate queries fail. The logs are below:
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to hive-bucket.s3.amazonaws.com:443 timed out
    at org.apache.http.conn.ssl.SSLSocketFactory.connectSocket(SSLSocketFactory.java:416)
From the logs, the MapReduce job appears to be contacting Amazon S3 (hive-bucket.s3.amazonaws.com) rather than my custom endpoint. I tried the Hive set command (set fs.s3n.endpoint=s3-customlocation.net), but it did not seem to work. Is there a way to specify a custom endpoint?
Created ‎02-22-2016 12:58 PM
Created ‎02-26-2016 06:49 AM
Although I have not yet upgraded to Hadoop 2.7, I made the configuration changes for s3a as described in the documentation. On executing the Hive create query, I got the exception below:
FAILED: AmazonClientException Unable to execute HTTP request: Connect to hive-bucket.s3.amazonaws.com:443 timed out
Created ‎03-22-2016 06:00 AM
I have now upgraded to Hadoop 2.7. After making the configuration changes for s3a, the queries execute successfully. Thank you.
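For readers looking for the shape of such a change, here is a minimal sketch of an s3a setup in core-site.xml. It is not the poster's actual configuration, which the thread does not show. The property names are the ones documented for Hadoop's s3a connector; the endpoint value is taken from the question above, and the credential values are placeholders.

```xml
<!-- core-site.xml fragment: a sketch only; all values are assumptions -->

<!-- Custom S3-compatible endpoint (hostname taken from the question) -->
<property>
  <name>fs.s3a.endpoint</name>
  <value>s3-customlocation.net</value>
</property>

<!-- Use HTTPS when talking to the endpoint -->
<property>
  <name>fs.s3a.connection.ssl.enabled</name>
  <value>true</value>
</property>

<!-- Placeholder credentials; replace with real keys -->
<property>
  <name>fs.s3a.access.key</name>
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```

With settings along these lines, the table location would presumably use the s3a scheme instead of s3n, for example LOCATION 's3a://hive-bucket/'. Services that read core-site.xml generally need to pick up the new configuration before it takes effect.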
