Member since: 08-20-2015
Posts: 23
Kudos Received: 7
Solutions: 1
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 6202 | 12-10-2015 01:21 PM |
08-03-2016
01:38 PM
We have an interesting situation with the MapReduceIndexerTool where the job sometimes finishes successfully, yet the indexes are not actually loaded into Solr via the live merge. The logs are too verbose to paste into this thread, but at the end the job reports the following. If we rerun it an indeterminate number of times, it eventually works.

```
82774 [pool-4-thread-1] INFO org.apache.solr.hadoop.GoLive - Live merge hdfs://nameservice1/tmp/solredh_admin_user/results/part-00000 into http://mapls188.bsci.bossci.com:8983/solr
...
83073 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Succeeded with job: jobName: org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper, jobId: job_1470234528819_0230
83073 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Success. Done. Program took 83.07273 secs. Goodbye.
```

Here is a snapshot of the current run. You can see that the Solr results are in our temp location.

```
-bash-4.1$ hadoop fs -ls /tmp/solredh_admin_user/results/part-00000/data/index
Found 12 items
-rwxrwxr-x+  3 edh_admin_user supergroup  8797979 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.fdt
-rwxrwxr-x+  3 edh_admin_user supergroup     1379 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.fdx
-rwxrwxr-x+  3 edh_admin_user supergroup     1248 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.fnm
-rwxrwxr-x+  3 edh_admin_user supergroup    44950 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.nvd
-rwxrwxr-x+  3 edh_admin_user supergroup       61 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.nvm
-rwxrwxr-x+  3 edh_admin_user supergroup      350 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0.si
-rwxrwxr-x+  3 edh_admin_user supergroup  2218199 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.doc
-rwxrwxr-x+  3 edh_admin_user supergroup  1356123 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.pos
-rwxrwxr-x+  3 edh_admin_user supergroup  8914976 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.tim
-rwxrwxr-x+  3 edh_admin_user supergroup   113325 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/_0_Lucene41_0.tip
-rwxrwxr-x+  3 edh_admin_user supergroup       53 2016-08-03 20:02 /tmp/solredh_admin_user/results/part-00000/data/index/segments_1
-rwxrwxr-x+  3 edh_admin_user supergroup      131 2016-08-03 20:03 /tmp/solredh_admin_user/results/part-00000/data/index/segments_2
```

But after the job finishes successfully, the only files in the live index directory are segments_1, segments_2, and this odd-looking lock file.

```
-bash-4.1$ hadoop fs -ls /solr/F0116/core_node1/data/index
Found 3 items
-rw-r--r--   3 solr solr   0 2016-08-03 20:01 /solr/F0116/core_node1/data/index/HdfsDirectory@24d078d lockFactory=org.apache.solr.store.hdfs.HdfsLockFactory@3d20687c-write.lock
-rwxr-xr-x   3 solr solr  53 2016-08-03 20:01 /solr/F0116/core_node1/data/index/segments_1
-rwxr-xr-x   3 solr solr  82 2016-08-03 20:03 /solr/F0116/core_node1/data/index/segments_2
```

We have enabled DEBUG on both the MapReduceIndexerTool and the Solr server and have compared successful runs against unsuccessful runs, without any luck identifying why this only works sporadically. Has anyone seen something like this before?
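For context, here is a minimal sketch of how such a job is typically launched with the live-merge option. The jar path, morphline file, input path, ZooKeeper quorum, and collection name are placeholders rather than our exact values.

```bash
# Sketch only: run MapReduceIndexerTool with --go-live so the generated shards
# are merged into the live Solr collection once the MapReduce job completes.
# Paths, hosts, and the collection name are hypothetical placeholders.
hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-*-job.jar \
  org.apache.solr.hadoop.MapReduceIndexerTool \
  --morphline-file morphline.conf \
  --output-dir hdfs://nameservice1/tmp/solredh_admin_user/results \
  --zk-host zkhost1:2181/solr \
  --collection F0116 \
  --go-live \
  hdfs://nameservice1/path/to/input
```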
07-14-2016
12:22 PM
Thanks.
07-08-2016
05:58 AM
Good day. We have an environment that was originally undersized, with a single JBOD mount of 1 TB. When it filled up, we added four more 1 TB mounts on each host. The behavior we now get in CM is that the HDFS 'DataNode Data Directory' health check is red because the single original mount is past the free-space threshold. We don't think rebalancing will help, since we understand the balancer works at the DataNode level rather than at the volume level; at least that seemed to be the behavior the last time we tried it. Is there a way to adjust this, or to modify the monitor so it looks at the aggregate storage rather than a single volume?

```
The following DataNode Data Directory are on filesystems with less than 5.0 GiB of their space free.
/cloudera/data/04/dfs/dn (free: 2.4 GiB (0.24%), capacity: 1,007.8 GiB)

This role's DataNode Data Directory (/cloudera/data/02/dfs/dn, /cloudera/data/05/dfs/dn, /cloudera/data/1/dfs/dn, /cloudera/data/03/dfs/dn) are on a filesystem with more than 10.0 GiB of its space free
```

```
/dev/sdc1  1008G  459G  499G   48%  /cloudera/data/1
/dev/sdb1  1008G  454G  503G   48%  /cloudera/data/02
/dev/sde1  1008G  448G  509G   47%  /cloudera/data/03
/dev/sdf1  1008G  956G  1.3G  100%  /cloudera/data/04
/dev/sdg1  1008G  448G  509G   47%  /cloudera/data/05
```
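In the meantime, here is a rough sketch of how we survey per-volume usage across the DataNodes; the hostnames are placeholders and it assumes passwordless ssh from an admin host.

```bash
# Sketch: report usage of each DataNode data mount across the cluster.
# Hostnames and the mount list are hypothetical placeholders.
for host in dn01 dn02 dn03 dn04; do
  echo "== ${host} =="
  ssh "${host}" 'df -h /cloudera/data/1 /cloudera/data/0[2-5] | tail -n +2'
done
```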
04-21-2016
12:50 PM
Good day. We are looking for options to audit download operations from the Impala or Hive query editors in Hue. Our compliance group is looking for data to track who is moving results out of our Hadoop installation. We are running CDH 5.5 with auditing enabled for Hue, but we can't seem to find an 'operation' in Navigator that indicates a download via Hue. We haven't had auditing enabled for very long, but we actually don't get much for Hue; so far just USER_LOGIN. Looking outside of Navigator, the closest we can get is examining the Hue logs for request entries such as this:

```
GET /impala/download/709/csv
```

Is anything on the Navigator roadmap to help us with this? Or are there other suggestions for how to audit this behavior outside of the Hue logs?
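As a stopgap, this is a sketch of what we pull from the Hue access logs today. The log path is an assumption for a CM-managed host, and the /beeswax path for Hive editor downloads is an assumption as well; the /impala path matches the request entry above.

```bash
# Sketch: list download requests recorded in the Hue access log.
# The log location and the /beeswax path are assumptions; adjust to your deployment.
grep -E 'GET /(impala|beeswax)/download/' /var/log/hue/access.log
```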
03-30-2016
12:54 PM
Good day. We recently lowered our Hue session timeout (i.e., ttl) to 10 minutes per our security team's recommendation. One of the things we've found is that the cookie is not extended during activity: if a user logs in and works for 10 minutes, they will be forced to log in again after 10 minutes, no matter what. Is there any way, now or planned for the future, to modify this behavior so it is 10 minutes of inactivity? I know defining 'inactivity' is hard to do, but I wanted to check and see if anyone had thoughts on this.
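For context, a sketch of how we confirm the ttl the running Hue server actually picked up; the process-directory path is an assumption for a CM-managed deployment.

```bash
# Sketch: show the [[session]] block of the hue.ini the running Hue server was started with.
# The process-directory pattern is an assumption for CM-managed hosts.
grep -A 5 '\[\[session\]\]' /var/run/cloudera-scm-agent/process/*-hue-HUE_SERVER/hue.ini
```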
03-11-2016
06:44 AM
We have an MS SQL Server database that contains what we refer to as extended ASCII characters (internally we also call them special characters). The database collation is SQL_Latin1_General_CP1_CI_AS and the data type is Text. When we pull these data over to Hadoop via Sqoop, we end up with black diamonds with question marks mixed into the data. Below are examples of what the data look like in SQL Server, the Impala/Hive editors in Hue, and the Impala shell; notice the diamonds with question marks mixed into the data on the Hadoop side. What we think is happening is that we're somehow not successfully telling Sqoop that the character set we're pulling in is not UTF-8, or something along those lines. We're not 100% sure, but we've looked for something like the MySQL setting 'characterEncoding=UTF-8' and haven't found anything similar for the MS SQL JDBC connection string. Any advice on things we should look at?

(Screenshots: MS SQL Server, Hue Impala/Hive editors, Impala shell.)
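One thing we are experimenting with, purely as a sketch and not a confirmed fix, is casting the Text column to NVARCHAR in a free-form query so the driver hands Sqoop Unicode data. The connection string, table, and column names below are placeholders.

```bash
# Sketch: pull the Text column as NVARCHAR via a Sqoop free-form query.
# Connection string, table, and column names are hypothetical placeholders.
sqoop import \
  --connect 'jdbc:sqlserver://mssqlhost:1433;databaseName=ourdb' \
  --username our_user -P \
  --query 'SELECT Col1, CAST(Col4 AS NVARCHAR(MAX)) AS Col4 FROM dbo.OurTable WHERE $CONDITIONS' \
  --split-by Col1 \
  --target-dir /tmp/ourtable_utf8
```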
02-23-2016
06:51 AM
Good morning. We are writing a custom HTML page for our search index to present one of the indexed fields as a hyperlink. The users want to just click on the file links we present in the result set, and the only way we could find to do this is to use the HTML widget and wrap the field with an <a href="file://{{doc_link}}">{{doc_path}}</a> entry. This seems to be working OK, but one of the things we lost was the header values. We're trying to get to something like this:

```
Document Number   Document Link
12345             \\thenas\myfile.pdf
67890             \\thenas\myfile.doc
```

However, what we end up getting is the headers repeated for every row:

```
Document Number   Document Link
12345             \\thenas\myfile.pdf
Document Number   Document Link
67890             \\thenas\myfile.doc
Document Number   Document Link
...
```

Does anyone know of a way to include the header values in the result, but have them listed only once at the top? Another thing we thought would be a nice feature is having some control over the HTML rendered in the grid widget; however, that does not look to be a feature today. Just a thought, though, if anyone had feedback on adding that at some point.
02-12-2016
01:05 PM
Thanks!
02-03-2016
12:10 PM
Thanks. Indeed, that is exactly what we are looking for. I realize this may be something you don't know, but any idea whether Impala 2.5.0 might be included in a 5.5.* release, or whether this might be backported to a 2.3 release of Impala with CDH 5.5.*? We're at 5.4.7 right now, and I've been looking for opportunities to get the team to buy off on 5.5, so I'm just checking whether this might be an arrow in my life-cycle management quiver. If you don't know, no worries.
02-03-2016
11:36 AM
Good afternoon. We are using the unix_timestamp() function to convert dates arriving in string format into a timestamp. Unfortunately, the dates we receive in CSV files come in as something like 1/2/2007, where single-digit days and months are not padded with a '0'. For Hive, the missing padding works fine:

```sql
-- hive
select unix_timestamp('01/02/2007', 'MM/dd/yyyy'); -- returns 1167696000
select unix_timestamp('1/2/2007', 'MM/dd/yyyy');   -- returns 1167696000
```

But in Impala, if the value is not padded with '0', we get NULL:

```sql
-- impala
select unix_timestamp('01/02/2007', 'MM/dd/yyyy'); -- returns 1167696000
select unix_timestamp('1/2/2007', 'MM/dd/yyyy');   -- returns null
```

And if we change the format to a single 'd' or 'M', it works for the non-padded values but returns NULL for the two-digit ones:

```sql
-- impala
select unix_timestamp('1/2/2007', 'M/d/yyyy');   -- returns 1167696000
select unix_timestamp('10/20/2007', 'M/d/yyyy'); -- returns null
```

One workaround we use in other areas is to ingest the data and then use Hive to transform it into a new table. However, we currently have a use case where we would like to put Impala directly on top of the CSV file and not do a transformation. We think what we'll end up doing is either ask the creator of the CSV to produce the dates with padded '0's, or just treat them as strings. Before going down that route, we wanted to confirm whether the behavior we are seeing in Impala is expected, or whether we're missing something. Thanks in advance, Mac
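If we do end up preprocessing the files instead, here is a rough sketch of zero-padding the dates before loading. It assumes the dates are comma-delimited fields and that no other field contains slashes; treat it as a starting point rather than a tested fix.

```bash
# Sketch: zero-pad single-digit months/days in M/D/YYYY dates within a CSV.
# Assumes comma-delimited fields and that no other field contains '/'.
sed -E 's#(^|,)([0-9])/#\10\2/#g; s#/([0-9])/#/0\1/#g' input.csv > padded.csv
```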
01-11-2016
12:08 PM
Thanks Ben. That did it! Much appreciated.
01-11-2016
07:52 AM
Hi. We are looking into setting up HDFS snapshots for our 5.4 cluster. Unfortunately, when going into Cloudera Manager (CM) > HDFS > File Browser, we are getting the following errors in our DEV and TEST environments; PROD, surprisingly enough, works fine. While the error messages are slightly different, in both cases the keytab files referenced do not exist when we look for them. Is there anything you can advise us to look at to figure out these errors? We thought about regenerating the hdfs/ourdevhost.ourcompany.com@OURCOMPANY.COM credentials (CM > Administration > Kerberos > Credentials) but thought we'd ask before doing so. We can't remember the exact details, but I think we've had issues with this in the past where regeneration fails when trying to create the Active Directory account again because it already exists. We could be off on this, which is why we wanted to check and get a second opinion.

```
DEV:
Keytab file does not exist.
java.io.IOException: Login failure for hdfs/ourdevhost.ourcompany.com@OURCOMPANY.COM from keytab /tmp/142360606-0/hadoop7884985294202953.keytab
Reload root directory.

TEST:
Keytab file does not exist.
java.io.IOException: Failed on local exception: java.io.IOException: Login failure for hdfs/ourtesthost.ourcompany.com@OURCOMPANY.COM from keytab /tmp/144736658-0/hadoop5666268298612442343.keytab; Host Details : local host is: "ourtesthost.ourcompany.com/10.10.104.167"; destination host is: "ourtesthost.ourcompany.com":8020
```
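Before regenerating anything, this is a sketch of the sanity checks we plan to run on the affected hosts. The keytab path is an assumption for a CM-managed deployment; the principal is taken from the error above.

```bash
# Sketch: verify that an hdfs service keytab exists on the host and that its
# principal can actually log in. The process-directory path is an assumption
# for CM-managed clusters.
KEYTAB=$(ls -t /var/run/cloudera-scm-agent/process/*-hdfs-NAMENODE/hdfs.keytab | head -1)
klist -kt "$KEYTAB"
kinit -kt "$KEYTAB" hdfs/ourdevhost.ourcompany.com@OURCOMPANY.COM && echo "kinit OK"
```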
12-10-2015
01:21 PM
As so often happens, I went out for a walk, came back, and looked at a few other things. Sure enough, I now see that this is how you tune the job. I guess if any good can come from my lack of attention to detail, at least I now have it engraved in my mind.

```
sqoop import -D mapreduce.map.memory.mb=4096 -D mapreduce.map.java.opts=-Xmx3000m ....
```
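For anyone finding this later, here is a sketch of the fuller form; the connection details are placeholders. Note that the generic -D options must come immediately after `import`, before any Sqoop-specific arguments.

```bash
# Sketch: per-job YARN container size and map-task heap for a Sqoop 1 import.
# The -D generic options must precede tool-specific options; connection details
# below are hypothetical placeholders.
sqoop import \
  -D mapreduce.map.memory.mb=4096 \
  -D mapreduce.map.java.opts=-Xmx3000m \
  --connect 'jdbc:sqlserver://mssqlhost:1433;databaseName=ourdb' \
  --username our_user -P \
  --table OurTable \
  --target-dir /tmp/ourtable
```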
12-10-2015
12:55 PM
We have a Sqoop 1 job that is throwing an "Error: Java heap space" message on both our Sqoop driver and the Map/Reduce tasks running under YARN. We were able to increase the Sqoop driver heap by setting HADOOP_HEAPSIZE to 2000 (MB), and that solved the initial issue. It looks like the way the scripts work, you just pass in the number of megabytes and the script turns it into -Xmx with an 'm' appended.

```
export HADOOP_HEAPSIZE=2000
sqoop import ......
```

However, we can't find the correct place to set what we presume are the container memory and actual task heap-size settings. Our cluster is currently configured with the following YARN settings; these are set via Cloudera Manager and stored in mapred-site.xml. We don't want to adjust the cluster-wide settings, as they work fine for 99% of the jobs we run; we just have one problem child that we'd like to tune.

```
mapreduce.map.memory.mb=1024
mapreduce.map.java.opts=-Djava.net.preferIPv4Stack=true -Xmx825955249
```

We have tried the following without any luck. Are there any other suggestions for where we should configure these two settings for Sqoop 1 initiated jobs?

```
export HADOOP_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
export HADOOP_CLIENT_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
export YARN_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
export YARN_CLIENT_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
sqoop import ....
```
12-03-2015
09:15 AM
3 Kudos
This took us a bit to figure out, so I wanted to share in case anyone else runs into the issue. It's probably documented someplace, but we ended up needing to dig through the com.cloudera.impala.hivecommon.core.HiveJDBCConnection code to find it. Using the document below, what we're trying to do is hook our application up to Impala via JDBC using LDAP authentication with SSL/TLS enabled. Here was our original connection string:

```
jdbc:impala://<impalahost>:21050/;AllowSelfSignedCerts=true;AuthMech=4;SSLKeyStore=/keystores/clientkeystore;SSLKeyStorePwd=<password>;UID=<user>;PWD=<password>
```

This threw the error:

```
[Simba][ImpalaJDBCDriver](500204) Error in setting uid and password: null
```

We spent a while trying to figure out why our username and password were wrong, without much luck. So we ended up looking into the HiveJDBCConnection class and noticed that the exception was actually thrown because we were passing AllowSelfSignedCerts=true; the driver expects AllowSelfSignedCerts=1 to turn this feature on. We also found that the chart on the last few pages of the document indicates '5' is the correct AuthMech value when using LDAP (username and password) with SSL; it's actually '4', as shown earlier in the document's examples. Anyway, no action needed; just wanted to share and contribute back.

http://www.cloudera.com/content/www/en-us/documentation/other/connectors/impala-jdbc/2-5-5/Cloudera-JDBC-Driver-for-Impala-Install-Guide-2-5-5.pdf
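Putting the two corrections together, the corrected connection string looks like this; the host, keystore path, and credentials are placeholders.

```bash
# Working form per the findings above; host, keystore path, and credentials are placeholders.
JDBC_URL='jdbc:impala://<impalahost>:21050/;AllowSelfSignedCerts=1;AuthMech=4;SSLKeyStore=/keystores/clientkeystore;SSLKeyStorePwd=<password>;UID=<user>;PWD=<password>'
```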
12-02-2015
01:34 PM
Thanks! Indeed that seemed to take care of it. Pretty slick. You guys did a very good job on making that easy for us. Now to start testing. Thanks again!
12-02-2015
01:10 PM
Good day. We are working on a project to put our impalad nodes behind a load balancer. We are running CM/CDH 5.4.5 and using the instructions found here:

http://www.cloudera.com/content/www/en-us/documentation/enterprise/5-4-x/topics/impala_proxy.html#proxy_kerberos_unique_1

From what we observed, when using CM, after entering our load balancer dns:port (e.g., our.load.balancer.company.com:25003) into the "Impala Daemons Load Balancer" field, CM does most of the hard work of merging keytabs and setting the custom command-line arguments (e.g., --be_principal) for us, which is great. Unfortunately, after adding our load balancer we are running into the error "Role is missing Kerberos keytab." in CM. One thing we were unsure of is whether CM will actually create the principal and keytab for the load balancer; for example, when looking at our CM-managed Kerberos principals, there is no "impala/our.load.balancer.company.com@COMPANY.COM". Any idea whether we need to pre-create and manage impala/our.load.balancer.company.com@COMPANY.COM outside of CM? Or would you expect CM to create that principal for us as part of adding the "Impala Daemons Load Balancer" configuration? Thanks in advance.
11-11-2015
09:55 AM
1 Kudo
This post is for posterity, since we didn't find much on Google and would like to try to give back. In Sqoop we recently hit the errors "Error: java.io.IOException: SQLException in nextKeyValue" and "Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Invalid length parameter passed to the LEFT or SUBSTRING function." when trying to import a specific view from SQL Server. To troubleshoot this, we listed out all the columns (--columns "Col1,Col2,Col3,Col4") and then removed them one by one until the error stopped. That gave us the column Sqoop was having trouble with. We then took that column and ran a simple SELECT Col4 FROM TABLE in our MS SQL query tool, and sure enough, the same "Invalid length parameter" message popped up. It turns out the SQL code creating the view on the MS SQL database had a bug in it, so the error really had nothing to do with Sqoop; it only bubbled up to us because we are apparently the only ones issuing a query against that column. I guess the takeaway for us was: when running into odd SQL errors, remove Sqoop from the equation and make sure the query Sqoop is issuing runs successfully on its own. Hope this helps someone, someday.
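Here is a sketch of the two steps, for anyone wanting to reproduce the approach; the connection string, view, and column names are placeholders.

```bash
# Sketch: narrow down the offending column by trimming --columns, then run the
# suspect column's query directly against SQL Server with sqoop eval.
# Connection string, view, and column names are hypothetical placeholders.
sqoop import \
  --connect 'jdbc:sqlserver://mssqlhost:1433;databaseName=ourdb' \
  --username our_user -P \
  --table OurView \
  --columns "Col1,Col2,Col3" \
  --target-dir /tmp/ourview_test \
  -m 1

sqoop eval \
  --connect 'jdbc:sqlserver://mssqlhost:1433;databaseName=ourdb' \
  --username our_user -P \
  --query 'SELECT TOP 10 Col4 FROM dbo.OurView'
```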
11-04-2015
09:28 AM
Thanks for the reply. We're running CDH 5.4.5, which is on HBase 1.0.0, so maybe we are indeed running into those two issues, unless they were backported into CDH. After enabling DEBUG on the Thrift server, we see the message below in the logs, so it seems we're onto something with the second issue for sure. We are also having issues with the REST interface, which seems to be related. I think we're going to submit a ticket to see if we can get some help from support. I appreciate your help in pointing us in the right direction. All the best, Mac

From the curl command against the REST interface:

```
-bash-4.1$ curl -X GET -i --negotiate -u : -b ~/cookiejar.txt -c ~/cookiejar.txt http://mapls189:20550/version/cluster
HTTP/1.1 401 Authentication required
WWW-Authenticate: Negotiate
Set-Cookie: hadoop.auth=; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly
Content-Type: text/html; charset=iso-8859-1
Cache-Control: must-revalidate,no-cache,no-store
Content-Length: 1408

HTTP/1.1 413 FULL head
Connection: close
```

From the Thrift logs:

```
EXCEPTION
HttpException(413,FULL head,null)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:285)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
```
11-04-2015
07:59 AM
Good day, hope this message finds you well. We enabled HBase a few weeks back but just took the defaults, which don't enable authentication or authorization; we just wanted to do a quick POC. We are now going back and implementing authentication and authorization, as one of the POC projects has PHI data and we want to make sure we protect it. We are using CDH/CM 5.4, have gone through the following two articles, and seem to have everything working for our application. However, we are running into a problem with Hue's HBase Browser.

Authentication: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_sg_hbase_authentication.html
Authorization: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_sg_hbase_authorization.html

In addition to making the HBase changes, we have also gone through this article and made the changes for Hue:

Changes for Hue: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/admin_hue_enable_apps.html

In the HBase Browser we get the error "Api Error:" in a little red pop-up box. Below are the errors we get in Hue and the ones we get on our Thrift server. It seems to boil down to the message 'Authorization header received from the client is empty.' It looks like someone had a similar issue in the comments of this article, but I can't tell exactly what they did to fix it:

http://gethue.com/hbase-browsing-with-doas-impersonation-and-kerberos/

Are there any other things you can think of that we should be looking at? Thanks in advance, Mac

Here is the error message we get in the Hue logs when trying to access the HBase Browser after enabling authentication and authorization in HBase:

```
Nov 4, 1:59:31 AM INFO  access          10.42.63.12 NolandM - "POST /hbase/api/getTableList/HBase HTTP/1.1"
Nov 4, 1:59:31 AM INFO  connectionpool  Resetting dropped connection: mapls189.bsci.bossci.com
Nov 4, 1:59:31 AM ERROR kerberos_       handle_mutual_auth(): Mutual authentication unavailable on 413 response
Nov 4, 1:59:31 AM ERROR thrift_util     Thrift saw exception (this may be expected).
Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/desktop/core/src/desktop/lib/thrift_util.py", line 415, in wrapper
    ret = res(*args, **kwargs)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/apps/hbase/src/hbase/../../gen-py/hbased/Hbase.py", line 53, in decorate
    return func(*args, **kwargs)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/apps/hbase/src/hbase/../../gen-py/hbased/Hbase.py", line 832, in getTableNames
    self.send_getTableNames()
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/apps/hbase/src/hbase/../../gen-py/hbased/Hbase.py", line 840, in send_getTableNames
    self._oprot.trans.flush()
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/build/env/lib/python2.6/site-packages/thrift-0.9.1-py2.6-linux-x86_64.egg/thrift/transport/TTransport.py", line 170, in flush
    self.__trans.flush()
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/desktop/core/src/desktop/lib/thrift_/http_client.py", line 84, in flush
    self._data = self._root.post('', data=data, headers=self._headers)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/desktop/core/src/desktop/lib/rest/resource.py", line 122, in post
    return self.invoke("POST", relpath, params, data, self._make_headers(contenttype, headers))
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/desktop/core/src/desktop/lib/rest/resource.py", line 78, in invoke
    urlencode=self._urlencode)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/desktop/core/src/desktop/lib/rest/http_client.py", line 161, in execute
    raise self._exc_class(ex)
RestException: (error 413)
Nov 4, 1:59:31 AM INFO  thrift_util  Thrift saw exception: (error 413)
Nov 4, 1:59:31 AM INFO  middleware   Processing exception: Api Error: :
Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/handlers/base.py", line 112, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/db/transaction.py", line 371, in inner
    return func(*args, **kwargs)
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/apps/hbase/src/hbase/views.py", line 76, in api_router
    return api_dump(HbaseApi(request.user).query(*url_params))
  File "/opt/cloudera/parcels/CDH-5.4.5-1.cdh5.4.5.p0.7/lib/hue/apps/hbase/src/hbase/api.py", line 54, in query
    raise PopupException(_("Api Error: %s") % e.message)
PopupException: Api Error:
```

And here is the error we see on the Thrift server side:

```
Nov 4, 1:59:31.238 PM WARN  org.apache.hadoop.security.UserGroupInformation
PriviledgedActionException as:HTTP/ourhost@ourdomain.COM (auth:KERBEROS) cause:org.apache.hadoop.hbase.thrift.HttpAuthenticationException: Authorization header received from the client is empty.
Nov 4, 1:59:31.239 PM ERROR org.apache.hadoop.hbase.thrift.ThriftHttpServlet
Failed to perform authentication
Nov 4, 1:59:31.239 PM ERROR org.apache.hadoop.hbase.thrift.ThriftHttpServlet
Kerberos Authentication failed
org.apache.hadoop.hbase.thrift.HttpAuthenticationException: java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:139)
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:86)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:767)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1684)
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet.doKerberosAuth(ThriftHttpServlet.java:134)
    ... 16 more
Caused by: org.apache.hadoop.hbase.thrift.HttpAuthenticationException: Authorization header received from the client is empty.
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet$HttpKerberosServerAction.getAuthHeader(ThriftHttpServlet.java:212)
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:176)
    at org.apache.hadoop.hbase.thrift.ThriftHttpServlet$HttpKerberosServerAction.run(ThriftHttpServlet.java:144)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    ... 17 more
```
08-27-2015
08:26 AM
1 Kudo
Thanks for the response and for checking back in. I had to work on some other things for the client, so I'm finally getting back to this. We are running 5.4.3, so I used the following document under "Hue as an SSL Client", but unfortunately we're getting the same behavior.

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_sg_ssl_hue.html

I admittedly didn't quite follow the process for exporting the service JKS keystores for non-Java services (Impala in our case). While trying to work that out, I noticed a section in Impala's SSL configuration called "SSL/TLS Certificate for Clients". This references a file /opt/cloudera/security/x509/impala.cer that exists on each of our hosts running the impalad process. The explanation is: "Local path to the X509 certificate that will identify the Impala daemon to clients during SSL/TLS connections. This file must be in PEM format."

Since it seemed to indicate it was for 'clients' and was in PEM format (I double-checked), I grabbed this file, moved it to the Hue server, set the permissions so the Hue user could read it, set REQUESTS_CA_BUNDLE=/tmp/hue-cert/impala.cer in the "Hue Service Environment Advanced Configuration Snippet (Safety Valve)" section, and restarted Hue. I put it in /tmp because I'm not an elevated user on the system, so I had to pick a spot where I could put it; we'd have our Unix team do all this and put it in a better location once we get it figured out, of course. Unfortunately, this didn't fix the issue.

Any ideas, or am I possibly doing something wrong? I did notice a section in Impala called "SSL/TLS Private Key for Clients" but didn't quite understand whether that is for clients, or is the private key Impala uses to unseal what a client encrypts with the public key. Thanks in advance for your help.
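One check that might help isolate whether the certificate file is the right trust anchor for what Impala actually presents; this is only a sketch, and the impalad hostname and port are placeholders for one of our nodes.

```bash
# Sketch: confirm that the copied PEM file validates the certificate the impalad
# presents on its HiveServer2 port. Hostname and port are placeholders.
openssl s_client -connect impalad-host.company.com:21050 -CAfile /tmp/hue-cert/impala.cer </dev/null \
  | grep -E 'Verify return code|subject=|issuer='
```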
08-21-2015
11:12 AM
2 Kudos
Hi. I'm posting this because I didn't find a lot of help on Google, so maybe it will help others who run across it. It could very well be that this is fixed in a future release of Hive, so this might end up being dated at some point, if it isn't already; but in any case, I wanted to share.

We are running Hive 0.13.1-cdh5.3.2. We have some very large schemas we are ingesting into Hive using the Avro file format. On the larger ones, we were getting the following error when trying to create a table:

```
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:javax.jdo.JDODataStoreException: Put request failed : INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES (?,?,?)
NestedThrowables:
org.datanucleus.store.rdbms.exceptions.MappedDatastoreException: INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES (?,?,?) ) (state=08S01,code=1)
```

There wasn't much out there, but we did stumble upon this JIRA, and while I can't tell 100% whether it's the same issue we have, it at least got us pointed in the right direction:

https://issues.cloudera.org/browse/KITE-469

In any case, when using the avro.schema.literal name/value pair in Hive's TBLPROPERTIES, we get this error when the Avro schema is larger than 4K; in other words, in the first example below, if the literal is more than 4K, you'll get the error. The fix on our end was simply to make the Avro schema a separate file in HDFS and reference it via URL, as in the second example. Again, no action needed; just posting for reference.

```sql
-- errors when the schema literal exceeds 4K
CREATE EXTERNAL TABLE TEST_TABLE_1 (
  COL1 STRING
)
STORED AS AVRO
LOCATION '/archive/TEST_TABLE_1'
TBLPROPERTIES ('avro.schema.literal'='{<4K of data>}');

-- works
CREATE EXTERNAL TABLE TEST_TABLE_1 (
  COL1 STRING
)
STORED AS AVRO
LOCATION '/archive/TEST_TABLE_1'
TBLPROPERTIES ('avro.schema.url'='/archive/avro_schemas/TEST_TABLE_1.avsc');
```
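For completeness, a sketch of staging the schema file that the second statement references; the paths mirror the example above, so adjust to your layout.

```bash
# Sketch: stage the Avro schema in HDFS so avro.schema.url can reference it.
# Paths mirror the example above; adjust to your layout.
hadoop fs -mkdir -p /archive/avro_schemas
hadoop fs -put -f TEST_TABLE_1.avsc /archive/avro_schemas/TEST_TABLE_1.avsc
```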
08-20-2015
09:38 AM
Good day, hope this message finds you well. We recently updated our TEST environment to CDH 5.4.3 and, as part of that, enabled SSL for Impala. After doing so, we noticed that when connecting via impala-shell we had to pass the --ssl parameter; if we didn't, Impala would show the following in our logs:

```
TThreadPoolServer: TServerTransport died on accept: SSL_accept: wrong version number
```

We are now going through Hue to test the Impala query editor and are facing a similar issue. When we click on Query Editor > Impala, we get the infinite spinning wheel on the database load, and the Impala logs show the same message as when we didn't pass --ssl to impala-shell:

```
TThreadPoolServer: TServerTransport died on accept: SSL_accept: wrong version number
```

I noticed there was a 5.4.4 release that addressed an issue where Hue wouldn't start when SSL was enabled, but I'm not sure whether that is the same issue here. We do plan on going to 5.4.4, but I was trying to get some high-level verification done in advance. Has anyone seen something like this before? My thought is that we need to somehow indicate to Hue that it, too, needs to talk to Impala via SSL, the equivalent of the --ssl parameter we used for impala-shell, but I'm not seeing an option for that in Cloudera Manager's Hue configuration section. Thanks for your thoughts in advance, Mac