Support Questions

Find answers, ask questions, and share your expertise

Hue 3.11 Hive Query editor - Not retrying thrift call GetSchemas due to socket timeout

avatar
Contributor

Hue 3.11 Hive Tez query editor database view refreshing manually times out (after 2 mins) while query is running. Also metastore manager does the same when Hive Tez session is running as well. The following warning were found right after the timeout of database view refresh. Once the query completes though refreshing the hive database/ metastore manager works just fine.

 

warning:

[Mon Nov 14 15:16:01 2016]  [14/Nov/2016 15:16:01 +0000] thrift_util  WARNING  Not retrying thrift call GetSchemas due to socket timeout
[Mon Nov 14 15:16:01 2016]  [14/Nov/2016 15:16:01 +0000] thrift_util  INFO     Thrift saw a socket error: timed out
[Mon Nov 14 15:16:01 2016]  [14/Nov/2016 15:16:01 +0000] api          WARNING  Autocomplete data fetching error: timed out (code THRIFTSOCKET): None

1 ACCEPTED SOLUTION

avatar
Rising Star

Have u tried this: hive.server2.parallel.ops.in.session=true

 

 

https://github.com/cloudera/hue/issues/515

 

Regards,

Andrey

View solution in original post

11 REPLIES 11

avatar
Super Guru
So far we don't know what changed in Hive in HDP 2.5

avatar
Contributor

Thanks Romainr! I will try few more configuration changes and let you know if I have any luck.

avatar
New Contributor

Running into this same issue, HDP 2.5.3 with Hue 3.12. Symptoms are that when viewing a table in the metastore manager the loading indicator spins and eventually it times out. After this happens both the query editor and the metastore manager hang trying to do anything. Going into the hive query editor and clicking to recreate/close the hive session temporarily resolves the issue. If I view metadata in the hive query editor before going into the metastore manager, then the table will load without hanging in the metastore manager.

 

 

Hue logs below:

 

[30/Mar/2017 07:52:46 -0500] access WARNING 10.4.201.231 hue - "GET /logs HTTP/1.1"
[30/Mar/2017 07:52:45 -0500] resource DEBUG GET Got response: {"apps":null}
[30/Mar/2017 07:52:45 -0500] connectionpool DEBUG "localhost:8088 GET /ws/v1/cluster/apps?startedTimeBegin=1490273565000&limit=1000&user=hue&finalStatus=UNDEFINED HTTP/1.1" 200 None
[30/Mar/2017 07:52:45 -0500] access INFO 10.4.201.231 hue - "POST /jobbrowser/jobs/ HTTP/1.1"
[30/Mar/2017 07:52:38 -0500] thrift_util INFO Thrift saw a socket error: timed out
[30/Mar/2017 07:52:38 -0500] thrift_util WARNING Not retrying thrift call CloseSession due to socket timeout
[30/Mar/2017 07:52:29 -0500] resource DEBUG GET Got response: {"from":0,"total":0,"sessions":[]}
[30/Mar/2017 07:52:29 -0500] connectionpool DEBUG "localhost:8998 GET /sessions HTTP/1.1" 200 34
[30/Mar/2017 07:52:29 -0500] connectionpool INFO Starting new HTTP connection (1): localhost:8998
[30/Mar/2017 07:52:29 -0500] resource DEBUG GET Got response: {"systemMode":"NORMAL"}
[30/Mar/2017 07:52:29 -0500] connectionpool DEBUG "localhost:11000 GET /oozie/v1/admin/status?timezone=EST&user.name=hue&doAs=hue HTTP/1.1" 200 23
[30/Mar/2017 07:52:29 -0500] resource DEBUG GET Got response: {"systemMode":"NORMAL"}
[30/Mar/2017 07:52:29 -0500] connectionpool DEBUG "localhost:11000 GET /oozie/v1/admin/status?timezone=EST&user.name=hue&doAs=hue HTTP/1.1" 200 23
[30/Mar/2017 07:52:29 -0500] connectionpool INFO Resetting dropped connection: localhost 11000
[30/Mar/2017 07:52:29 -0500] thrift_util DEBUG Thrift call hbased.Hbase.Client.isTableEnabled returned in 3ms: True
[30/Mar/2017 07:52:29 -0500] thrift_util DEBUG Thrift call: hbased.Hbase.Client.isTableEnabled(args=('atlas_titan',), kwargs={'doas': u'hue'})
[30/Mar/2017 07:52:29 -0500] thrift_util DEBUG Thrift call hbased.Hbase.Client.isTableEnabled returned in 13ms: True
[30/Mar/2017 07:52:29 -0500] thrift_util DEBUG Thrift call: hbased.Hbase.Client.isTableEnabled(args=('ATLAS_ENTITY_AUDIT_EVENTS',), kwargs={'doas': u'hue'})
[30/Mar/2017 07:52:29 -0500] thrift_util DEBUG Thrift call hbased.Hbase.Client.getTableNames returned in 13ms: ['ATLAS_ENTITY_AUDIT_EVENTS', 'atlas_titan']
[30/Mar/2017 07:52:29 -0500] thrift_util DEBUG Thrift call: hbased.Hbase.Client.getTableNames(args=(), kwargs={'doas': u'hue'})
[30/Mar/2017 07:52:29 -0500] api ERROR failed to load the HBase clusters Traceback (most recent call last): File "/usr/local/hue/apps/hbase/src/hbase/api.py", line 64, in getClusters full_config = json.loads(conf.HBASE_CLUSTERS.get().replace("'", "\"")) File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads return _default_decoder.decode(s) File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode raise ValueError("No JSON object could be decoded") ValueError: No JSON object could be decoded
[30/Mar/2017 07:52:29 -0500] api ERROR failed to load the HBase clusters Traceback (most recent call last): File "/usr/local/hue/apps/hbase/src/hbase/api.py", line 64, in getClusters full_config = json.loads(conf.HBASE_CLUSTERS.get().replace("'", "\"")) File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads return _default_decoder.decode(s) File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode raise ValueError("No JSON object could be decoded") ValueError: No JSON object could be decoded
[30/Mar/2017 07:52:29 -0500] api ERROR failed to load the HBase clusters Traceback (most recent call last): File "/usr/local/hue/apps/hbase/src/hbase/api.py", line 64, in getClusters full_config = json.loads(conf.HBASE_CLUSTERS.get().replace("'", "\"")) File "/usr/lib64/python2.7/json/__init__.py", line 338, in loads return _default_decoder.decode(s) File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode obj, end = self.raw_decode(s, idx=_w(s, 0).end()) File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode raise ValueError("No JSON object could be decoded") ValueError: No JSON object could be decoded
[30/Mar/2017 07:52:29 -0500] resource DEBUG GET Got response: {"FileStatus":{"accessTime":0,"blockSize":0,"childrenNum":5,"fileId":17241,"group":"hdfs","length":0,"modificationTime":1490726683375,"owner":"hive","pathSuffix":"","permission":"777","replication":0,"storagePolicy":0,"type":"DIRECTORY"}}
[30/Mar/2017 07:52:29 -0500] connectionpool DEBUG "hadoopm.mms.ams.local:50070 GET /webhdfs/v1/apps/hive/warehouse?op=GETFILESTATUS&user.name=hue&doas=hue HTTP/1.1" 200 None
[30/Mar/2017 07:52:29 -0500] conf ERROR The application won't work without a running HiveServer2. Traceback (most recent call last): File "/usr/local/hue/apps/beeswax/src/beeswax/conf.py", line 233, in config_validator handle = server.execute_and_wait(query, timeout_sec=10.0) File "/usr/local/hue/apps/beeswax/src/beeswax/server/dbms.py", line 693, in execute_and_wait handle = self.client.query(query) File "/usr/local/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 1123, in query return self._client.execute_async_query(query, statement, withMultipleSession=withMultipleSession) File "/usr/local/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 852, in execute_async_query withMultipleSession=withMultipleSession) File "/usr/local/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 870, in execute_async_statement (res, session) = self.call_return_result_and_session(self._client.ExecuteStatement, req, withMultipleSession=withMultipleSession) File "/usr/local/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 696, in call_return_result_and_session res = fn(req) File "/usr/local/hue/desktop/core/src/desktop/lib/thrift_util.py", line 383, in wrapper raise StructuredException('THRIFTSOCKET', str(e), data=None, error_code=502) StructuredException: timed out (code THRIFTSOCKET): None
[30/Mar/2017 07:52:29 -0500] thrift_util INFO Thrift saw a socket error: timed out
[30/Mar/2017 07:52:29 -0500] thrift_util WARNING Not retrying thrift call ExecuteStatement due to socket timeout
[30/Mar/2017 07:52:22 -0500] resource DEBUG GET Got response: {"apps":null}
[30/Mar/2017 07:52:22 -0500] connectionpool DEBUG "localhost:8088 GET /ws/v1/cluster/apps?startedTimeBegin=1490273542000&limit=1000&user=hue&finalStatus=UNDEFINED HTTP/1.1" 200 None
[30/Mar/2017 07:52:22 -0500] access INFO 10.4.201.231 hue - "POST /jobbrowser/jobs/ HTTP/1.1"
[30/Mar/2017 07:52:15 -0500] resource DEBUG GET Got response: {"apps":null}
[30/Mar/2017 07:52:15 -0500] connectionpool DEBUG "localhost:8088 GET /ws/v1/cluster/apps?startedTimeBegin=1490273535000&limit=1000&user=hue&finalStatus=UNDEFINED HTTP/1.1" 200 None
[30/Mar/2017 07:52:15 -0500] access INFO 10.4.201.231 hue - "POST /jobbrowser/jobs/ HTTP/1.1"
[30/Mar/2017 07:51:51 -0500] resource DEBUG GET Got response: {"apps":null}
[30/Mar/2017 07:51:51 -0500] connectionpool DEBUG "localhost:8088 GET /ws/v1/cluster/apps?startedTimeBegin=1490273511000&limit=1000&user=hue&finalStatus=UNDEFINED HTTP/1.1" 200 None
[30/Mar/2017 07:51:51 -0500] access INFO 10.4.201.231 hue - "POST /jobbrowser/jobs/ HTTP/1.1"
[30/Mar/2017 07:51:45 -0500] resource DEBUG GET Got response: {"apps":null}
[30/Mar/2017 07:51:45 -0500] connectionpool DEBUG "localhost:8088 GET /ws/v1/cluster/apps?startedTimeBegin=1490273505000&limit=1000&user=hue&finalStatus=UNDEFINED HTTP/1.1" 200 None
[30/Mar/2017 07:51:45 -0500] access INFO 10.4.201.231 hue - "POST /jobbrowser/jobs/ HTTP/1.1"
[30/Mar/2017 07:51:20 -0500] resource DEBUG GET Got response: {"apps":null}
[30/Mar/2017 07:51:20 -0500] connectionpool DEBUG "localhost:8088 GET /ws/v1/cluster/apps?startedTimeBegin=1490273480000&limit=1000&user=hue&finalStatus=UNDEFINED HTTP/1.1" 200 None
[30/Mar/2017 07:51:20 -0500] access INFO 10.4.201.231 hue - "POST /jobbrowser/jobs/ HTTP/1.1"
[30/Mar/2017 07:51:15 -0500] resource DEBUG GET Got response: {"apps":null}
[30/Mar/2017 07:51:15 -0500] connectionpool DEBUG "localhost:8088 GET /ws/v1/cluster/apps?startedTimeBegin=1490273475000&limit=1000&user=hue&finalStatus=UNDEFINED HTTP/1.1" 200 None
[30/Mar/2017 07:51:15 -0500] access INFO 10.4.201.231 hue - "POST /jobbrowser/jobs/ HTTP/1.1"
[30/Mar/2017 07:51:15 -0500] access INFO 10.4.201.231 hue - "GET /desktop/get_debug_level HTTP/1.1"
[30/Mar/2017 07:51:14 -0500] decorators INFO args: (True,)
[30/Mar/2017 07:51:14 -0500] decorators INFO AXES: Calling decorated function: dt_login
[30/Mar/2017 07:51:13 -0500] access WARNING 10.4.201.231 hue - "GET /logs HTTP/1.1"
[30/Mar/2017 07:51:01 -0500] thrift_util DEBUG Thrift call: <class 'TCLIService.TCLIService.Client'>.GetTables(args=(TGetTablesReq(schemaName=u'staging', sessionHandle=TSessionHandle(sessionId=THandleIdentifier(secret=d2427935cb445b52:a5db6d5d4cb2d09e, guid=6745a59887960199:9580bba32635579c)), tableName=u'policy', tableTypes=None, catalogName=None),), kwargs={})
[30/Mar/2017 07:51:01 -0500] dbms DEBUG Query Server: {'server_name': 'beeswax', 'transport_mode': 'socket', 'server_host': 'hadoopm2.mms.ams.local', 'server_port': 10000, 'auth_password_used': False, 'http_url': u'http://hadoopm2.mms.ams.local:10001/cliservice', 'auth_username': 'hue', 'principal': None}
[30/Mar/2017 07:51:01 -0500] access INFO 10.4.201.231 hue - "GET /beeswax/api/table/staging/policy HTTP/1.1"
[30/Mar/2017 07:51:01 -0500] thrift_util DEBUG Thrift call: <class 'TCLIService.TCLIService.Client'>.GetTables(args=(TGetTablesReq(schemaName=u'staging', sessionHandle=TSessionHandle(sessionId=THandleIdentifier(secret=d2427935cb445b52:a5db6d5d4cb2d09e, guid=6745a59887960199:9580bba32635579c)), tableName=u'policy', tableTypes=None, catalogName=None),), kwargs={})
[30/Mar/2017 07:51:01 -0500] dbms DEBUG Query Server: {'server_name': 'beeswax', 'transport_mode': 'socket', 'server_host': 'hadoopm2.mms.ams.local', 'server_port': 10000, 'auth_password_used': False, 'http_url': u'http://hadoopm2.mms.ams.local:10001/cliservice', 'auth_username': 'hue', 'principal': None}
[30/Mar/2017 07:51:01 -0500] access INFO 10.4.201.231 hue - "POST /notebook/api/autocomplete/staging/policy HTTP/1.1"

 

Did you figure out any settings or anything to resolve this issue? Is this specific to certain versions of Hue?

avatar
Contributor

Hi jsuffolk,

 

Unfortunately no luck so far, I even upgraded to Hue 3.12 and still didn't resolve the issues. It's good to know it's not working on HDP 2.5.3. If you have any luck please let me know. 

 

Thanks.

avatar
New Contributor

I'm happy to report that I solved the issue by comparing the code used for query editor (notebook) to the code used for the metastore manager. The issue is in the describe_table function of /apps/metastore/src/metastore/views.py. Apparently it's missing the code for setting the do_as user, adding this made all the difference. So change the first few lines of this function to the following, recompile hue and the issue will be resolved.

 

This is the corrected code:

 

I'm not sure what happened, I thought this was working but it's clear now that the problem is still alive and well...

 

def describe_table(request, database, table):
  app_name = get_app_name(request)
  query_server = get_query_server_config(app_name)
  do_as = request.user
  if (request.user.is_superuser or request.user.has_hue_permission(action="impersonate", app="security")) and 'doas' in request.GET:
    do_as = User.objects.get(username=request.GET.get('doas'))
  db = dbms.get(do_as, query_server)

 

avatar
New Contributor

After some more investigation found the true problem and solution. It boils down to a concurrency issue: when multiple thrift calls are submitted concurrently the thrift session hangs until the session is reset. I implemented a quick and dirty solution to get HUE useable, but it isn't really a good solution because it doesn't support concurrent users. The solution I'm using right now is putting a thread lock around the use of the thrift session. I suspect a better solution would be to acquire a new session every time or somehow figure out a way to have concurrent sessions per user. If anybody finds a way to do this, let me know.

 

Anyway, the hacky solution I currently have is in hive_server2_lib.py:

 

import threading

lock = threading.Lock()

And then in the def call:

 

with (lock):
  if hasattr(req, 'sessionHandle') and req.sessionHandle is None:
    req.sessionHandle = session.get_handle()
  
  res = fn(req)

 

 

avatar
Contributor

Hi Jsuffolk,

 

That's great, which files you modified and the line number? I see the two files:

 

/usr/share/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py

/usr/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py

 

Thanks.

avatar
Rising Star

Have u tried this: hive.server2.parallel.ops.in.session=true

 

 

https://github.com/cloudera/hue/issues/515

 

Regards,

Andrey

avatar
Contributor

Hi Markovich,

 

That fixed it! Thank you so much!!