Member since
07-08-2018
4
Posts
0
Kudos Received
0
Solutions
04-20-2017
02:48 AM
Thanks Ed! Appreciate your help. Ended up using the following which is almost similar to yours: SELECT REGEXP_EXTRACT((get_json_object(line,'$.result._raw')),'X_Requested_With=(\\w.+?)\\s',1)as X_INF_RequestID FROM tablename;
... View more
04-18-2017
05:57 PM
Part of my JSON File: X_INF_RequestID=ashg7665hjh-as09-0ahgshd5767@ER X_Requested_With=XMLHttpRequest Little background of data: Dataset used: 1.5GB (189276 records) Input HIVE table: JSON format with gzip compressed Partitioned: Yes, it is partitioned and 21 partitions have been created for the above data. I have 2 concerns here: 1. I'm using get_json_object and regexp_extract to extract the data for X_INF_RequestID and X_Requested_With.
When I run the select query on the above dataset, the X_INF_RequestID is taking 26 secs to complete, but the X_Requested_With is taking way too long to complete (~1.5hrs). Is there anything that my regex is taking such a long time?
Query used: SELECT REGEXP_EXTRACT((get_json_object(line, '$.result._raw')),'(.*)X_Requested_With=(\\w(.+?)\\s)',2) as X_Requested_With FROM tablename; SELECT REGEXP_EXTRACT((get_json_object(line, '$.result._raw')),'(.*)X_INF_RequestID=(\\w(.+?)\\s)',2) as X_INF_RequestID FROM tablename;
Note: One observation from the data is the X_Requested_With is a combination of upper and lower case alphabets. How do I ignore case in Hive?
2. My blocksize is 128MB, which implies that for 1.5GB of data there should be 12 mappers running. But for every select query above it always triggers 21 mappers (which is equal to the number of partitions). Is the any reason behind this? Any help on this is greatly appreciated. Thanks in advance!
... View more
Labels:
- Labels:
-
Apache Hive
10-19-2016
04:18 PM
@Pardeep It's been a while of the post. I have been facing the same issue after I upgraded my hortonworks sandbox to HDP2.5. I tried the "mv VERSIONS VERSIONS.org" and restarted the hue but didn't work. I get the below when I restart My HUE shows up below when re-started Any help on this is greatly appreciated. Thanks!
... View more