Member since: 04-08-2019
Posts: 37
Kudos Received: 1
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
2638 | 10-11-2020 11:53 PM |
12-16-2020
04:52 AM
Hello Team,

We are using Impala 3.2.0. Some queries that were working 3-4 weeks ago have suddenly started throwing an "IllegalStateException: null" exception. Nothing has changed on our Impala cluster in the last 3-4 weeks, and the error logs do not suggest the cause of the error. Can anybody help us understand why this issue is arising?

The failing query looks something like this:

WITH banner_test AS (WITH layout_id AS
(
SELECT <some columns>
FROM table_v1
WHERE
( layout_component_id = <some_id>)
),
layouts AS
(SELECT <some columns>,
coalesce() AS col1,
<some columns>
FROM table_v2 lch
WHERE lch.some_id IN
(SELECT some_id
FROM table_v3
WHERE some_field_v1 IN
(SELECT some_field from layout_id))
AND lch.start_time >= '2019-01-01T00:00:00.000+00:00'
),
full_data AS
(SELECT lctd.some_field_v1,
l.*,
lctd.some_field_v2,
min(l.start_time)over(PARTITION BY some_field_v1) AS min_start_time,
max(l.end_time)over(PARTITION BY some_field_v1) AS max_end_time
FROM tables_v4 l
INNER JOIN tables_v5 lctd ON l.some_id = lctd.some_id),
tables_v5 AS
(
SELECT <some_fields>,
case when some_field like 'val1%' then 'val1' else 'val2' end as business
FROM tables_v6
WHERE date_str >= '2019-01-01'
)
SELECT <some_fields>,
case when x3.some_field=x2.some_field then x3.some_field end as some_field,
x3.some_id as x3.some_id
FROM
(SELECT DISTINCT some_filed,
more_fields
FROM tables_v7 sve
INNER JOIN tables_v8 lfd ON sve.field1 = lfd.field2
AND sve.type = sve.type
AND coalesce(sve.field,"somestring") = lfd.field
AND sve.created_at BETWEEN lfd.min_start_time AND coalesce(lfd.max_end_time,'2030-12-31T00:00:00.000+00:00')
INNER JOIN tables_v9 aau ON sve.some_id = aau.some_id
WHERE aau.some_id IS NOT NULL) x1
INNER JOIN tables_v10 ild on ild.hash_value=x1.hash_value
LEFT JOIN
(SELECT DISTINCT some_filed,
<some_fields>
FROM tables_v10 sve
INNER JOIN full_data lfd ON sve.some_id = lfd.some_id)
x2 ON x1.some_id_1 = x2.some_id_1
AND x1.created_at < x2.created_at
and ild.lcid = x2.id
LEFT JOIN
(SELECT <some_fileds>
FROM table_v11 where status='SUCCESS')
x3 ON x2._id = x3._id
AND x2.created_at < x3.checked_out_at )
SELECT
field_1,
field_2,
field_3,
TO_DATE(FROM_UTC_TIMESTAMP(started_date ,'Asia/Bangkok')) AS started_date_1,
TO_DATE(FROM_UTC_TIMESTAMP(ended_date ,'Asia/Bangkok')) AS ended_date_1,
(((DATEDIFF(FROM_UTC_TIMESTAMP(banner_ended ,'Asia/Bangkok'), '1970-01-04')%7 + 7)%7 - 1 + 7)%(7)) AS ended_day_of_week_index_1,
DAYNAME(FROM_UTC_TIMESTAMP(banner_ended ,'Asia/Bangkok')) AS _ended_day_of_week_1,
_present_on_screen_id AS _test_screen_id_1,
count(distinct _test.x2_session_id)/count(distinct _test.x1_session_id) AS test_ctr_sessions_1,
count(distinct _test.x2_anonymous_id)/count(distinct _test.x1_anonymous_id) AS test_ctr_aid_1
FROM banner_test
WHERE ((_test.banner_started >= TO_UTC_TIMESTAMP('1900-01-01 00:00:00.000000','Asia/Bangkok'))) AND (banner_test.region REGEXP '$') AND (banner_test.business REGEXP '$')
GROUP BY 1,2,3,4,5,6,7,8
ORDER BY _banner_started_date_1 DESC
LIMIT 500;

I have obfuscated some of the details here, but the information regarding joins, aggregations, GROUP BY, etc. is preserved.

Regards,
Parth
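For anyone decoding the `ended_day_of_week_index_1` expression in the query: '1970-01-04' was a Sunday, so the nested modulo arithmetic turns a day count into a Monday-based weekday index. Below is a minimal Python sketch of that arithmetic. It assumes `%` in the SQL keeps the sign of the dividend (which would explain the extra `+ 7` steps); the helper names are made up for illustration and are not part of Impala.

```python
import math
from datetime import date

# The '1970-01-04' literal from the query; this date was a Sunday.
ANCHOR_SUNDAY = date(1970, 1, 4)


def trunc_mod(a, b):
    """Remainder that keeps the sign of the dividend (assumed SQL % behaviour)."""
    return int(math.fmod(a, b))


def day_of_week_index(d):
    """Mirror (((DATEDIFF(d, '1970-01-04') % 7 + 7) % 7 - 1 + 7) % 7)."""
    days = (d - ANCHOR_SUNDAY).days                       # DATEDIFF(d, '1970-01-04')
    since_sunday = trunc_mod(trunc_mod(days, 7) + 7, 7)   # 0 = Sunday ... 6 = Saturday
    return trunc_mod(since_sunday - 1 + 7, 7)             # 0 = Monday ... 6 = Sunday


if __name__ == "__main__":
    print(day_of_week_index(date(1970, 1, 5)))  # a Monday, prints 0
```

Under that assumption the result matches Python's `date.weekday()` numbering (Monday = 0, Sunday = 6), including for dates before the anchor.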
Labels: Apache Impala
12-15-2020
09:07 AM
@Tim Armstrong If we implement admission control, we can reduce the memory exceptions, but we can still encounter situations where queries are not admitted. With admission control and resource pools, we can prioritise queries from a certain query pool so that they get resources first; please correct me if I am wrong.

Regarding scheduling: in Impala we are reading data from Kudu, and the Impala and Kudu services are located on different nodes. How does scheduling work in this case?

Parth
12-14-2020
12:26 AM
Hello Team,

We have a 5-node Impala cluster (1 co-ordinator and 4 executors) running Impala 3.2.0. Each Impala node has 32 GB of memory and 4 cores. We are facing an issue where sometimes 2-3 Impala executors out of 5 are over-utilised (80-90% memory usage) while the others are not. For example, if executors 1 and 3 have memory usage above 80% and a new query is issued, it fails saying it could not allocate space (512 MB) on executor 1, even though there is more than enough memory on the other executors (2, 4 and 5), whose memory utilisation is under 20%.

Following is the error I receive:

Memory limit exceeded: Failed to allocate row batch
EXCHANGE_NODE (id=148) could not allocate 16.00 KB without exceeding limit.
Error occurred on backend impala-node-executor-1:22000
Memory left in process limit: 19.72 GB
Query(2c4bc52a309929f9:2fa5f79d00000000): Reservation=998.62 MB ReservationLimit=21.60 GB OtherMemory=104.12 MB Total=1.08 GB Peak=1.08 GB
Unclaimed reservations: Reservation=183.81 MB OtherMemory=0 Total=183.81 MB Peak=398.94 MB
Fragment 2c4bc52a309929f9:2fa5f79d00000021: Reservation=0

How are query fragments distributed among the Impala executors? Is there a way to load-balance the query load among executors when we have dedicated executors and a co-ordinator? What are the good practices for proper utilisation of an Impala cluster?

Regards,
Parth
Labels: Apache Impala
10-15-2020
04:13 AM
@Tim Armstrong it worked like a charm after changing the gcc version. Thanks!
10-11-2020
11:53 PM
@Tim Armstrong I am using gcc version 5.4.0 and the OS is Ubuntu 16.04 (xenial). Will it work if I compile it with 4.9.2?
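Some context on why the compiler version matters here: the `GLIBCXX_3.4.21` tag in the dlerror output from the original question is a libstdc++ symbol version. Per the GCC ABI documentation, GCC 4.9 ships libstdc++ symbol versions up to GLIBCXX_3.4.20, while GLIBCXX_3.4.21 first appeared with GCC 5.1. The Python sketch below pulls the required version out of a dlerror line and checks whether a given compiler series could have produced that requirement. The function names and the two-entry table are illustrative assumptions, not an Impala or GCC tool.

```python
import re

# Highest GLIBCXX symbol version provided by each libstdc++ series
# (from the GCC ABI policy documentation; only the two series relevant here).
MAX_GLIBCXX_BY_GCC = {
    (4, 9): (3, 4, 20),   # GCC 4.9.x
    (5, 1): (3, 4, 21),   # GCC 5.1 and later 5.x releases
}


def required_glibcxx(dlerror_text):
    """Return the GLIBCXX version named in a dlerror line as a tuple, or None."""
    match = re.search(r"GLIBCXX_(\d+(?:\.\d+)*)", dlerror_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def build_may_require(gcc_series, glibcxx_version):
    """True if a binary built with this GCC series could reference that symbol version."""
    return MAX_GLIBCXX_BY_GCC[gcc_series] >= glibcxx_version


if __name__ == "__main__":
    line = ("libstdc++.so.6: version `GLIBCXX_3.4.21' not found "
            "(required by /var/lib/impala/udfs/libudfsample.9909.1.so)")
    needed = required_glibcxx(line)
    for series in sorted(MAX_GLIBCXX_BY_GCC):
        print(series, build_may_require(series, needed))
```

By this reasoning, a UDF built with a 4.9.x toolchain cannot ask for GLIBCXX_3.4.21, whereas one built with 5.4.0 can, which would lead to the load failure if the libstdc++ bundled with Impala does not provide that version.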
10-08-2020
07:47 PM
Hello Team,

We are planning to use Impala UDFs and UDAs. To try things out, we started by exploring the examples given in the Cloudera GitHub repo. After building the .so file using make, I try to create the function with the following statement:

create function has_vowels (string) returns boolean location '/user/hive/udfs/libudfsample.so' symbol='HasVowels';

We are getting the following error:

ERROR: AnalysisException: Could not load binary: /<hdfs_path>/udfs/libudfsample.so
Unable to load /var/lib/impala/udfs/libudfsample.9909.1.so
dlerror: /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/impala/lib/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /var/lib/impala/udfs/libudfsample.9909.1.so)

Following are the error logs:

org.apache.impala.common.AnalysisException: Could not load binary: /user/parth.khatwani/udfs/libudfsample.so
Unable to load /var/lib/impala/udfs/libudfsample.9909.1.so
dlerror: /opt/cloudera/parcels/CDH-6.3.2-1.cdh6.3.2.p0.1605554/lib/impala/lib/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by /var/lib/impala/udfs/libudfsample.9909.1.so)
at org.apache.impala.catalog.Function.lookupSymbol(Function.java:442)
at org.apache.impala.analysis.CreateUdfStmt.analyze(CreateUdfStmt.java:92)
at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:451)
at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:421)
at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1285)
at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1252)
at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1222)
at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:167)
I1009 02:42:52.638576 15136 status.cc:124] 3c48433373076c77:30eff6ff00000000] AnalysisException: Could not load binary: /user/parth.khatwani/udfs/libudfsample.so
Unable to load /var/lib/impala/udfs/libudfsample.9909.1.so

I am unable to figure out what's wrong here.

Regards,
Parth
Labels: Apache Impala
05-13-2020
04:53 AM
@Tim Armstrong thanks for the detailed insights. This will be very helpful.
04-30-2020
06:24 AM
1 Kudo
Hello Team,

We are using Impala to query data stored as Parquet on S3. This has been an awesome feature. Recently, Amazon S3 announced a new feature called S3 Select, which helps speed up column projection when querying data stored on S3. As of now, Hive and Presto support S3 Select pushdown. Is Impala going to support S3 Select pushdown?

Parth
Labels: Apache Impala
04-14-2020
05:52 AM
@Tim Armstrong thanks for pointing this out. We have observed that in our case the memory usage on the co-ordinator is not that high, so having a co-ordinator of the same size as the executors would lead to under-utilisation of resources on the co-ordinator. Or we could have multiple (8) smaller executors of, let's say, 32 GB each instead of two with 128 GB. Please share your thoughts on this.