Member since
02-20-2015
21
Posts
9
Kudos Received
2
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 658 | 08-16-2017 05:34 PM |
 | 1165 | 10-10-2016 09:55 AM |
07-13-2018
11:18 AM
In that case, you could try "refresh <partition>" and watch the peak JVM memory usage on the Catalogds and the Impalads. If it gets close to hitting OOM, increase the -Xmx [1]. Also, in our experience, using incremental stats under a high refresh load can quickly trigger OOM issues, so it is better not to rely on them.
[1] https://community.cloudera.com/t5/Interactive-Short-cycle-SQL/Catalogd-OutOfMemoryError-Java-heap-space/td-p/41177
07-13-2018
10:57 AM
- In general, the suggestion is to run as few refreshes as possible. So I'd suggest running a 'refresh <table>' every 20 mins rather than a 'refresh <partition>' every 2 mins, if that meets your application SLAs.
- Each refresh triggers a spike in working memory on the Catalog (and the Impalads) due to thrift serialization (and deserialization on the Impalads). The cost is constant irrespective of whether you run a refresh on the table or the partition, since we serialize (and deserialize) the whole table object. This can be costly if the partitioned table is huge or has lots of files and blocks. (https://issues.apache.org/jira/browse/IMPALA-3127)
- FWIW, these load operations are much faster (~2x on secure and ~5x on insecure clusters) starting with CDH 5.14 due to performance enhancements.
08-29-2017
02:01 PM
1 Kudo
@BellRizz Adding to Tim's comment, we have seen this warning pop up when the Catalog server has issues connecting to datanodes to get block locations (e.g. the DNs are busy or not responding for some reason). It usually goes away when the load on the DNs is low. It might be worth checking this scenario as well (usually something is logged in the Catalog server logs around the time this error occurs). Later versions of Impala (shipped with CDH 5.12 and later) have a new way of fetching these block locations and a lower likelihood of hitting this warning.
08-16-2017
05:34 PM
This is a known bug [1] fixed in the upcoming 2.10.0 release. [1] https://issues.apache.org/jira/browse/IMPALA-5657
03-03-2017
09:08 AM
2 Kudos
Yes, the persistent UDFs feature is included in 5.8.0. Also, Java UDFs require a special syntax to make them persistent. You can find examples here: https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_create_function.html#create_function The regular way of creating Impala Java UDFs wipes them from the Catalog after a restart. This is done to be consistent with Hive.
10-10-2016
09:55 AM
Hey, this looks like a bug and can be reproduced even on the latest versions of Impala. Thanks for sharing the repro steps with us. I created a jira, https://issues.cloudera.org/browse/IMPALA-4266, with a simpler UDF so it's easy to follow. Your UDF implementation looks fine and is likely not causing this issue. - Bharath
09-30-2016
09:09 AM
Do you have some comments in your script right before the insert stmt? If yes, you could be running into a known bug, https://issues.cloudera.org/browse/IMPALA-2195, where we incorrectly print "Fetched" rows instead of "Inserted" rows because the parser doesn't detect that it's an insert stmt.
07-04-2016
04:47 AM
Oh, I didn't know that. It may be a bug, like Skye mentioned in the other thread. Thanks for looking it up.
07-04-2016
03:16 AM
I don't think (1) is possible but you can set "--insert_inherit_permissions=true" for (2). Can you give that a try?
06-13-2016
12:59 AM
Thanks for checking. The KVNOs and principals look fine to me.
- Can you confirm whether the OS and Kerberos client libs are the same on all these nodes? (lsb_release -a, rpm -qa | grep krb...)
- Are you able to run any other services on 02 and 04, like a datanode? Is the issue only with Impala?
06-08-2016
10:19 PM
Can you please check that the KVNO of the principals on the failing hosts matches the KVNO of the principal in the KDC? Also, do you see any difference in the output of "klist -kt <path_to_impala.keytab>" between working and non-working hosts, especially in the KVNO column? In the output pasted above, I only see it for the non-working hosts, where it's 1. Is it the same for the working hosts too?
05-31-2016
02:07 AM
Sorry for the inconvenience here. I think the error messages should've been better to aid debugging. Can you paste the contents of the catalog-server startup log up to the point where it fails to obtain the TGT? Also, did you try manually kinit'ing with the keytab and catalog principal to make sure it works? What's the output of "klist -kt /etc/impala/conf/impala.keytab"?
04-20-2016
10:14 AM
Thanks for clarifying. How are you creating the table? Can you paste the sql here?
04-20-2016
09:52 AM
Does the URI contain uppercase letters? If yes, it could be IMPALA-2695, fixed in CDH 5.5.2.
04-05-2016
10:45 PM
2 Kudos
Yep, try using TCompactProtocol to deserialize (the default is TBinaryProtocol). Initialize the deserializer like this: new TDeserializer(new TCompactProtocol.Factory())
04-05-2016
10:42 AM
1 Kudo
Sorry I missed this. Impala prepends the query hash to the base64 string while logging to the file. In your case that is 4c4d52afea4a40a1:802034563d1cfc93. Just remove it from dataCompressed string and it should work.
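A rough sketch of that step, in case it helps. It assumes the logged line is the query ID, then whitespace, then a standard base64 payload; the separator is an assumption, so check it against your own log. The class and method names are made up for illustration, and only java.util.Base64 is a real JDK API here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class ProfileLineDecoder {
    // Hypothetical helper: drops the leading query ID from a logged profile
    // line and base64-decodes the remainder. Assumes the ID is followed by
    // whitespace and the payload uses the standard base64 alphabet.
    static byte[] decodeProfile(String dataCompressed, String queryId) {
        String payload = dataCompressed.trim();
        if (payload.startsWith(queryId)) {
            payload = payload.substring(queryId.length()).trim();
        }
        return Base64.getDecoder().decode(payload);
    }

    public static void main(String[] args) {
        String queryId = "4c4d52afea4a40a1:802034563d1cfc93";
        // Stand-in payload for illustration; a real line carries the compressed profile.
        String encoded = Base64.getEncoder()
                .encodeToString("stand-in bytes".getBytes(StandardCharsets.UTF_8));
        byte[] decoded = decodeProfile(queryId + " " + encoded, queryId);
        System.out.println(new String(decoded, StandardCharsets.UTF_8));
    }
}
```

The decoded bytes are still compressed, so they would need to be inflated before handing them to a Thrift deserializer.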
03-31-2016
03:42 AM
What version of Sentry are you running? Do you have some stray jars somewhere in the classpath that the Catalog could be picking up?
03-31-2016
02:46 AM
Why TZlibTransport? You can try java.util.zip.InflaterInputStream instead, something like:
InflaterInputStream in =
    new InflaterInputStream(new ByteArrayInputStream(compressedProfile));
ByteArrayOutputStream out = new ByteArrayOutputStream();
IOUtils.copy(in, out);  // IOUtils from Apache Commons IO
byte[] decompressed = out.toByteArray();
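If you'd rather not depend on Commons IO for IOUtils, the same idea can be sketched with JDK classes only. This is just an illustration under assumptions: the class and method names are hypothetical, and the deflate helper exists only so the round trip can be run without a real profile.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

class ZlibProfileInflater {
    // Inflates a zlib-compressed byte array using a plain read/write loop.
    static byte[] inflate(byte[] compressed) {
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            // Corrupt or non-zlib input surfaces here.
            throw new UncheckedIOException(e);
        }
    }

    // Demo-only helper: zlib-compresses bytes so inflate() has something to read.
    static byte[] deflate(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "stand-in profile bytes".getBytes(StandardCharsets.UTF_8);
        byte[] restored = inflate(deflate(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

With a real profile you would pass the base64-decoded bytes to inflate() and then feed the result to the Thrift deserializer.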
03-30-2016
07:54 AM
1 Kudo
Hey Venky, it's a zlib-compressed TRuntimeProfileTree object. You might want to use the thrift bindings to decode it. HTH. - Bharath
02-07-2016
04:05 AM
This issue is being tracked at IMPALA-1822. We don't have a fix for it right now. Maybe you can work around this using a cron job?
02-07-2016
03:55 AM
2 Kudos
You are most likely hitting IMPALA-2154. The first gz file might have had multiple streams, and when you repacked it, it became a single-stream file. We are currently working on a fix for this issue. Please follow the jira for updates.