Member since: 06-02-2017
Posts: 8
Kudos Received: 2
Solutions: 2
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5099 | 04-03-2019 01:22 PM |
 | 4414 | 03-14-2018 02:18 PM |
04-05-2019
09:59 AM
Another thing worth mentioning about Impala bulk inserts into Kudu: you may benefit from the /* +noshuffle,noclustered */ insert hints, or from setting the MEM_LIMIT query option, as outlined here: https://www.cloudera.com/documentation/enterprise/6/6.2/topics/impala_hints.html
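To make the hint placement concrete, here is a minimal sketch that only builds the statement text; it does not talk to Impala. The function names and the table names in the usage note are hypothetical, and placing the hint just before SELECT is one of the positions the linked hints page describes. Identifiers are not escaped, so treat it as an illustration only.

```python
from typing import Sequence


def bulk_insert_sql(
    target: str,
    source: str,
    hints: Sequence[str] = ("noshuffle", "noclustered"),
) -> str:
    """Build an INSERT ... SELECT statement with optional insert hints.

    Hypothetical helper: table names are pasted in as-is (no escaping).
    """
    hint_clause = f"/* +{','.join(hints)} */ " if hints else ""
    return f"INSERT INTO {target} {hint_clause}SELECT * FROM {source}"


def mem_limit_sql(limit: str) -> str:
    """Build the SET statement for the MEM_LIMIT query option."""
    return f"SET MEM_LIMIT={limit}"
```

For example, `bulk_insert_sql("kudu_tbl", "hdfs_tbl")` yields `INSERT INTO kudu_tbl /* +noshuffle,noclustered */ SELECT * FROM hdfs_tbl`, which you could then run through whatever Impala client you already use.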
04-03-2019
01:22 PM
1 Kudo
I suspect your timestamps are off by one hour because of how Impala stores timestamps in Kudu: values are treated as UTC and converted between Impala's representation (a 96-bit int with nanosecond precision) and unix time as Kudu stores it (a 64-bit int with microsecond precision). You should be able to solve the issue by treating all timestamps in your application as UTC. The discussion in this thread may be useful: https://lists.apache.org/thread.html/bb4ef37c88e76959399f40c7053a76b644217e76664982a60c703c7e@%3Cuser.impala.apache.org%3E

For performance, I'm interested in more details. If you're doing something like 'insert into <kudu_table> values (...)' to insert a few rows at a time through Impala, then you'll definitely get better performance by going through the Kudu API, since going through Impala you pay extra cost for query parsing, planning, etc. Impala is better suited to statements like 'insert into <kudu_table> select * from <some_hdfs_table>'.

And as Hao pointed out, there is overhead going through Impala because of the conversion from 96-bit UTC to 64-bit unix time, so you may want to make the Impala type a bigint and only convert to/from timestamps when necessary.
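As a rough illustration of the UTC point and of the bigint suggestion, the sketch below converts between timezone-aware datetimes and 64-bit microsecond unix time on the application side. The helper names and the UTC+1 client zone are assumptions made up for the example; this is not Impala or Kudu code.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_to_unix_micros(dt: datetime) -> int:
    """Convert a timezone-aware datetime to microseconds since the unix epoch."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return (dt - EPOCH) // timedelta(microseconds=1)


def unix_micros_to_utc(micros: int) -> datetime:
    """Interpret microseconds since the unix epoch as a UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)


# Hypothetical client running in a UTC+1 zone.
local_zone = timezone(timedelta(hours=1))
event = datetime(2019, 4, 3, 13, 22, tzinfo=local_zone)

# Correct handling: the stored value round-trips to 12:22 UTC.
stored = utc_to_unix_micros(event)
as_utc = unix_micros_to_utc(stored)

# Common mistake: treating the local wall-clock time as if it were UTC.
misread = event.replace(tzinfo=timezone.utc)
skew = misread - as_utc  # the one-hour offset described above
```

Storing the integer from `utc_to_unix_micros` in a bigint column and converting only at the edges of the application keeps every value unambiguous.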
03-14-2018
02:18 PM
1 Kudo
You can see the number of queued/admitted queries and the memory used per pool by inspecting the logs, e.g. impalad.INFO. Look for lines that come from admission-controller.cc. By default, each time a query is admitted we log these stats for the query's pool. If the default log level doesn't provide enough info, additional info gets logged at higher levels, though keep in mind that more logging may slow queries down.
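One way to pull those lines out is a plain substring filter on the source file name, sketched below. The function names are hypothetical, the log path is whatever your deployment uses, and nothing here parses the stats themselves since their exact format is not shown in this thread.

```python
from typing import Iterable, List

SOURCE_FILE = "admission-controller.cc"


def admission_lines(lines: Iterable[str], source_file: str = SOURCE_FILE) -> List[str]:
    """Keep only the log lines that mention the given source file."""
    return [line.rstrip("\n") for line in lines if source_file in line]


def read_admission_lines(log_path: str) -> List[str]:
    """Read a log file (for example impalad.INFO) and filter it."""
    with open(log_path, encoding="utf-8", errors="replace") as log_file:
        return admission_lines(log_file)
```

Calling `read_admission_lines` with the path to your impalad.INFO file returns the matching lines in order, which you can then scan for the per-pool stats.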
03-13-2018
03:41 PM
It's difficult to say what might be going on without more information, but here are a few pieces of info that may be helpful to you:
- "Unspecified GSS error" generally indicates an issue with Kerberos authentication.
- The bitness of the ODBC driver must match the bitness of the client application in order for everything to work correctly. From your description, it may be that the user is using a 32-bit Kerberos client, in which case it is expected that the 64-bit ODBC driver would not work with it.

Given that the 32-bit connection is fine, what's the motivation for trying to get the 64-bit driver to work?
03-13-2018
03:03 PM
Impala has a web UI, by default running on port 25000 of each impalad node, that you may find useful. In particular, the /queries page displays a list of running or recent queries along with the resource pool each was assigned to.
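If you want to check that page from a script rather than a browser, a minimal sketch follows. It assumes plain HTTP with no authentication on the web UI, which may not match your cluster; the function names and the host name in the usage note are hypothetical.

```python
from urllib.parse import urlunsplit
from urllib.request import urlopen


def queries_page_url(host: str, port: int = 25000) -> str:
    """Build the URL of the /queries page on one impalad node."""
    return urlunsplit(("http", f"{host}:{port}", "/queries", "", ""))


def fetch_queries_page(host: str, port: int = 25000, timeout: float = 10.0) -> str:
    """Download the /queries page as text (assumes no TLS or login)."""
    with urlopen(queries_page_url(host, port), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

For a node named `impalad-1.example.com`, `queries_page_url` gives `http://impalad-1.example.com:25000/queries`; since each impalad serves its own web UI, you would repeat the call per node.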
06-02-2017
01:28 PM
It looks like you may be experiencing the same issue as: https://community.cloudera.com/t5/Batch-SQL-Apache-Hive/Hive-JDBC-client-error-when-connecting-to-Kerberos-Cloudera/td-p/30829 Can you try the solution shown there?