Member since
07-25-2018
17
Posts
4
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
4222 | 06-19-2019 04:00 PM |
06-20-2019
11:15 AM
I agree that is super confusing
06-19-2019
04:42 PM
1 Kudo
Hi Mogiking, On the version I was testing (CDH 6.2) the duration looks correct, but the queries do not show up until they are completely closed. I was able to force this from the Impala Daemon Web UI, or by waiting, or by closing Hue. -Andrew
06-19-2019
04:04 PM
Hi vaccarinicarlo, In the Hadoop world, where different components may have different rules about case sensitivity, it may be best to do as Alex Behm said above: "It's just easier to accept one canonical casing". I agree with you that it might be better to issue more warnings when anything other than lower case is used.
06-19-2019
04:00 PM
1 Kudo
If you want to use Statement, then I think you can simply use: Statement statement = connection.createStatement(); statement.execute("drop table foo"); I know that in other DBMSs there is a lot of optimization around JDBC PreparedStatements and batches. There has not been a lot of emphasis on this in Impala. As I mention in my other reply, there is a tendency to use other means of ingestion for optimal performance.
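For reference, the Statement usage above could be wrapped roughly as follows. This is a minimal sketch, not a recommended pattern: the class name `DropTableExample`, the helper names, and the table name `foo` are placeholders, and it assumes you already have an open `java.sql.Connection` to Impala from whichever JDBC driver you use.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class DropTableExample {
    // Builds the DDL text. The table name is a hypothetical placeholder
    // and is not validated or escaped here.
    static String dropTableSql(String tableName) {
        return "DROP TABLE " + tableName;
    }

    // Runs one DDL statement through a plain java.sql.Statement.
    static void dropTable(Connection connection, String tableName) throws SQLException {
        // try-with-resources closes the Statement even if execute() throws.
        try (Statement statement = connection.createStatement()) {
            statement.execute(dropTableSql(tableName));
        }
    }
}
```

Calling `DropTableExample.dropTable(connection, "foo")` sends the same statement as the one-liner above; the only addition is that the Statement is closed afterwards.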
06-19-2019
03:49 PM
Where is the data that you want to query with Impala? Many people write their data directly into HDFS. It might be written as a text or CSV file. You can access the files via an external table. That might be enough for a simple case, but if optimal query performance is required, you could create a Parquet table by selecting from the external table. If the data is in another database, then you could use Sqoop to import the data. If the newer data is going to be continually updated in HDFS, then you could consider this pattern: https://blog.cloudera.com/blog/2019/03/transparent-hierarchical-storage-management-with-apache-kudu-and-impala/
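The external table and Parquet steps described above might look something like this when issued over JDBC. It is only a sketch under assumptions: the table names `events_csv` and `events_parquet`, the two columns, and the HDFS directory `/data/events` are all made up for illustration, and the CSV is assumed to be comma-delimited with no header row.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class CsvToParquetExample {
    // External table over CSV files that already sit in an HDFS directory.
    // The column list (id, name) is hypothetical; use your own schema.
    static String createExternalTableSql(String table, String hdfsDir) {
        return "CREATE EXTERNAL TABLE " + table + " (id BIGINT, name STRING) "
             + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
             + "STORED AS TEXTFILE LOCATION '" + hdfsDir + "'";
    }

    // CREATE TABLE ... AS SELECT copies the rows into a new Parquet table.
    static String createParquetCopySql(String parquetTable, String sourceTable) {
        return "CREATE TABLE " + parquetTable
             + " STORED AS PARQUET AS SELECT * FROM " + sourceTable;
    }

    // Runs both statements on an already open connection.
    static void run(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(createExternalTableSql("events_csv", "/data/events"));
            statement.execute(createParquetCopySql("events_parquet", "events_csv"));
        }
    }
}
```

Queries that only need the simple case can read `events_csv` directly; the second statement is the optional step for better query performance.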
06-17-2019
05:33 PM
On that Cloudera Manager page, you can click 'Select Attributes' and check the box by 'Duration'. Now any queries that are displayed will show the query duration in milliseconds.
06-17-2019
05:17 PM
Hi Punshi, When you insert into Parquet like this through Impala, you will create 10,000 small files in HDFS, each with 100 rows. That won't be an efficient way to write data, and it will be very inefficient to read. You should probably review https://www.cloudera.com/documentation/enterprise/latest/topics/impala_parquet.html#parquet and https://www.cloudera.com/documentation/enterprise/latest/topics/impala_perf_cookbook.html to see other ways to ingest your data. -Andrew
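One alternative, sketched here with assumed names, is to land the rows in a staging table first and then move them into the Parquet table with a single INSERT ... SELECT, which typically writes far fewer and larger files than thousands of small inserts. The table names `sales_parquet` and `sales_staging` are hypothetical, and `totalRows` just spells out the 10,000 x 100 arithmetic from the post.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class BulkInsertExample {
    // One statement that copies everything from the staging table.
    static String bulkInsertSql(String targetTable, String stagingTable) {
        return "INSERT INTO " + targetTable + " SELECT * FROM " + stagingTable;
    }

    // Rows written in total by many small inserts (e.g. 10,000 inserts of 100 rows).
    static long totalRows(int insertCount, int rowsPerInsert) {
        return (long) insertCount * rowsPerInsert;
    }

    // Executes the single bulk insert on an already open connection.
    static void load(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(bulkInsertSql("sales_parquet", "sales_staging"));
        }
    }
}
```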
02-08-2019
05:30 PM
Hi Ragn, I think these lines:
- Metadata load started: 1.387ms (1.387ms)
- Metadata load finished. loaded-tables=1/1 load-requests=43 catalog-updates=844: 15m43s (15m43s)
suggest you are right that this is a metadata problem. When you invalidate the metadata, I assume you use INVALIDATE METADATA [[db_name.]table_name]. Do you specify the table? If you don't, then Impala will load all the metadata. If you specify a table name, only the metadata for that one table is flushed and synced with the HMS, which would be quicker, I think. -Andrew
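To make the difference concrete, here is a small sketch of building the statement with and without a table name. The helper class and the names `sales_db` and `orders` are assumptions for illustration only; the statement syntax is the INVALIDATE METADATA [[db_name.]table_name] form quoted above.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class InvalidateMetadataExample {
    // Returns the table-scoped form when a table name is given,
    // otherwise the global form that affects every table.
    static String invalidateSql(String dbName, String tableName) {
        if (tableName == null || tableName.isEmpty()) {
            return "INVALIDATE METADATA";
        }
        boolean hasDb = dbName != null && !dbName.isEmpty();
        String qualified = hasDb ? dbName + "." + tableName : tableName;
        return "INVALIDATE METADATA " + qualified;
    }

    // Invalidates a single table on an already open connection.
    static void invalidateTable(Connection connection, String dbName, String tableName)
            throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(invalidateSql(dbName, tableName));
        }
    }
}
```

For example, `invalidateSql("sales_db", "orders")` gives the table-scoped statement, while `invalidateSql(null, null)` gives the global one that is likely to be much slower on a large catalog.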
02-08-2019
05:09 PM
Hi Yasmin, There's not much information there, but it does sound like your HS2 can't connect to the HMS. If you didn't fix the space issues for HMS, then maybe HMS isn't running well. You may find useful information in the HMS logs. -Andrew
02-07-2019
04:59 PM
1 Kudo
It looks like the Postgres system used by HMS is out of disk space. -Andrew