Member since
02-22-2017
32
Posts
7
Kudos Received
4
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 861 | 04-03-2019 02:56 PM |
| | 2439 | 04-03-2019 01:11 PM |
| | 5100 | 12-27-2018 12:54 PM |
| | 5459 | 12-11-2017 04:54 PM |
11-19-2019
05:35 PM
Did you happen to copy/move the existing contents under the old directory to the new one?
10-16-2019
04:43 PM
1 Kudo
Because you specified the 'kudu.master_addresses' property in the CREATE TABLE statement, Impala requires ALL privileges on SERVER. The error message indicates the same thing: ERROR: AuthorizationException: User 'yydp/bigdata5@TEST.COM' does not have privileges to access: server1. So you may want to either remove that clause (and set the Kudu master address via the -kudu_master_hosts flag), or grant ALL on SERVER to that user.
04-04-2019
10:49 AM
You don't need to take down the other masters, but you do need to establish a maintenance window (no DDLs) to ensure the table metadata is consistent between the followers and the leader. If that is hard to do in production, an alternative is to make sure the master being taken down is not the leader (to avoid losing data).
04-03-2019
02:56 PM
1 Kudo
The following steps should help:
1. Take a physical backup of the master you are going to take down (in case something goes wrong with the migration).
2. Stop issuing any DDL to the masters to prevent data loss.
3. Take down the master you chose in step 1.
4. Follow the doc Recovering from a Dead Kudu Master in a Multi-Master Deployment.
04-03-2019
01:11 PM
1 Kudo
When you said you were using 'new Date()', is it java.util.Date? If so, can you try 'SimpleDateFormat', where you can set the time zone with 'setTimeZone'? An example can be found here. I also noticed the Impala doc states: 'The conversion between the Impala 96-bit representation and the Kudu 64-bit representation introduces some performance overhead when reading or writing TIMESTAMP columns. You can minimize the overhead during writes by performing inserts through the Kudu API. Because the overhead during reads applies to each query, you might continue to use a BIGINT column to represent date/time values in performance-critical applications.' So I guess that is why you are seeing the difference in insert performance. If you care about query performance as well, the other option is to use a 'BIGINT' column.
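To make the SimpleDateFormat suggestion concrete, here is a minimal sketch. The class name TimestampSketch, the helper names, and the date pattern are hypothetical, and storing microseconds since the epoch in the BIGINT column is only an assumption; use whatever unit your schema expects.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, for illustration only.
class TimestampSketch {

    // Format a java.util.Date in an explicit time zone instead of the JVM default.
    static String formatInZone(Date date, String zoneId) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        return fmt.format(date);
    }

    // Convert a Date to microseconds since the Unix epoch.
    // Assumption: the BIGINT column stores epoch microseconds.
    static long toEpochMicros(Date date) {
        return date.getTime() * 1000L; // getTime() returns milliseconds
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println(formatInZone(now, "UTC"));
        System.out.println(toEpochMicros(now));
    }
}
```

Note that a SimpleDateFormat instance is not thread-safe, so this sketch creates a new one per call rather than sharing a static instance.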
04-03-2019
12:06 PM
Same question: is this issue only happening for Kudu tables? I saw the same error solved by the workaround described here (use setObject instead of setString), though it seems you are not using setString.
04-02-2019
11:58 AM
Hi Rahul, The error you encountered, "Service unavailable: Error reading clock. Clock considered unsynchronized", means that either NTP is not installed or the clock is reported as unsynchronized. See NTP Clock Synchronization for troubleshooting. Best, Hao
12-28-2018
10:03 AM
Yes, but note that raising --flush_threshold_secs too high can affect tablet server restart time. Currently, CDH 6.2 is scheduled for March.
12-27-2018
12:54 PM
Hi huaj, It looks like you are hitting KUDU-1400: before the fix, Kudu compacts rowsets based on overlap and not on other criteria such as on-disk size. Unfortunately, there is no way to fix the small rowsets that have already been flushed. However, you can rebuild the affected tables by creating new tables from the existing ones and see if that helps. Before doing that, please check this doc to see which usage pattern caused you to hit this issue, and try to prevent it by following the recommendations. FYI, the fix for KUDU-1400 should land in the next release, CDH 6.2.
12-27-2018
11:37 AM
Hi Rams, If you have to use 'LATERAL VIEW EXPLODE', then maybe someone who is more familiar with Impala can chime in on whether there is anything equivalent or similar. Otherwise, a query such as 'INSERT INTO my_kudu_table SELECT * FROM legacy_data_import_table' should work (see https://kudu.apache.org/docs/kudu_impala_integration.html#kudu_impala_insert_bulk).
12-27-2018
11:06 AM
Hi Rams, Kudu doesn't support Hive integration yet so I would use Impala shell instead.
12-27-2018
10:55 AM
Hi Rams, I don't think you can directly import data from another cluster yet with Kudu Impala integration. One way is to export the data in Parquet files and then import the data into the other cluster.
12-26-2018
03:24 PM
Hi Rams, Judging by the error messages, it seems the KuduTableOutputFormat class is not on the classpath. Can you provide more information on how you are inserting the data? Are you using the Kudu-Mapreduce util tool? If so, you can follow ImportCsv.java as an example of loading data from another data store.
12-26-2018
02:43 PM
Hi Rams, As of the latest version, Kudu itself doesn't have a concept of database. Are you using Kudu with Impala? If so, you can create the second table as external in Impala and use the same Kudu table under the hood. Best, Hao
11-16-2018
12:01 PM
1 Kudo
So it looks like you have already fixed the table? For a longer-term solution, I suggest upgrading to a CDH version that has the fix for KUDU-2463 (5.15.2, 5.16.2, or 6.1.0) once it is available.
11-08-2018
02:11 PM
Hi DanAdan, It is possible to store more than the recommended amount of data on a tablet server. We have these recommendations because that is what has been well tested. The same goes for storing more than 2000 tablets on a tablet server. Some performance degradation you could see:
1) The server restart time gets longer as the on-disk data grows.
2) As tablets accrue more data blocks, their superblocks become larger, raising the minimum amount of I/O for any operation that rewrites a superblock (such as a flush or compaction).
3) The tablet copy protocol used in re-replication tries to copy the entire superblock in one RPC message; if the superblock is too large, it will run up against the default 50 MB RPC transfer size (see src/kudu/rpc/transfer.cc).
11-07-2018
09:38 AM
1 Kudo
Hi Andreyeff, It's possible this is an instance of KUDU-2463. Does the following description match your case? Are you actively writing to the table on a regular basis? Based on the schema, are the writes going to every tablet? If they are, that is evidence against the issue being KUDU-2463. When was the last time you restarted a tablet server? If the incorrect results were only noticed after a restart, that is evidence for the issue being KUDU-2463. Best, Hao
11-07-2018
09:10 AM
Hi Andreyeff, Does 'query1' always return consistent results? The same for 'query2'? If you run 'query1' and 'query2' now, do they return consistent results? Best, Hao
11-06-2018
05:05 PM
Hi Andreyeff, You mentioned there were 'no data modifications in between' the queries, but did you verify whether any ingestion (that hadn't finished) was still going on while you issued these queries? Since Impala uses the READ_LATEST scan mode, it is possible that the scan took place on a stale replica. Also, did you run the `ksck` tool to check whether the cluster is in a healthy state? Best, Hao
10-09-2018
12:02 PM
Are you using Kudu to store the data? And how are you connecting to that database?
06-25-2018
03:07 PM
Hi, The warning you saw from the watchdog looks like a kernel bug on EL6 machines using EXT4, which requires either upgrading to RHEL7 or using XFS instead of EXT4. Would you mind sharing the master log from when you see the slowness? (From the symptom you described, one possibility could be KUDU-2264.) Thanks! Best, Hao
05-07-2018
04:42 PM
Hi Andreyeff, The 'FATAL_INVALID_AUTHENTICATION_TOKEN: Not authorized: authentication token expired' error indicates that this is an authn token expiration issue. However, in CDH5.14.2, the Kudu client should be able to automatically re-acquire the authn token when needed. Do you know which Kudu client version you are using (CDH5.14.2)? When you launch the Impala query, does the coordinator have primary credentials (i.e. Kerberos)? Best, Hao
02-16-2018
01:55 PM
Hi Rajesh, Would you please be more specific about the errors you saw in the log? Thanks! Best, Hao
02-12-2018
05:45 PM
Hi Patrick, The error you encountered is documented in troubleshooting: https://kudu.apache.org/docs/troubleshooting.html#disk_issues. It is caused either by non-empty data directories on first startup, or by data directories that were deleted or corrupted before a previously running, healthy Kudu process was restarted. You can refer to the doc for how to resolve it if it is the latter. Also, I do not see any reason why this could be caused by installing Kudu manually into Cloudera Manager. Best, Hao
12-11-2017
04:54 PM
1 Kudo
Hi Petter, Right, based on my understanding of how the Impala Kudu integration works, if you remove the TBLPROPERTIES clause (and set the Kudu master address on the tservers), it won't require ALL privileges on SERVER for users to create an internal table. Let me know if it does not work. Best, Hao
12-08-2017
11:43 AM
Hi Petter, Would you mind sharing the query you use to create a new table? Did you happen to set the Kudu master addresses in the TBLPROPERTIES clause? > Will finer grained access arrive in the future? Yes, we have a jira to track finer-grained authorization integration in Kudu, https://issues.apache.org/jira/browse/KUDU-428. It is on Kudu's roadmap. Once this is done, I believe the limitation you are hitting via Impala will be relaxed. Best, Hao
12-07-2017
11:52 AM
Hi Pettax, Sorry, I missed in my previous reply that you are using external Kudu tables. If you are using 5.12, there is this limitation: "Only users with ALL privileges on SERVER may create external Kudu tables." You can check here for more reference: https://www.cloudera.com/documentation/kudu/latest/topics/kudu_impala.html#concept_mpl_zxk_mz.
12-07-2017
11:27 AM
Hi Pettax, To have a more fine-grained Sentry privilege setup, you can grant the "ALL" privilege (or "Select" or "Insert", whichever suits your use case) to the group the user ('my_user') belongs to. Also, I think this doc may help you get more context on how to use Sentry: https://www.cloudera.com/documentation/enterprise/latest/topics/cm_sg_sentry_service.html#id_r3w_kww_h1b. Best, Hao
10-11-2017
03:42 PM
AFAIK, there is no easy way to figure out which rows fail to insert given the current implementation of the Impala Kudu integration. A related jira for this is 'https://issues.apache.org/jira/browse/IMPALA-4416'. For now, you can check the 'NumRowErrors' metric in Impala's query profile, but that only shows the count of failed rows.