Member since: 01-07-2016
Posts: 26
Kudos Received: 7
Solutions: 0
02-22-2019
12:27 AM
Hi, I have not followed the development of Impala lately. If this is still a limitation, you might try the following approach: design the schema with an additional column that indicates which struct column holds the data for a particular row, and then use this additional column in the WHERE clause. Something like:

name      complex1  complex2  complex3
complex1  content   NULL      NULL
complex3  NULL      NULL      content

and then:

SELECT complex1.*
FROM myTable
WHERE name = 'complex1'

Br, Petter
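A minimal sketch of what such a table could look like (the struct definitions are placeholders; only the table and column names come from the example above):

CREATE TABLE myTable (
  -- discriminator column: names the struct that carries data for this row
  name STRING,
  complex1 STRUCT <f1: STRING, f2: BIGINT>,
  complex2 STRUCT <f1: STRING, f2: BIGINT>,
  complex3 STRUCT <f1: STRING, f2: BIGINT>
)
STORED AS PARQUET;

Each row fills only the struct named in its name column and leaves the others NULL.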
10-09-2018
02:58 AM
Hi all, we have our cluster deployed on AWS EC2 instances where some of the worker nodes are on spot instances. Usually there is no problem when spot instances disappear; we have time to decommission them from CM. Recently we have started to experience a ResourceManager crash when we lose spot instances. See the log below. After the ResourceManager crashes it does not restart automatically, and after a while all of our remaining NodeManager processes are shut down as well, leaving no YARN capacity at all even though we have plenty of healthy machines. We are using CDH 5.14.2.

1. Is the problem in the stack trace below ("Timer already cancelled") known?
2. Can we change the configuration to have the ResourceManager automatically recover from this? I only see an automatic restart option for the JobHistory Server in CM, but perhaps this is the same process?

Br, Petter

2018-10-08 16:14:45,617 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.IllegalStateException: Timer already cancelled.
at java.util.Timer.sched(Timer.java:397)
at java.util.Timer.schedule(Timer.java:193)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.preemptContainers(FSPreemptionThread.java:212)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:77)
2018-10-08 16:14:45,623 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down the resource manager.
2018-10-08 16:14:45,624 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2018-10-08 16:14:45,629 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@ip-10-255-4-86.eu-west-1.compute.internal:8088
2018-10-08 16:14:45,731 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032
2018-10-08 16:14:45,732 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8032
2018-10-08 16:14:45,732 INFO org.apache.hadoop.ipc.Server: Stopping server on 8033
2018-10-08 16:14:45,732 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-10-08 16:14:45,732 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8033
2018-10-08 16:14:45,733 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-10-08 16:14:48,250 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at ip-10-255-4-86.eu-west-1.compute.internal/10.255.4.86:8033
2018-10-08 16:14:49,643 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ip-10-255-4-86.eu-west-1.compute.internal/10.255.4.86:8033. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-10-08 16:14:50,644 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ip-10-255-4-86.eu-west-1.compute.internal/10.255.4.86:8033. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-10-08 16:14:51,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ip-10-255-4-
12-12-2017
07:10 AM
Hi, great! It solved my problem! For other users in the future: we upgraded a 5.10.1 cluster (without Kudu) to a 5.12.1 cluster (with Kudu). The missing part was the configuration option 'Kudu Service', which was set to none in the Impala Service-Wide configuration. Setting this to Kudu inserts the impalad startup option -kudu_master_hosts, and after that I can create tables without the TBLPROPERTIES clause and Sentry works as expected. Thank you very much, Hao!
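For other readers, the simplified statement from further down in this thread then reduces to something like this (table, columns, and range are placeholders):

CREATE TABLE my_db.my_table (
  key BIGINT,
  value STRING,
  PRIMARY KEY(key)
)
PARTITION BY RANGE (key)
(
  PARTITION 1 <= VALUES < 1000
)
-- no TBLPROPERTIES ('kudu.master_addresses'=...) needed once the Kudu Service option is set
STORED AS KUDU;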
12-11-2017
12:27 AM
Hi,

>Would you mind sharing the query how you create a new table? Did you happen to set kudu master addresses in TBLPROPERTIES clause?

I did use the TBLPROPERTIES clause. I read somewhere that it should not be needed when running in a CM environment, but in our case we have to specify it. I see now that CM has not added the --tserver_master_addrs flag to the gflagfile. See below for a simplified CREATE TABLE statement.

CREATE TABLE my_db.my_table
(
key BIGINT,
value STRING,
PRIMARY KEY(key)
)
PARTITION BY RANGE (key)
(
PARTITION 1 <= VALUES < 1000
)
STORED AS KUDU
TBLPROPERTIES ('kudu.master_addresses'='my-master-address');

Are you saying that it will work (with Sentry) if we add the --tserver_master_addrs flag to the tservers and remove the TBLPROPERTIES clause?

Br, Petter
12-08-2017
05:50 AM
Hi, thank you for your reply!

>Sorry, I missed that you are using external Kudu tables in the previous reply.

They are in fact internal tables; I do not use the EXTERNAL keyword when creating them. The only way I can let one user group (ROLE in Sentry) create their own Kudu tables (via Impala) is to grant ALL privileges at the server level. This has the side effect that the user group gets access to all data on the cluster, which is not desired. Granting ALL at the (Impala) db level does not help. Have I missed something? Will finer-grained access arrive in the future?

Br, Petter
12-06-2017
04:53 AM
Hi, we have a Sentry role that has action=ALL on db=my_db. When trying to issue a CREATE TABLE statement in Impala to create a Kudu table in my_db, we get the following error:

I1205 12:32:21.124711 47537 jni-util.cc:176] org.apache.impala.catalog.AuthorizationException: User 'my_user' does not have privileges to access: name_of_sentry_server

A workaround is to grant action=ALL on the server level to the Sentry role, but we don't want to give such a wide permission to the role. Do we need to set action=ALL on the server level in order to delegate the right to create Kudu tables to our users, or how should we set up Sentry in this case? We use CDH 5.12.1 (Kudu 1.4.0).

Br, Petter
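For context, the two grants we compared look something like this when issued from impala-shell (the role name is a placeholder):

-- db-level grant: not sufficient for CREATE TABLE ... STORED AS KUDU in our setup
GRANT ALL ON DATABASE my_db TO ROLE my_role;

-- server-level grant: works, but gives the role access to everything on the cluster
GRANT ALL ON SERVER TO ROLE my_role;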
08-23-2017
05:28 AM
Hi, we are experiencing the same issue. We are on CDH 5.10.1. Our corresponding figures read:

Planner Timeline
Analysis finished: 63015903
Equivalence classes computed: 63148873
Single node plan created: 72171446
Runtime filters computed: 72242303
Distributed plan created: 72530789
Lineage info computed: 72627976
Planning finished: 74054390
Query Timeline
Query submitted: 0
Planning finished: 212302910792
Submit for admission: 212305910788
Completed admission: 212305910788
Ready to start 13 fragment instances: 212306910788
All 13 fragment instances started: 212314910786
Rows available: 216195909152
Cancelled: 223800905948
Unregister query: 223816905942

Br, Petter
08-10-2017
02:04 AM
1 Kudo
Hi, we are experiencing the same or a similar problem. We get a lot of the following in cloudera-scm-agent.log:

[10/Aug/2017 08:00:33 +0000] 11211 ImpalaDaemonQueryMonitoring throttling_logger ERROR (31 skipped) Error fetching executing query profile at 'http://our_host_name:25000/query_profile_encoded'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.1-py2.7.egg/cmf/monitor/impalad/query_monitor.py", line 526, in get_executing_query_profile
password=password)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.1-py2.7.egg/cmf/url_util.py", line 67, in urlopen_with_timeout
return opener.open(url, data, timeout)
File "/usr/lib64/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib64/python2.7/urllib2.py", line 1217, in do_open
r = h.getresponse(buffering=True)
File "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
response.begin()
File "/usr/lib64/python2.7/httplib.py", line 444, in begin
version, status, reason = self._read_status()
File "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
line = self.fp.readline(_MAXLINE + 1)
File "/usr/lib64/python2.7/socket.py", line 476, in readline
data = self._sock.recv(self._rbufsize)
timeout: timed out

This results in an IMPALAD_QUERY_MONITORING_STATUS alert. We are running CDH 5.10.1 on Ubuntu 14. I guess the load is fairly high on the nodes, but not through the roof.
06-07-2017
07:18 AM
We are feeling the same pain here. In Cloudera Manager there is usually a "safety valve" for the relevant configuration files, giving you the opportunity to tweak the configuration for each role. In the Spark2 section of Cloudera Manager there is no safety valve for hive-site.xml.

Br, Petter
01-10-2017
11:34 AM
Hi Tim, thank you for taking the time to look at this issue! Br, Petter
01-10-2017
05:37 AM
1 Kudo
Hi all, I reported IMPALA-4725 last week but it seems like it has not been triaged yet. I wanted to bring some more attention to this issue (and gather possible suggestions for workarounds) since it has a heavy impact on us. To summarize, it seems like Impala mixes up values in arrays of structs, which to me looks like a fundamental problem in the Parquet reader. Alternatively, the values get mixed up when presented as a result. Either way, I would very much appreciate an informed person's view on this issue. We are running the Impala version bundled with CDH 5.8.3.

Br, Petter
- Tags:
- CDH 5.8.3
11-29-2016
07:25 AM
Hi all, Best Practices for Using Impala with S3 states "Set the safety valve fs.s3a.connection.maximum to 1500 for impalad." Can anyone clarify which safety valve field should be used, and with what syntax? I read somewhere that this setting belongs in core-site.xml, but the Impala configuration in Cloudera Manager does not seem to have a safety valve for core-site.xml. The instructions mention the safety valve for impalad, but that safety valve seems to be for command line arguments to impalad. The problem we are trying to address is the following error, which we keep getting when using Impala to query data stored in S3:

hdfsSeek(desiredPos=503890631): FSDataInputStream#seek error: com.cloudera.com.amazonaws.AmazonClientException: Unable to execute HTTP request: Timeout waiting for connection from pool

We are using CDH 5.8.3.

Thanks, Petter
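If the setting does indeed belong in core-site.xml, I would expect the entry to look something like this (the value is taken from the guide), pasted into whichever core-site.xml safety valve turns out to apply:

<property>
  <name>fs.s3a.connection.maximum</name>
  <value>1500</value>
</property>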
08-18-2016
11:51 PM
Hi, thank you very much for your reply! Just a follow-up question. Assume a scenario where we target, say, 10 GB of data stored as gzipped Parquet in each partition. We have three nodes currently, but that will increase soon. From an Impala performance perspective, which of the approaches below is best?

- Store the data in 40 Parquet files with file size = row group size = HDFS block size = 256 MB
- Store the data in 10 Parquet files with file size = row group size = HDFS block size = 1 GB
- Store the data in 10 Parquet files with file size 1 GB, row group size = HDFS block size = 256 MB

Thanks, Petter
08-12-2016
02:45 AM
I have described an issue with time-consuming Parquet file generation in the Hive forum; see that post for a description of the environment. The question is half Impala-related, so I would appreciate it if any Impala experts here could read that post as well. https://community.cloudera.com/t5/Batch-SQL-Apache-Hive/How-to-improve-performance-when-creating-Parquet-files-with-Hive/m-p/43804#U43804

I have some additional questions that are Impala-specific. The environment currently has three Impala nodes with 5-10 GB worth of data in each partition. The question is how I should generate the Parquet files to get the most performance out of Impala. Currently I target a Parquet file size of 1 GB each. The HDFS block size is set to 256 MB for these files, and I have instructed Hive to create row groups of the same size. Surprisingly, I get many more row groups; I just picked a random file and it contained 91 row groups. Given our environment, what should we aim for in terms of file size, number of row groups per file, and HDFS block size for these files? Also, if it would be more beneficial to have fewer row groups per file, how can we instruct Hive to generate fewer row groups, since Hive does not seem to respect the parquet.block.size option? We use the Impala version bundled with CDH 5.7.1.

Thanks in advance, Petter
08-11-2016
01:33 AM
1 Kudo
We are generating Parquet files (to be used by Impala) daily with Hive. The reason is that the source file format is proprietary and not supported by Impala. The process works fine, but it takes a long time for each conversion job to finish. It seems like the process of writing Parquet files is very time-consuming: the job uses very few map tasks, and each map task can take several hours to complete. We are interested in getting a Parquet file layout (i.e. file size and page size) that will be performant when used with Impala.

Each conversion job generates three tables. The most time-consuming table to generate has approximately 200 columns, where 30 columns have scalar types and 170 have complex (nested) data types. The data in the 170 complex columns can be very skewed: some column values can be a few bytes in size and others up to 1 MB, and many column values can also be NULL. So it is fair to say that the table is wide and sparse. The total daily size of the Parquet files generated for this table varies around 5-10 GB (using GZIP compression).

The Hive MR job we use to generate the files comprises two map-only stages. The last stage is only used to even out the resulting file sizes (hive.merge.mapfiles=true) so they average 1 GB. I am not sure this stage is needed; I guess it depends on how well Impala handles smaller files. The last stage doubles the total job time, and I think the reason is that the job has to write Parquet files twice (once per stage). I have not found a way of controlling the intermediate file format when using hive.merge.mapfiles. I suspect that another intermediate file format would speed things up a lot, but it seems like it is not configurable.

Is there anybody out there with Parquet generation knowledge who can help us look at the parameters we use (input size, buffers, heap size etc.) or who has an opinion on whether the second stage can be skipped? Also, it seems like the dfs.blocksize and parquet.block.size parameters are not respected: we set the Parquet block size to 256 MB, but the resulting files are generated with more, smaller blocks. Perhaps this is a result of the skewness of the data. We are using CDH 5.7.1.

Destination table:

CREATE EXTERNAL TABLE IF NOT EXISTS Destination (
col1 STRING,
col2 STRING,
.
.
-- Example of one of the simpler structs
col200 struct<field1:string,field2:string,field3:string,field4:int>)
PARTITIONED BY (import_date STRING)
STORED AS PARQUET
LOCATION '/path/to/destination';

Conversion job:

SET parquet.compression=GZIP;
SET hive.merge.mapfiles=true;
SET hive.merge.smallfiles.avgsize=1073741824;
SET mapred.max.split.size=1073741824;
SET dfs.blocksize=268435456;
SET parquet.block.size=268435456;
SET mapreduce.map.memory.mb=4096;
SET mapreduce.reduce.memory.mb=2048;
SET mapreduce.map.java.opts.max.heap=3277;
SET mapreduce.reduce.java.opts.max.heap=1638;
SET mapreduce.task.io.sort.mb=1000;
SET mapreduce.task.io.sort.factor=100;
SET mapred.compress.map.output=true;
SET mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
INSERT OVERWRITE TABLE Destination PARTITION (import_date='2016-08-10')
SELECT col1, col2, ..., col200
FROM Source
WHERE import_date='2016-08-10';

Thanks in advance, Petter
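On the question of whether the second stage can be skipped: the straightforward experiment for us would be to rerun the same job with only the merge setting flipped and compare file sizes and query performance, i.e. something like:

-- rerun the conversion job with only this setting changed, to test skipping the merge stage
SET hive.merge.mapfiles=false;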
03-01-2016
07:31 AM
We are using CDH 5.5.1 with Kerberos enabled. Every 5th to 10th time we restart the Hive group (including the metastore) in Cloudera Manager, the metastore fails to access the metastore DB. It seems like the password is forgotten. We've seen this both when the DB is Postgres on Amazon RDS and when using a local Postgres (as provided by the cloudera-manager-server-db-2.x86_64 package). This started to happen after upgrading to CDH 5.5.1 from CDH 5.3.x. One change in the environment that seems related is the introduction of the Hadoop credential provider to store the password. Has anybody else experienced this issue?

Config:

<property>
<name>hadoop.security.credential.provider.path</name>
<value>localjceks://file//run/cloudera-scm-agent/process/352-hive-HIVEMETASTORE/creds.localjceks</value>
</property>

Error:

2016-02-29 10:45:08,893 ERROR DataNucleus.Datastore.Schema: [main]: Failed initialising database.
Unable to open a test connection to the given database. JDBC url = jdbc:postgresql://x.y.z.com:5432/metastore, username = hive. Terminating connection pool (set lazyInit to true if you expect
to start your database after your app). Original Exception: ------
org.postgresql.util.PSQLException: FATAL: password authentication failed for user "hive"
at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:291)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:108)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:125)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:30)
at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:393)
at org.postgresql.Driver.connect(Driver.java:267)
at java.sql.DriverManager.getConnection(DriverManager.java:571)
at java.sql.DriverManager.getConnection(DriverManager.java:187)
at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
...
02-29-2016
11:46 AM
I have a wide table with a lot of complex types from which I would like to create simpler views for end-user convenience. Ideally these views would expose one or more structs unaltered from the more complex table backing the view, while operations and renames could occur on the other scalar fields. See the snippet below.

CREATE TABLE complex_table (
scalar1 BIGINT,
scalar2 BIGINT,
struct1 STRUCT <f1: STRING, f2: BIGINT>,
struct2 STRUCT <f1: STRING, f2: BIGINT>
)
STORED AS PARQUET;

CREATE VIEW simple_view AS
SELECT scalar1*2 AS my_field1, scalar2*4 AS my_field2, struct1, struct2
FROM complex_table;

This is currently not possible, since a view cannot include a struct column but has to expand the fields of the structs into scalars in order to use them. Could anyone comment on whether this will hold true for the foreseeable future, or whether intermediate results from inner selects will be allowed to contain structs in the future? Any alternative approaches are also welcome. The environment is Impala 2.3.0 in CDH 5.5.1.

Br, Pettax
- Tags:
- impala
02-24-2016
12:25 PM
Thank you for your prompt reply! I will hold off on my attempts to use this operation.
02-23-2016
05:16 AM
1 Kudo
It does not seem like the IS NULL / IS NOT NULL operators are supported for struct data types. We are using Impala 2.3.0 / CDH 5.5.1. This seems like a basic and vital operator to have, especially when using wide tables. Is there anybody out there who has a patch or workaround, or who has actually succeeded in using this operator on structs? I have reported IMPALA-3060 on the topic.
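For reference, the kind of predicate that fails for us looks like this (table and column names are made up):

SELECT id
FROM wide_table
-- the predicate below is rejected, since some_struct is a STRUCT column
WHERE some_struct IS NOT NULL;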
02-01-2016
03:22 AM
Thank you for your reply! I also noticed that if I check the option "Enable Kerberos Authentication for HTTP Web-Consoles" in the YARN configuration, I can make the kill button work. However, this enables Kerberos for web pages such as those of the History Server and Resource Manager, and we do not want Kerberos authentication on those pages. So, with the fix in CDH 5.5.3, the kill button will work without enabling the above option, I assume?
01-26-2016
01:31 AM
I'm using CDH 5.5.0 with Kerberos and Sentry enabled. Trying to kill a job from the Job Browser fails with the message "There was a problem communicating with the server: The default static user cannot carry out this operation. (error 403)". I can kill the same job using the yarn application -kill command. I guess this is a configuration issue. Could someone assist me in getting this right so that I can kill jobs from the Job Browser?

Stack trace:

[26/Jan/2016 10:15:30 +0100] access WARNING 10.128.42.143 di23060584 - "POST /jobbrowser/jobs/application_1453476679853_0011/kill HTTP/1.1"
[26/Jan/2016 10:15:30 +0100] connectionpool INFO Resetting dropped connection: ip-10-255-2-7.eu-west-1.compute.internal
[26/Jan/2016 10:15:30 +0100] kerberos_ ERROR handle_mutual_auth(): Mutual authentication unavailable on 403 response
[26/Jan/2016 10:15:30 +0100] views ERROR Killing job
Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/apps/jobbrowser/src/jobbrowser/views.py", line 246, in kill_job
job.kill()
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/apps/jobbrowser/src/jobbrowser/yarn_models.py", line 185, in kill
return self.api.kill(self.id)
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/desktop/libs/hadoop/src/hadoop/yarn/mapreduce_api.py", line 117, in kill
get_resource_manager(self._user).kill(app_id) # We need to call the RM
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/desktop/libs/hadoop/src/hadoop/yarn/resource_manager_api.py", line 124, in kill
return self._execute(self._root.put, 'cluster/apps/%(app_id)s/state' % {'app_id': app_id}, params=params, data=json.dumps(data), contenttype=_JSON_CONTENT_TYPE)
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/desktop/libs/hadoop/src/hadoop/yarn/resource_manager_api.py", line 141, in _execute
response = function(*args, **kwargs)
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/desktop/core/src/desktop/lib/rest/resource.py", line 136, in put
return self.invoke("PUT", relpath, params, data, self._make_headers(contenttype))
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/desktop/core/src/desktop/lib/rest/resource.py", line 78, in invoke
urlencode=self._urlencode)
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/desktop/core/src/desktop/lib/rest/http_client.py", line 161, in execute
raise self._exc_class(ex)
RestException: The default static user cannot carry out this operation. (error 403)
[26/Jan/2016 10:15:30 +0100] middleware INFO Processing exception: The default static user cannot carry out this operation. (error 403): Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/handlers/base.py", line 112, in get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/db/transaction.py", line 371, in inner
return func(*args, **kwargs)
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/apps/jobbrowser/src/jobbrowser/views.py", line 83, in decorate
return view_func(request, *args, **kwargs)
File "/opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/lib/hue/apps/jobbrowser/src/jobbrowser/views.py", line 249, in kill_job
raise PopupException(e)
PopupException: The default static user cannot carry out this operation. (error 403)
01-07-2016
01:38 PM
Ah, more information for the team to work with then! Let's hope for a solution.
01-07-2016
10:34 AM
1 Kudo
Thank you Alex for your quick reply and confirmation! I've created IMPALA-2820 to track this issue.
01-07-2016
07:53 AM
I have been testing CDH 5.5.0 and have noted that Impala does not like reserved words as field names in complex types. This seems strange, as reserved words can be used as column names for ordinary columns, and Hive does not impose the same restriction; reserved words can be back-ticked where needed. Does anybody know if this is by design or if this is an issue in Impala 2.3.0? We are using Hive to create Parquet files with complex types. A sample that reproduces the issue and the error message are below. In this case the word 'replace' is reserved.

In Hive:

CREATE EXTERNAL TABLE MyTable (
device_id STRING,
added struct<name:string,version_name:string,version_code:int,`replace`:boolean>
)
STORED AS PARQUET
LOCATION '/tmp/impala/mytable';

In Hive:

INSERT OVERWRITE TABLE MyTable
SELECT
device_id,
payload AS added
FROM Added where import_id = 106000;

In Impala:

SELECT * FROM MyTable limit 10;

Output:

AnalysisException: Failed to load metadata for table: 'mytable' CAUSED BY: TableLoadingException: Unsupported type 'struct<name:string,version_name:string,version_code:int,replace:boolean>' in column 'added' of table 'mytable'
I0107 15:56:01.251721 21006 Frontend.java:818] analyze query SELECT * FROM MyTable limit 10
E0107 15:56:01.252320 21006 Analyzer.java:2212] Failed to load metadata for table: mytable
Unsupported type 'struct<name:string,version_name:string,version_code:int,replace:boolean>' in column 'added' of table 'mytable'
I0107 15:56:01.252908 21006 jni-util.cc:177] com.cloudera.impala.common.AnalysisException: Failed to load metadata for table: 'MyTable'
at com.cloudera.impala.analysis.TableRef.analyze(TableRef.java:180)
at com.cloudera.impala.analysis.Analyzer.resolveTableRef(Analyzer.java:512)
at com.cloudera.impala.analysis.SelectStmt.analyze(SelectStmt.java:155)
at com.cloudera.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:342)
at com.cloudera.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:317)
at com.cloudera.impala.service.Frontend.analyzeStmt(Frontend.java:827)
at com.cloudera.impala.service.Frontend.createExecRequest(Frontend.java:856)
at com.cloudera.impala.service.JniFrontend.createExecRequest(JniFrontend.java:147)
Caused by: com.cloudera.impala.catalog.TableLoadingException: Unsupported type 'struct<name:string,version_name:string,version_code:int,replace:boolean>' in column 'added' of table 'mytable'
at com.cloudera.impala.catalog.IncompleteTable.loadFromThrift(IncompleteTable.java:111)
at com.cloudera.impala.catalog.Table.fromThrift(Table.java:240)
at com.cloudera.impala.catalog.ImpaladCatalog.addTable(ImpaladCatalog.java:357)
at com.cloudera.impala.catalog.ImpaladCatalog.addCatalogObject(ImpaladCatalog.java:246)
at com.cloudera.impala.catalog.ImpaladCatalog.updateCatalog(ImpaladCatalog.java:132)
at com.cloudera.impala.service.Frontend.updateCatalogCache(Frontend.java:223)
at com.cloudera.impala.service.JniFrontend.updateCatalogCache(JniFrontend.java:164)
at ========.<Remote stack trace on catalogd>: com.cloudera.impala.catalog.TableLoadingException: Unsupported type 'struct<name:string,version_name:string,version_code:int,replace:boolean>' in column 'added' of table 'mytable'
at com.cloudera.impala.catalog.Table.parseColumnType(Table.java:331)
at com.cloudera.impala.catalog.HdfsTable.addColumnsFromFieldSchemas(HdfsTable.java:571)
at com.cloudera.impala.catalog.HdfsTable.load(HdfsTable.java:1073)
at com.cloudera.impala.catalog.TableLoader.load(TableLoader.java:84)
at com.cloudera.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:232)
at com.cloudera.impala.catalog.TableLoadingMgr$2.call(TableLoadingMgr.java:229)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
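For comparison, the same reserved word seems to be accepted as an ordinary back-ticked top-level column, as mentioned above (a quick sketch, the table name is made up):

CREATE TABLE reserved_word_test (
  device_id STRING,
  `replace` BOOLEAN  -- reserved word as a plain column name, back-ticked
)
STORED AS PARQUET;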