Member since: 01-07-2016
Posts: 26
Kudos Received: 8
Solutions: 0
02-22-2019
12:27 AM
Hi,

I have not followed the development of Impala lately. If this is still a limitation, you might try the following approach. Design the schema with an additional column that records which struct column holds the data in each row, and then use this additional column in the WHERE clause. Something like:

name      complex1  complex2  complex3
complex1  content   NULL      NULL
complex3  NULL      NULL      content

and then:

SELECT complex1.*
FROM myTable
WHERE name = 'complex1'

Br, Petter
10-09-2018
02:58 AM
Hi all,

we have our cluster deployed on AWS EC2 instances, where some of the worker nodes are on spot instances. Usually there is no problem when spot instances disappear; we have time to decommission them from CM. Recently we have started to experience a ResourceManager crash in connection with losing spot instances. See the log below. After the ResourceManager crashes it does not restart automatically, and after a while all of our remaining NodeManager processes are shut down as well, leaving no YARN capacity at all even though we have plenty of healthy machines. We are using CDH 5.14.2.

1. Is the problem in the stack trace below known (Timer already cancelled)?
2. Can we change the configuration to have the ResourceManager automatically recover from this? I only see an automatic restart option for the JobHistory Server in CM, but perhaps this is the same process?

Br, Petter

2018-10-08 16:14:45,617 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FSPreemptionThread, that exited unexpectedly: java.lang.IllegalStateException: Timer already cancelled.
at java.util.Timer.sched(Timer.java:397)
at java.util.Timer.schedule(Timer.java:193)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.preemptContainers(FSPreemptionThread.java:212)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSPreemptionThread.run(FSPreemptionThread.java:77)
2018-10-08 16:14:45,623 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down the resource manager.
2018-10-08 16:14:45,624 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2018-10-08 16:14:45,629 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@ip-10-255-4-86.eu-west-1.compute.internal:8088
2018-10-08 16:14:45,731 INFO org.apache.hadoop.ipc.Server: Stopping server on 8032
2018-10-08 16:14:45,732 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8032
2018-10-08 16:14:45,732 INFO org.apache.hadoop.ipc.Server: Stopping server on 8033
2018-10-08 16:14:45,732 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-10-08 16:14:45,732 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8033
2018-10-08 16:14:45,733 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-10-08 16:14:48,250 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at ip-10-255-4-86.eu-west-1.compute.internal/10.255.4.86:8033
2018-10-08 16:14:49,643 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ip-10-255-4-86.eu-west-1.compute.internal/10.255.4.86:8033. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-10-08 16:14:50,644 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ip-10-255-4-86.eu-west-1.compute.internal/10.255.4.86:8033. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-10-08 16:14:51,647 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: ip-10-255-4-
Labels: Apache YARN, Cloudera Manager
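As background for the `Timer already cancelled` message in the stack trace above: this is the text `java.util.Timer` uses for the `IllegalStateException` it throws when `schedule()` is called on a timer that can no longer accept tasks. The sketch below reproduces only that JDK behavior by calling `cancel()` first. The class and method names are made up for illustration, and it makes no claim about what put the timer inside `FSPreemptionThread` into that state.

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical demo class; not part of Hadoop or YARN.
class TimerCancelledDemo {

    // Cancels a Timer, then tries to schedule a task on it.
    // Returns the exception message, or null if nothing was thrown.
    static String scheduleAfterCancel() {
        Timer timer = new Timer("demo-timer", true); // daemon timer thread
        timer.cancel(); // after this, the timer rejects new tasks
        try {
            timer.schedule(new TimerTask() {
                @Override
                public void run() {
                    // never reached: the task is rejected before it is queued
                }
            }, 0);
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(scheduleAfterCancel());
    }
}
```

A `Timer` also ends up in this state if one of its tasks throws an unchecked exception, because that terminates the timer thread; later `schedule()` calls then fail with the same message.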
12-12-2017
07:10 AM
Hi,

great! It solved my problem!

For other users in the future: We upgraded a 5.10.1 cluster (without Kudu) to a 5.12.1 cluster (with Kudu). The missing part was the configuration option 'Kudu Service', which was set to none in the Impala Service-Wide configuration. Setting this to Kudu inserts the impalad startup option -kudu_master_hosts, and after that I can create tables without the TBLPROPERTIES clause and Sentry now works as expected.

Thank you very much, Hao!
12-11-2017
12:27 AM
Hi,

>Would you mind sharing the query how you create a new table? Did you happen to set kudu master addresses in TBLPROPERTIES clause?

I did use the TBLPROPERTIES clause. I read somewhere that it should not be needed when running in a CM environment, but in our case we have to specify it. I see now that CM has not added the --tserver_master_addrs flag to the gflagfile. See below for a simplified CREATE TABLE statement.

CREATE TABLE my_db.my_table
(
key BIGINT,
value STRING,
PRIMARY KEY(key)
)
PARTITION BY RANGE (key)
(
PARTITION 1 <= VALUES < 1000
)
STORED AS KUDU
TBLPROPERTIES ('kudu.master_addresses'='my-master-address');

Are you saying that it will work (with Sentry) if we add the --tserver_master_addrs to the tservers and remove the TBLPROPERTIES clause?

Br, Petter
12-08-2017
05:50 AM
Hi,

thank you for your reply!

>Sorry, I missed that you are using external Kudu tables in the previous reply.

They are in fact internal tables. I do not use the EXTERNAL keyword when creating the tables.

The only way I can let one user group (ROLE in Sentry) create their own Kudu tables (via Impala) is to give it the ALL privilege on the server level. This has the side effect that this user group will enjoy access to all data on the cluster, which is not desired. Granting ALL on the (Impala) db level does not help.

Have I missed something? Will finer-grained access arrive in the future?

Br, Petter
12-06-2017
04:53 AM
Hi,

we have a Sentry role that has action=ALL on db=my_db. When trying to issue a CREATE TABLE statement in Impala to create a Kudu table in my_db, we get the following error:

I1205 12:32:21.124711 47537 jni-util.cc:176] org.apache.impala.catalog.AuthorizationException: User 'my_user' does not have privileges to access: name_of_sentry_server

A workaround is to set action=ALL on the server level for the Sentry role, but we don't want to give the role such a wide permission. Do we need to set action=ALL on the server level in order to delegate the right to create Kudu tables to our users, or how could we set up Sentry in this case?

We use CDH 5.12.1 (Kudu 1.4.0).

Br, Petter
Labels: Apache Impala, Apache Kudu, Apache Sentry
01-10-2017
11:34 AM
Hi Tim, thank you for taking the time to look at this issue! Br, Petter
01-10-2017
05:37 AM
1 Kudo
Hi all,

I reported IMPALA-4725 last week, but it seems like it has not been triaged yet. I wanted to bring some more attention to this issue (and ask for possible workarounds), since it has a heavy impact on us.

To summarize, it seems like Impala mixes up values in arrays of structs, which to me seems like a fundamental problem in the Parquet reader. Alternatively, the values get mixed up when presented as a result. Either way, I would very much appreciate an informed person's view on this issue.

We are running the Impala that is bundled with CDH 5.8.3.

Br, Petter
Labels: Apache Impala
11-29-2016
07:25 AM
Hi all,

Best Practices for Using Impala with S3 states "Set the safety valve fs.s3a.connection.maximum to 1500 for impalad."

Can anyone clarify which safety valve field should be used, and with what syntax? I have read somewhere that this setting belongs in core-site.xml, but the Impala configuration in Cloudera Manager does not seem to have a safety valve for core-site.xml. The instructions mention a safety valve for impalad, but that safety valve seems to be for command-line arguments to impalad.

The problem we are trying to address is the following error, which we keep getting when using Impala to query data stored in S3:

hdfsSeek(desiredPos=503890631): FSDataInputStream#seek error: com.cloudera.com.amazonaws.AmazonClientException: Unable to execute HTTP request: Timeout waiting for connection from pool

We are using CDH 5.8.3.

Thanks, Petter
Labels: Apache Impala, Cloudera Manager
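For reference, Hadoop configuration files such as core-site.xml express each setting as a property element, so the value quoted in the post above would take roughly the following form. This is only a sketch of the property syntax, with the name and value taken from the quoted best-practice sentence. Which Cloudera Manager safety valve field accepts it for impalad is the open question of the post and is not answered here.

```xml
<!-- Sketch of a core-site.xml style entry; placement in CM is an assumption left open -->
<property>
  <name>fs.s3a.connection.maximum</name>
  <value>1500</value>
</property>
```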