Member since
03-23-2015
1288
Posts
114
Kudos Received
98
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4365 | 06-11-2020 02:45 PM |
| | 5981 | 05-01-2020 12:23 AM |
| | 3799 | 04-21-2020 03:38 PM |
| | 4061 | 04-14-2020 12:26 AM |
| | 3030 | 02-27-2020 05:51 PM |
05-18-2019
05:02 PM
Hi Tomas, This message is normal behaviour and is expected when the DataNode's security key manager rolls its keys. Clients print it whenever they use an older cached key, but immediately after printing this message the client refetches the new key and the job completes. Since Impala is a client of HDFS, there is no cause for concern about this message; it is part of normal operation. We also see this in HBase logs, which is, again, normal. Hope the above helps. Cheers Eric
05-18-2019
04:55 PM
Hi Tomas, A couple of questions:
1. Is it happening on all tables with INSERT operations, or just certain tables? Is there any pattern in the type of tables that are failing?
2. What version of CDH are you using?
3. Can you please locate the host of the failed fragment and find the full stack trace for this exception? That might help us decide on the next step.
Cheers Eric
05-11-2019
06:09 PM
Hi, It looks like you are running Spark in cluster mode and your ApplicationMaster is running out of memory. In cluster mode the driver runs inside the AM; I can see that you have a driver memory of 110GB and an executor memory of 12GB. Have you tried increasing both of them to see if that helps? I cannot say by how much, but try increasing gradually. That said, a driver memory of 110GB already seems like a lot; I am wondering what kind of dataset this Spark job is processing, and how large the volume is. Cheers Eric
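As a rough sketch of what adjusting those settings looks like on the submit command (the memory values and the job file name here are made-up placeholders, not recommendations):

```shell
# Hypothetical spark-submit in YARN cluster mode. In cluster mode the
# driver runs inside the ApplicationMaster, so --driver-memory sizes
# the AM container's heap. Increase values gradually rather than in
# large jumps, and watch the AM/executor logs for OOM after each change.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 120g \
  --executor-memory 16g \
  your_job.py
```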
05-11-2019
05:53 PM
Hi, Have you tried to get the logs from the command line?

For a secured cluster, run as the "yarn" superuser:

yarn logs -applicationId {application_id} -appOwner {username}

For an unsecured cluster, run as the "root" superuser:

sudo -u yarn yarn logs -applicationId {application_id}

Do you get any output? Without seeing the errors, it is hard to tell why the next page does not load when you click the "History" button on that page; it can be caused by many factors. Cheers Eric
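As a concrete sketch of running those commands (the application ID and owner below are placeholders I made up; substitute your real values):

```shell
# Hypothetical values: replace with your real application ID and owner.
APP_ID=application_1557500000000_0042
OWNER=someuser

# Secured (Kerberized) cluster: the "yarn" superuser fetches logs
# on behalf of the application owner, captured to a file for review.
yarn logs -applicationId "$APP_ID" -appOwner "$OWNER" > app_logs.txt

# Unsecured cluster: impersonate the "yarn" user with sudo instead.
sudo -u yarn yarn logs -applicationId "$APP_ID" > app_logs.txt
```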
05-10-2019
04:09 AM
1 Kudo
Hi Ajay, As I mentioned in the previous post: Hue holds the query handle open so that it can paginate the results, and it only closes the handle after the user navigates away from the Impala page. If the user stays on the page, the handle is kept open and the query is considered in flight. This is intended and by design. If you do not want it to stay open for a long time, you need to set idle_session_timeout at the Impala level. Cheers Eric
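For reference, a minimal sketch of that setting, assuming the Impala daemons are started with command-line flags (in Cloudera Manager this corresponds to the Impala Daemon timeout configuration). The values are illustrative, not recommendations:

```shell
# Config fragment, not a full impalad invocation.
# idle_session_timeout: close sessions idle for longer than N seconds,
# which also cancels any query still open on that session.
# idle_query_timeout: cancel individual queries idle longer than N seconds.
impalad --idle_session_timeout=1800 --idle_query_timeout=600
```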
05-10-2019
04:05 AM
Hi, In that case, replacing it with the actual file might be a good test to confirm whether the soft link is the cause here. Do you know how often it happens? Can you scan through the HS2 logs and check the timing? The pattern might also help tell the story. I am currently running out of ideas.
05-09-2019
05:17 PM
1 Kudo
Hi Sona,

>>> But this problem makes things tougher frm the admins perspective, jobs submitted from Hue is running as Hive user on Yarn..

The jobs will be submitted under the queue that is configured in the cluster, so resources can still be controlled based on the end users, not the "hive" user.

>>> Also most of the users will be creating external tables for thier work n store it at thier respective hdfs path, so setting the path ownership as user:usergrp is prohibitting the "disabled impersonated" hive user from hue,, to unable to write at the mentioned path....

All HDFS paths that store data for Hive databases/tables should be owned by "hive", and permissions for end users should be handled via the Sentry HDFS ACL sync: grant privileges to end users through Sentry, and the ACLs will be synced to HDFS. That way everything is managed by Hive/Sentry, and Hive/Sentry can give permissions to end users.

>>> So everytime have to set acl for everyone?..

You can set this up at the database level, so there is no need to set it for every table.

>>> and every sub directory ownership will change?..

Yes.

>>> What if the user if running on beeline?.. so still change the path ownership to hive:hive?..

After enabling Sentry you should have switched to Beeline already; the Hive CLI is deprecated and will not work properly in a Sentry-enabled environment.

Hope the above helps.

Cheers Eric
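A minimal sketch of the database-level grant described above, assuming a role name "analysts", a group "usergrp", a database "mydb", and a HiveServer2 host/realm that are all placeholders:

```shell
# Config/CLI fragment: run as a Sentry admin user via Beeline.
# Role name, group, database, host, and Kerberos realm are illustrative.
beeline -u "jdbc:hive2://hs2-host:10000/default;principal=hive/_HOST@EXAMPLE.COM" -e "
  CREATE ROLE analysts;
  GRANT ROLE analysts TO GROUP usergrp;
  GRANT ALL ON DATABASE mydb TO ROLE analysts;
"
```

With HDFS ACL sync enabled, a grant at the database level propagates to the database's directory and its subdirectories, which is why per-table ACLs are not needed.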
05-09-2019
05:08 PM
Hi, I thought you were just updating the partition column type. If the partitions contain different data, how do you plan to manage the old partitions? This will affect how you implement the change. Cheers Eric
05-08-2019
11:38 PM
Hi, I am seeing: CancelOperation: HardyTCLIServiceThreadSafeClient::CancelOperation Have you checked the HS2 log to see if there is anything useful on the server side? It does not make sense that SELECT works but INSERT times out; there is no separate configuration for them. What about a CTAS query? Cheers Eric
05-08-2019
11:33 PM
Have you tried my suggestion, which would avoid copying the data? Cheers Eric