Member since: 09-14-2017
Posts: 120
Kudos Received: 11
Solutions: 5

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3136 | 06-17-2021 06:55 AM
 | 1922 | 01-13-2021 01:56 PM
 | 17196 | 11-02-2017 06:35 AM
 | 18996 | 10-04-2017 02:43 PM
 | 34411 | 09-14-2017 06:40 PM
01-12-2021
02:07 PM
@ebeb What you are describing is a scenario for cross-realm trust. In such a setup you might have all of the cluster principals in realm A and all of the users in realm B, with a trust established between A and B. Here is the doc for reference: https://docs.cloudera.com/documentation/enterprise/latest/topics/cm_sg_kdc_def_domain_s2.html
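A minimal sketch of how such a trust is typically established on MIT KDCs (the realm names below are hypothetical placeholders):

# CLUSTER.A.COM holds the cluster principals, USERS.B.COM holds the users.
# On BOTH KDCs, create the same cross-realm principal with an identical password
# so that users from realm B can obtain service tickets in realm A:
kadmin.local -q "addprinc krbtgt/CLUSTER.A.COM@USERS.B.COM"

The cluster side also has to be told to accept tickets from the user realm (in CM this is typically the Trusted Kerberos Realms setting plus matching auth_to_local rules), which is what the linked doc walks through.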
01-07-2021
02:04 PM
One specific Impala connection pool timeout error got resolved by increasing fe_service_threads in the Impala configuration in CM from 64 to 128 and then restarting the Impala daemons, as per the recommendation below.

Recommended configuration settings for best performance with Impala:
https://docs.cloudera.com/best-practices/latest/impala-performance/topics/bp-impala-recommended-configurations.html#:~:text=Set%20the%20%2D%2Dfe_service_threads%20startup,of%20concurrent%20client%20connections%20allowed.
"Set the --fe_service_threads startup option for the Impala daemon (impalad) to 256. This option specifies the maximum number of concurrent client connections allowed. See Startup Options for impalad Daemon for details."

Below are the errors that got resolved by increasing the Impala daemon pool size:

com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: com.zaxxer.hikari.pool.PoolInitializationException: Exception during pool initialization: [Cloudera][ImpalaJDBCDriver](700100) Connection timeout expired. Details: None.

SQL Error [3] [S1000]: [Cloudera][ThriftExtension] (3) Error occurred while contacting server: ETIMEDOUT. The connection has been configured to use a SASL mechanism for authentication. This error might be due to the server is not using SASL for authentication.
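For reference, a sketch of how the flag ends up on the impalad command line (the CM field name is from memory, and 128 is simply the value that worked in this case):

# In CM: Impala Daemon Command Line Argument Advanced Configuration Snippet (Safety Valve)
--fe_service_threads=128

# After restarting the Impala daemons, confirm the flag is in effect on each host:
ps -ef | grep impalad | grep fe_service_threads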
11-18-2020
04:06 PM
Thanks for the solution!! I hit the same issue: after enabling MIT Kerberos in the CDH 5.16.2 cluster, ZooKeeper wouldn't start, with the message:

javax.security.auth.login.LoginException: Message stream modified (41)

I was using openjdk version "1.8.0_272". As per your solution, I commented out this line in /etc/krb5.conf on all servers:

#renew_lifetime = 604800

After restarting the cluster, all services worked except the Hue Kerberos Ticket Renewer, which failed with:

Couldn't renew kerberos ticket in order to work around Kerberos 1.8.1 issue. Please check that the ticket for 'hue/fqdn@KRBREALM' is still renewable

The Kerberos Ticket Renewer is a separate issue; on the MIT KDC server we needed to run:

kadmin.local: modprinc -maxrenewlife 90day krbtgt/KRBREALM
kadmin.local: modprinc -maxrenewlife 90day +allow_renewable hue/fqdn@KRBREALM

(run for each Hue server's fqdn). After that the Hue Kerberos Ticket Renewer restarted successfully.
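A quick way to confirm the change took effect (a sketch; the principal names follow the placeholders above and the keytab path is hypothetical):

# Verify the maximum renewable life on the KDC:
kadmin.local -q "getprinc krbtgt/KRBREALM" | grep -i "renewable life"
kadmin.local -q "getprinc hue/fqdn@KRBREALM" | grep -i "renewable life"

# On a Hue host, check that a freshly obtained ticket is actually renewable
# (klist should show a "renew until" time in the future and the R flag with -f):
kinit -kt /path/to/hue.keytab hue/fqdn@KRBREALM
klist -f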
08-15-2020
01:01 PM
2 Kudos
@ebeb Yeah, “Private Cloud Base” is the new name for CDP Data Center. Sorry for the inconvenience. Please use the same doc.
08-05-2020
08:16 AM
Thanks for the info. This is not well documented on the CM upgrade page; it would be good to have this info on how to generate credentials there: https://docs.cloudera.com/cdp/latest/upgrade-cdh/topics/ug_cm_upgrade_server.html

However, after clicking the Download Now option from your link I now get a different error, which means we need to work with Cloudera to get entitled for a CDP Data Center license. Thanks!

Access Restricted
You must be a CDP Data Center customer to access these downloads. If you believe you should have this entitlement then please reach out to support or your customer service representative.
05-09-2020
09:58 AM
I did the following steps and it helped:

1) Copied the parcel torrent into the parcel cache:
[root@cm_scm_103 cloudera-scm-agent]# cp /opt/cloudera/parcel-repo/STREAMSETS_DATACOLLECTOR-3.14.0-el7.parcel.torrent /opt/cloudera/parcel-cache/
[root@cm_scm_103 cloudera-scm-agent]# ls -ltr /opt/cloudera/parcel-cache/STREAMSETS_DATACOLLECTOR-3.14.0-el7.parcel.torrent
-rw-r----- 1 root root 211751 May 9 11:14 /opt/cloudera/parcel-cache/STREAMSETS_DATACOLLECTOR-3.14.0-el7.parcel.torrent

2) Made sure there was enough free space in /opt/cloudera/parcels (a quick check is shown below).

3) Restarted cloudera-scm-agent on server_b01, server_b02 and cm_scm_103:
[root@server_b01 parcel-cache]# systemctl restart cloudera-scm-agent

4) Restarted cloudera-scm-server on cm_scm_103:
[root@cm_scm_103 cloudera-scm-agent]# sudo service cloudera-scm-server restart
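For step 2, a quick way to check free space and current usage on the parcel directories (whatever mount point holds them on your hosts):

df -h /opt/cloudera/parcels
du -sh /opt/cloudera/parcels /opt/cloudera/parcel-cache /opt/cloudera/parcel-repo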
01-28-2020
08:41 AM
You are absolutely right. After struggling with various syntaxes, I realized that the CDH 5.16 impalad version 2.12.0-cdh5.16.1 doesn't support get_json_object(). So, using json.dumps(), I removed all the unicode u' characters and also normalized the strange characters in the json field names to plain names like CH4_NO2_WE_AUX. After that I ended up using Hive instead of Impala, with a query like the one below to extract the values as columns. json_column1 is a string datatype.

select b.b1, c.c1, c.c2, d.d1, d.d2
from json_table1 a
lateral view json_tuple(a.json_column1, 'CH4_NO2_WE_AUX', 'CH7_CO_CONCENTRATION_WE') b as b1, b2
lateral view json_tuple(b.b1, 'unit', 'value') c as c1, c2
lateral view json_tuple(b.b2, 'unit', 'value') d as d1, d2;
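For context, a hypothetical example of the nested JSON shape the query assumes in json_column1 (field values are illustrative only, not real data):

{"CH4_NO2_WE_AUX": {"unit": "mV", "value": 123.4},
 "CH7_CO_CONCENTRATION_WE": {"unit": "mV", "value": 56.7}}

Each top-level key holds a nested object, so the first json_tuple pulls those objects out as strings (b.b1, b.b2) and the second level of json_tuple extracts 'unit' and 'value' from each.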
01-05-2020
06:33 AM
Hi, the parameters spark.executor.memory and spark.yarn.executor.memoryOverhead can be set in the spark-submit command, or you can set them in the Advanced configurations. Thanks, AKR
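A minimal sketch of passing them on spark-submit (the values, class name and jar are placeholders; property names are as given above, with memoryOverhead in MB):

spark-submit \
  --master yarn \
  --conf spark.executor.memory=4g \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  --class com.example.MyApp my-app.jar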
04-12-2019
02:04 AM
Dear Romainr: I followed your advice to modify the source code and got the Hue 3.9 notebook working on my CDH 5.7.1 Kerberized cluster with Livy 0.5.0. I can now use spark-shell from the notebook and it runs well. But in YARN I see the Spark job user is always livy (I set livy as a hadoop.proxyuser); it seems to be bound to the user of the livy-server-launcher keytab, so I can't control notebook authorization through Hue. I can see Hue sets the proxyuser on the token, but the Spark job's user is not reset (a sketch of the proxyuser configuration this setup relies on is after the log below).

livy-conf:

livy.impersonation.enabled=true
livy.repl.enable-hive-context=true
livy.spark.deploy-mode=client
livy.spark.master=yarn
livy.superusers=hue
livy.server.auth.type=kerberos
livy.server.auth.kerberos.keytab=/etc/security/keytabs/spnego.keytab
livy.server.auth.kerberos.principal=HTTP/xxx.com@xxx.COM
livy.server.launch.kerberos.keytab=/etc/security/keytabs/livy.keytab
livy.server.launch.kerberos.principal=livy/xxx.com@xxx.COM

livy-log:

19/04/12 14:48:14 INFO InteractiveSession$: Creating Interactive session 3: [owner: hue, request: [kind: pyspark, proxyUser: Some(baoyong), heartbeatTimeoutInSecond: 0]]
19/04/12 14:48:25 INFO LineBufferedStream: stdout: client token: Token { kind: YARN_CLIENT_TOKEN, service: }
19/04/12 14:48:25 INFO LineBufferedStream: stdout: diagnostics: N/A
19/04/12 14:48:25 INFO LineBufferedStream: stdout: ApplicationMaster host: 192.168.103.166
19/04/12 14:48:25 INFO LineBufferedStream: stdout: ApplicationMaster RPC port: 0
19/04/12 14:48:25 INFO LineBufferedStream: stdout: queue: root.livy
19/04/12 14:48:25 INFO LineBufferedStream: stdout: start time: 1555051701595
19/04/12 14:48:25 INFO LineBufferedStream: stdout: final status: UNDEFINED
19/04/12 14:48:25 INFO LineBufferedStream: stdout: tracking URL: http://bigdata166.xxx.com:8088/proxy/application_1555044848792_0063/
19/04/12 14:48:25 INFO LineBufferedStream: stdout: user: livy
19/04/12 14:48:28 INFO InteractiveSession: Interactive session 3 created [appid: application_1555044848792_0063, owner: hue, proxyUser: None, state: idle, kind: pyspark, info: {driverLogUrl=null, sparkUiUrl=null}]
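For completeness, a sketch of the core-site.xml proxyuser entries this kind of setup typically relies on so the livy user may impersonate notebook users (the wildcard values are permissive placeholders, not the poster's actual settings):

<!-- Allow the livy service user to impersonate end users from any host/group -->
<property>
  <name>hadoop.proxyuser.livy.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.livy.groups</name>
  <value>*</value>
</property>

Note that in the log above the session request carries proxyUser: Some(baoyong), yet the created session reports proxyUser: None and the YARN user is livy, which suggests the impersonation is being dropped between the request and session creation rather than in YARN itself.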