Member since
09-14-2017
118
Posts
10
Kudos Received
5
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1207 | 06-17-2021 06:55 AM |
| | 1001 | 01-13-2021 01:56 PM |
| | 14679 | 11-02-2017 06:35 AM |
| | 14837 | 10-04-2017 02:43 PM |
| | 30552 | 09-14-2017 06:40 PM |
04-06-2022
01:00 PM
Found some new info in the latest Cloudera CDP guides for Impala. It would be nice if they made Hive and Impala more similar in their SQL-standard syntax, but unfortunately they are not the same: https://docs.cloudera.com/runtime/7.2.14/impala-sql-reference/topics/impala-identifiers.html

Impala identifiers: provides information about using identifiers as the names of databases, tables, or columns when creating those objects. The following rules apply to identifiers in Impala: the minimum length of an identifier is 1 character; the maximum length is currently 128 characters, enforced by the Metastore database.
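As a quick sanity check, the length rule quoted above is easy to encode; a minimal sketch (the function name is mine, not from the docs):

```python
def is_valid_impala_identifier_length(name: str) -> bool:
    """Apply the documented Impala length rule: at least 1 character,
    at most 128 (the maximum is enforced by the Metastore database)."""
    return 1 <= len(name) <= 128

print(is_valid_impala_identifier_length("my_table"))  # True
print(is_valid_impala_identifier_length(""))          # False (too short)
print(is_valid_impala_identifier_length("x" * 129))   # False (too long)
```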
02-09-2022
08:57 AM
Hello Experts, I want to run a simple Hive insert SQL statement in NiFi periodically, for example: INSERT OVERWRITE TABLE Table1 SELECT * FROM Table2; All SQL values are fixed/hardcoded and don't need to change dynamically in the flow. As a newbie I was thinking I could write the flow as ReplaceText -> success -> PutHiveQL, with Search Value: (?s)(^.*$) and Replacement Value: INSERT OVERWRITE TABLE Table1 SELECT * FROM Table2; But ReplaceText gives an error, probably because it needs an incoming flowfile, which I don't have since the SQL is hardcoded: "Upstream Connections is invalid because Processor requires an upstream connection but currently has none." The other option I could try (not sure it will work) is GenerateFlowFile -> success -> PutHiveQL, with GenerateFlowFile Custom Text: INSERT OVERWRITE TABLE Table1 SELECT * FROM Table2;
Labels:
- Apache NiFi
01-24-2022
08:33 AM
Thanks @MattWho, I actually found a way to filter/search process groups by name using the Summary option in the top-right menu. This is very useful for finding all the ETL pipelines: once we give them proper names, entering part of a name shows all matching process groups. Being a newbie, I am trying to compare the StreamSets UI to the NiFi UI so I can work the same way. StreamSets provides an initial list of all ETL pipelines to filter by name, etc. I guess if, just after NiFi login, we saw two links, Summary and Canvas, users could intuitively click into the summary screen, review all their process groups, and click the specific PG they want to work with. This would make it similar to other ETL tools like StreamSets.
01-20-2022
11:57 AM
Hello, Is it still the best practice to create, say, 100 process groups for 100 dataflow/ETL pipelines, each of which has multiple processors? Won't 100 process groups be difficult to see on a single canvas? Or is there a better way to easily see and search the 100 ETL pipelines, using some filter like name, userid, date, etc. to narrow down the list?
01-01-2022
05:14 PM
Hello @GangWar, Yes, I can do kinit -kt for all the userids, including yarn, livy, and my own userid, from the same server.
12-29-2021
09:24 AM
Hello, I am getting an error after upgrading CDH 5.16 to CDP 7.1.7. The logs show it is unable to connect to the KMS endpoint. First I start a Spark session in sparkmagic, then I run the example pyspark code below:

Starting Spark application
ID | YARN Application ID | Kind | State | Spark UI | Driver log | Current session?
23 | application_1639802810085_6070 | pyspark | idle | Link | Link | ✔
SparkSession available as 'spark'.

############### sample pyspark code ###################
from pyspark.sql import SparkSession
spark = SparkSession.builder.master('local').getOrCreate()

# load data from .csv file in HDFS
# tips = spark.read.csv("/user/hive/warehouse/tips/", header=True, inferSchema=True)
# OR load data from table in Hive metastore
tips = spark.table('db1.table1')

from pyspark.sql.functions import col, lit, mean

# query using DataFrame API
#tips \

# query using SQL
spark.sql("select name from db1.table1").show(3)
spark.stop()

An error occurred while calling o85.showString.
: java.io.IOException: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationToken(KMSClientProvider.java:1051)
at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$1.call(LoadBalancingKMSClientProvider.java:255)
at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$1.call(LoadBalancingKMSClientProvider.java:252)
at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:175)
at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.getDelegationToken(LoadBalancingKMSClientProvider.java:252)
Caused by: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1916)
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationToken(KMSClientProvider.java:1029)
... 67 more
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: Error while authenticating with endpoint: http://kmshostxyz.com:16000/kms/v1/?op=GETDELEGATIONTOKEN&renewer=yarn%2Fyarnhost%40KERBEROSREALM
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.wrapExceptionWithMessage(KerberosAuthenticator.java:237)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
... 68 more
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:365)
at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
... 78 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
... 79 more
Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p0.15945976/lib/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 381, in show
print(self._jdf.showString(n, 20, vertical))
File "/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p0.15945976/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p0.15945976/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/opt/cloudera/parcels/CDH-7.1.7-1.cdh7.1.7.p0.15945976/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o85.showString.
: java.io.IOException: java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getDelegationToken(KMSClientProvider.java:1051)

There is also an old related thread but no resolution: https://community.cloudera.com/t5/Support-Questions/Not-able-to-access-the-files-in-HDFS-encryption-zone-from/m-p/332970#M231320
12-29-2021
07:39 AM
Hi, did you find a solution to this?
12-21-2021
09:51 AM
UPDATE: One possible workaround to suppress these quotes from displaying in select * is to create a view like below in Impala:

CREATE VIEW db1.view1 AS
SELECT replace(table1.quotedcol1, '"', '') quotedcol1,
       replace(table1.quotedcol2, '"', '') quotedcol2
FROM db1.table1;
12-21-2021
08:26 AM
Impala doesn't support ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde', even in newer versions like 3.4.0. Is there any other option to remove double quotes in the output from Impala where the input CSV file has quotes?
12-21-2021
07:57 AM
Hello @lalala @ChethanYM, did you find any solution to remove double quotes in Impala output on a CSV external file table?

The input external .gz file has a row:
true,false,"US","Los Angeles","California","Cloudflare"

The output of select * from the Impala external table defined on the .gz file above shows:
true false "US" "Los Angeles" "California" "Cloudflare"

FYI, the AWS Athena doc on LazySimpleSerDe for CSV, TSV, and Custom-Delimited Files (https://docs.aws.amazon.com/athena/latest/ug/lazy-simple-serde.html) says: "Use this SerDe if your data does not have values enclosed in quotes." How can the double quotes be removed?
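One workaround sketch, not from this thread (the helper names, file paths, and the pipe delimiter are my assumptions): since the double quotes are just CSV quoting that LazySimpleSerDe cannot interpret, the .gz file could be rewritten without quotes before loading, using a delimiter that does not occur in the data, and the Impala external table then defined with FIELDS TERMINATED BY '|':

```python
import csv
import gzip
import io

def dequote_row(line):
    """Parse one CSV line, honoring double quotes, into plain values."""
    return next(csv.reader(io.StringIO(line)))

def rewrite_without_quotes(src_gz, dst_gz, delimiter="|"):
    """Rewrite a gzipped quoted CSV as unquoted, pipe-delimited values.

    A field that happens to contain the delimiter is backslash-escaped
    (quoting=QUOTE_NONE plus escapechar) rather than re-quoted.
    """
    with gzip.open(src_gz, "rt", newline="") as src, \
         gzip.open(dst_gz, "wt", newline="") as dst:
        writer = csv.writer(dst, delimiter=delimiter,
                            quoting=csv.QUOTE_NONE, escapechar="\\")
        for row in csv.reader(src):
            writer.writerow(row)

print(dequote_row('true,false,"US","Los Angeles","California","Cloudflare"'))
# ['true', 'false', 'US', 'Los Angeles', 'California', 'Cloudflare']
```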
12-18-2021
09:04 PM
Thanks for the info, we will try to remove Activity Monitor. Cloudera should have automatically removed AM during the upgrade to 7.x if it was deprecated as promised.
12-18-2021
10:14 AM
Hello, After upgrading to CM 7.4.4 / Cloudera Runtime 7.1.7, the Activity Monitor never starts. It is always down and cannot be started after multiple attempts. It gives the error messages below; any thoughts?

Command (Start this Activity Monitor (17683)) has failed
Supervisor returned FATAL. Please check the role log file, stderr, or stdout.
The health test result for ACTIVITY_MONITOR_SCM_HEALTH has become bad: This role's process failed to start.

STDOUT messages:
13:06:12.204 [main] ERROR org.hibernate.engine.jdbc.batch.internal.BatchingBatch - HHH000315: Exception executing batch [java.sql.BatchUpdateException: Data truncation: Out of range value for column 'METRIC_ID' at row 1], SQL: insert into CMON_METRIC_INFO (NAME, METRIC_ID) values (?, ?)
13:06:12.209 [main] ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - Data truncation: Out of range value for column 'METRIC_ID' at row 1
Sat Dec 18 13:06:17 EST 2021
JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.292.b10-1.el7_9.x86_64
CONF_DIR=/var/run/cloudera-scm-agent/process/7594-cloudera-mgmt-ACTIVITYMONITOR
13:06:30.645 [main] ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - Data truncation: Out of range value for column 'METRIC_ID' at row 1
at org.hibernate.engine.transaction.internal.TransactionImpl.commit(TransactionImpl.java:101)
... 2 more
Caused by: java.sql.BatchUpdateException: Data truncation: Out of range value for column 'METRIC_ID' at row 1
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.Util.getInstance(Util.java:408)
at com.mysql.jdbc.SQLError.createBatchUpdateException(SQLError.java:1163)
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1823)
at com.mysql.jdbc.PreparedStatement.executeBatchInternal(PreparedStatement.java:1307)
at com.mysql.jdbc.StatementImpl.executeBatch(StatementImpl.java:970)
at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeBatch(NewProxyPreparedStatement.java:2544)
at org.hibernate.engine.jdbc.batch.internal.BatchingBatch.performExecution(BatchingBatch.java:121)
... 22 more
Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Out of range value for column 'METRIC_ID' at row 1
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3976)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3914)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2530)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2683)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2495)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1903)
at com.mysql.jdbc.PreparedStatement.executeUpdateInternal(PreparedStatement.java:2124)
at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1801)
... 26 more
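A first diagnostic step (my suggestion, not from the thread) would be to check how the METRIC_ID column is actually defined in the MySQL database backing Activity Monitor, since the error says an inserted value no longer fits the column. The table and column names below come from the error message itself; the schema name may differ per install:

```sql
-- Hypothetical check; run against the database that backs Activity Monitor.
SELECT COLUMN_NAME, COLUMN_TYPE
FROM information_schema.COLUMNS
WHERE TABLE_NAME = 'CMON_METRIC_INFO'
  AND COLUMN_NAME = 'METRIC_ID';
```

If the reported type is too narrow for the metric IDs that CM 7.x generates, that would explain the data-truncation failure.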
Tags:
- Activity monitor
Labels:
- Cloudera Manager
12-14-2021
12:29 PM
1 Kudo
The script provided in https://github.com/cloudera/cloudera-scripts-for-log4j to delete JndiLookup.class from the log4j jars may have an issue if the zip or unzip command is not installed on the cluster nodes. The script should first check (e.g., via yum list) that the zip and unzip packages are available and abort if not found. Otherwise it prints a finished message even though it hit errors and didn't run successfully, like below:

Completed removing JNDI from /opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/share/doc/search-1.0.0.7.1.6.0/examples/test-documents/testJPEG_EXIF.jpg.tar.gz
Backing up to '/tmp/tmp.FsVTS5Rg9y//opt/cloudera/cm/lib/solr-upgrade-1.0.0.7.1.7.0-547.tar.gz.backup'
Patching '/opt/cloudera/cm/lib/solr-upgrade-1.0.0.7.1.7.0-547.tar.gz'
Running on '/tmp/tmp.Yxjh6FQYgS'
Backing up files to '/tmp/tmp.TsL7gbbmHR'
Completed removing JNDI from jar files
./cm_cdp_cdh_log4j_jndi_removal.sh: line 114: unzip: command not found
grep: /tmp/unzip_target/**/*.jar: No such file or directory
Completed removing JNDI from nar files
Recompressing
Completed removing JNDI from /opt/cloudera/cm/lib/solr-upgrade-1.0.0.7.1.7.0-547.tar.gz
INFO : Finished
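A preflight check along these lines (a sketch of the suggestion above; the function name and message wording are mine) would make the missing-tool failure explicit instead of burying it mid-run:

```shell
#!/bin/sh
# Verify every required command exists before patching anything, so the
# script aborts up front instead of printing "Finished" after a partial run.
require_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "ERROR: required command '$tool' not found;" \
                 "install it (e.g. yum install -y $tool) and re-run." >&2
            return 1
        fi
    done
    return 0
}

# In cm_cdp_cdh_log4j_jndi_removal.sh the first step would be:
#   require_tools zip unzip || exit 1
require_tools sh grep && echo "preflight OK"
```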
11-24-2021
10:26 AM
Hello @MahendraDevu, Did you resolve the error in HUE SAML? We are getting this in CDP 7.1.7 after upgrade; SAML was working in CDH 5.16 HUE before the upgrade:

[05/Apr/2019 16:37:03 -0400] views ERROR SAML Identity Provider is not configured correctly: certificate key is missing!

UPDATE: Resolved this issue by making the IDP <md:EntityDescriptor entityID the same as that in the metadata.xml we specified in the HUE Advanced Configuration Snippet (hue_safety_valve.ini) metadata_file. There was a mismatch between the IDP value and what was in the metadata file.
07-28-2021
07:43 AM
After the OpenJDK 1.8.0.292 install we had to fix the MySQL scm database SSL connection issue using the property useSSL=false, like:

com.cloudera.cmf.orm.hibernate.connection.url=jdbc:mysql://<mysql-host>/scm?useUnicode=true&characterEncoding=UTF-8&useSSL=false

Now the same SSL connection issue is happening for the nav, navms, amon, and rman databases, and these components won't start. Can we set useSSL for these databases too, in the /etc/cloudera-scm-server/db.properties file, like below?

com.cloudera.cmf.orm.hibernate.connection.url=jdbc:mysql://localhost/nav?useUnicode=true&characterEncoding=UTF-8&useSSL=false
com.cloudera.cmf.orm.hibernate.connection.url=jdbc:mysql://localhost/navms?useUnicode=true&characterEncoding=UTF-8&useSSL=false
com.cloudera.cmf.orm.hibernate.connection.url=jdbc:mysql://localhost/rman?useUnicode=true&characterEncoding=UTF-8&useSSL=false
07-09-2021
01:07 PM
Hello, Did you find a solution for this high memory usage issue? It would appear the experts who designed Kudu memory management, in their superior wisdom, never bothered to use LRU or other mechanisms to flush old unused pages from memory; just a guess 🙂
06-26-2021
09:54 AM
Yes, /etc/krb5.conf was the issue on the destination TST cluster. Specifically, the [domain_realm] section did not have a proper mapping for the replication source prod hosts and the prod realm. Since the PRD and TST hosts were all in the same domain, it was not easy to map the domain and realm using just .proddomainxyz.com = PRODREALM. Instead we had to specify the individual source prod hosts in the destination TST /etc/krb5.conf file as below:

[domain_realm]
prodhostxx1 = PRODREALM
prodhostxx2 = PRODREALM
prodhostxx3 = PRODREALM
(specify all the source prod hosts here)

If we don't specify the prod hosts like above, replication doesn't know how to map them to PRODREALM and picks up the default TSTREALM, which is why the original wrong-realm error happened. Thanks!
06-23-2021
01:07 PM
Hello Experts, We are trying to do HDFS replication from prod (CDH 5.x) to test (CDP 7.x) and are getting a Kerberos principal error:

Server has invalid Kerberos principal: hdfs/prodhostxx.com@TESTREALM, expecting: hdfs/prodhostxx.com@PRODREALM

Not sure why it is picking up the TEST realm instead of the PROD realm in the job. The Peer test shows it is connected fine. Thanks!
06-17-2021
06:55 AM
1 Kudo
Darren, This got resolved with the help of a Cloudera Support engineer. There were a couple of issues. First, the IDP and SP entity_id should be different values. Next, there is a SAML property that needs to be set by the IDP: <saml:AudienceRestriction><saml:Audience /></saml:AudienceRestriction>. After setting the Audience property to the entity_id of the SP, the error went away and HUE SAML is working again. See the error below, from before the Audience property was set:

response DEBUG conditions: <?xml version='1.0' encoding='UTF-8'?> <saml:Conditions xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" NotBefore="2021-06-02T16:02:45.573Z" NotOnOrAfter="2021-06-02T17:02:45.573Z"><saml:AudienceRestriction><saml:Audience /></saml:AudienceRestriction></saml:Conditions>
[02/Jun/2021 09:02:45 -0700] client_base ERROR XML parse error: 'NoneType' object has no attribute 'strip'
[02/Jun/2021 09:02:45 -0700] middleware INFO Processing exception: 'NoneType' object has no attribute 'strip': Traceback (most recent call last): File "/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hue/build/env/lib/python2.7/site-packages/Django-1.11.29-py2.7.egg/django/core/handlers/base.py", line 185, in _get_response

This looks like a breaking change from CDH 5.15 HUE SAML to CDP 7.1.6, as we never had to set the Audience value in the IDP in CDH 5.x before.
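For reference, a sketch of what the assertion conditions look like once the IDP populates the Audience with the SP entity_id (the entity_id URL here is a placeholder, not our real value):

```xml
<saml:Conditions xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
                 NotBefore="2021-06-02T16:02:45.573Z"
                 NotOnOrAfter="2021-06-02T17:02:45.573Z">
  <saml:AudienceRestriction>
    <!-- was empty (<saml:Audience />) before the fix; must equal the SP entity_id -->
    <saml:Audience>https://hue.example.com/saml2/metadata/</saml:Audience>
  </saml:AudienceRestriction>
</saml:Conditions>
```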
05-27-2021
09:39 AM
Some more progress: It appears that in CDP 7.1.6 we need to create the unencrypted dummy key file as below. To create an unencrypted private key file from an encrypted key, run:

openssl rsa -in ssl_certificate.key -out ssl_certificate-nocrypt.key

The output file (ssl_certificate-nocrypt.key) is an unencrypted PEM-formatted key that is used for the parameter key_file=/opt/cloudera/security/saml/ssl_certificate-nocrypt.key

Now the "Could not deserialize key data." error is gone, but we are getting a different error:

AttributeError at /saml2/acs/
'NoneType' object has no attribute 'strip'
Request Method: POST
Request URL: http://xxxx.com:8889/saml2/acs/
Django Version: 1.11.29
Exception Type: AttributeError
Exception Value: 'NoneType' object has no attribute 'strip'
Exception Location: /opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hue/build/env/lib/python2.7/site-packages/pysaml2-4.9.0-py2.7.egg/saml2/response.py in for_me, line 212
Python Executable: /opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hue/build/env/bin/python2.7
Python Version: 2.7.5
Python Path: ['/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hue/desktop/libs/libsaml/attribute-maps',

Below is the Python code around line 212, which errors out:

202 def for_me(conditions, myself):
203     """ Am I among the intended audiences """
204
205     if not conditions.audience_restriction:  # No audience restriction
206         return True
207
208     for restriction in conditions.audience_restriction:
209         if not restriction.audience:
210             continue
211         for audience in restriction.audience:
212             if audience.text.strip() == myself:
213                 return True
214             else:
215                 # print("Not for me: %s != %s" % (audience.text.strip(),
216                 #                                 myself))
217                 pass
218
219     return False
05-26-2021
04:26 PM
Hello, After we upgraded from CDH 5.15 to CDP 7.1.6 runtime, the HUE SAML login broke. It gives the error below. Any ideas?

ValueError at /saml2/login/
Could not deserialize key data.
Request Method: GET
Request URL: http://xxxxx.com:8889/saml2/login/?next=/
Django Version: 1.11.29
Exception Type: ValueError
Exception Value: Could not deserialize key data.
Exception Location: /opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hue/build/env/lib/python2.7/site-packages/cryptography-2.9-py2.7-linux-x86_64.egg/cryptography/hazmat/backends/openssl/backend.py in _handle_key_loading_error, line 1382
Python Executable: /opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hue/build/env/bin/python2.7
Python Version: 2.7.5
Python Path: ['/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/lib/hue/desktop/libs/libsaml/attribute-maps',
Labels:
- Cloudera Hue
- Security
01-15-2021
08:23 AM
There seem to be different versions of thrift-sasl and impyla that work or don't work together, and it is not easy to figure out these version mismatches. So we finally abandoned impyla and went with pyodbc and the Cloudera Impala ODBC driver, which is easier to get working and has been working well so far. Check out this link: https://plenium.wordpress.com/2020/05/04/use-pyodbc-with-cloudera-impala-odbc-and-kerberos/
01-13-2021
03:22 PM
A DBeaver connection with JDBC Kerberos to Hive/Impala is somewhat difficult to get working. Try an easier method using ODBC, as given in https://plenium.wordpress.com/2019/10/15/connect-dbeaver-sql-tool-to-cloudera-hive-impala-with-kerberos/
01-13-2021
01:56 PM
1 Kudo
This was resolved by manually upgrading the agents on the other nodes, which were still at CM 5.16, by running the commands below.

First, update /etc/yum.repos.d/cloudera-manager.repo on the nodes with the proper repo for CM 7.2.4. Then run the following on each agent to be upgraded:

$ yum upgrade cloudera-manager-daemons cloudera-manager-agent

After that, restart the Cloudera Manager agents on all nodes:

$ systemctl restart cloudera-scm-agent

Next, go to the Cloudera Manager GUI and restart the Cloudera Management Service. After that, restart all services in the CDH cluster. This should resolve the issues.
01-12-2021
03:25 PM
Hello Experts, I have upgraded CDH 5.16 to CDP 7.2.4, Cloudera Manager only (not the runtime yet). After the upgrade the Cloudera Manager server is running fine, including its agent. But the Cloudera agents on the rest of the nodes won't start when running:

$ sudo systemctl restart cloudera-scm-agent

The agents are still at 5.16, as the final step of the upgrade does show the upgrade button for the agents. Before the upgrade the agents were running fine; they were also shut down during the Cloudera Manager upgrade. How can I start the agents and upgrade them to CDP 7.2.4 now? When trying to restart the agents it just shows:

$ sudo systemctl status cloudera-scm-agent
● cloudera-scm-agent.service - LSB: Cloudera SCM Agent
   Loaded: loaded (/etc/rc.d/init.d/cloudera-scm-agent; bad; vendor preset: disabled)
   Active: active (exited) since Tue 2021-01-12 17:52:10 EST; 7s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 14093 ExecStop=/etc/rc.d/init.d/cloudera-scm-agent stop (code=exited, status=0/SUCCESS)
  Process: 14210 ExecStart=/etc/rc.d/init.d/cloudera-scm-agent start (code=exited, status=0/SUCCESS)

In /var/log/cloudera-scm-agent/cloudera-scm-agent.log I see messages like:

u'KS_INDEXER', u'ZOOKEEPER-SERVER', u'SERVER', u'HIVE_ON_TEZ', u'HIVE_ON_TEZ', u'HIVE_LLAP', u'HIVE_LLAP', u'KEYTRUSTEE_SERVER', u'KEYTRUSTEE_SERVER', u'SCHEMAREGISTRY-SCHEMA_REGISTRY_SERVER', u'SCHEMA_REGISTRY_SERVER', u'OZONE', u'OZONE', u'MAPREDUCE-JOBTRACKER', u'JOBTRACKER', u'THALES_KMS-HSMKP_THALES', u'HSMKP_THALES'], u'flood_seed_timeout': 100, u'eventserver_port': 7185}
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/agent.py", line 1566, in handle_heartbeat_response
    self._handle_heartbeat_response(response)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.16.2-py2.7.egg/cmf/agent.py", line 1581, in _handle_heartbeat_response
    self.java_home_config = self.extra_configs['JAVA_HOME']
KeyError: 'JAVA_HOME'

Any thoughts? Thanks!
01-09-2021
09:19 AM
@GangWar you are a genius! After this Java parameter change, all CDH services started smoothly and everything is running fine with Active Directory Kerberos. Thanks so much!
01-08-2021
12:36 PM
Hello Experts, After changing MIT Kerberos to AD Kerberos and regenerating all the Kerberos credentials in CM, ZooKeeper, YARN, etc. are not starting. There is an error about the Active Directory samaccount not being able to log in as the zookeeper principal. I checked that the principals are created in the AD OrgUnit for Cloudera, and $ kinit -kt zookeeper.keytab zookeeper/redacted@ADREALM works fine on the Linux servers. Any thoughts on how to fix this?

SERVICE_TYPE ZOOKEEPER
SEVERITY CRITICAL
STACKTRACE javax.security.sasl.SaslException: Problem with callback handler [Caused by javax.security.sasl.SaslException: redacted@ADREALM.COM is not authorized to connect as zookeeper/redacted@ADREALM.COM]
at com.sun.security.sasl.gsskerb.GssKrb5Server.doHandshake2(GssKrb5Server.java:333)
at com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:161)
at org.apache.zookeeper.server.quorum.auth.SaslQuorumAuthServer.authenticate(SaslQuorumAuthServer.java:98)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.handleConnection(QuorumCnxManager.java:449)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:387)
at org.apache.zookeeper.server.quorum.QuorumCnxManager$QuorumConnectionReceiverThread.run(QuorumCnxManager.java:423)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.security.sasl.SaslException:

Thanks!
01-07-2021
02:04 PM
One specific issue, an Impala connection pool timeout error, got resolved by increasing fe_service_threads in the Impala configuration in CM from 64 to 128 (per the recommendation below) and then restarting the Impala daemon.

The recommended configuration settings for the best performance with Impala are given in https://docs.cloudera.com/best-practices/latest/impala-performance/topics/bp-impala-recommended-configurations.html#:~:text=Set%20the%20%2D%2Dfe_service_threads%20startup,of%20concurrent%20client%20connections%20allowed. "Set the --fe_service_threads startup option for the Impala daemon (impalad) to 256. This option specifies the maximum number of concurrent client connections allowed. See Startup Options for impalad Daemon for details."

Below are the errors that got resolved by increasing the Impala daemon pool size:

com.streamsets.pipeline.api.StageException: JDBC_06 - Failed to initialize connection pool: com.zaxxer.hikari.pool.PoolInitializationException: Exception during pool initialization: [Cloudera][ImpalaJDBCDriver](700100) Connection timeout expired. Details: None.

SQL Error [3] [S1000]: [Cloudera][ThriftExtension] (3) Error occurred while contacting server: ETIMEDOUT. The connection has been configured to use a SASL mechanism for authentication. This error might be due to the server not using SASL for authentication.
12-22-2020
07:36 AM
Hello Experts, Any thoughts or documents on how to configure CDP 7.x Kerberos for central authentication with Active Directory where users are in multiple AD domains/realms and no trust is set up between the domains in the AD forest? I believe SSSD can be configured to authenticate the Linux users against multiple AD realms, but the question is how CDH cluster services like HDFS can be made to trust Kerberos tickets from multiple AD domains. Thanks!
11-18-2020
04:06 PM
Thanks for the solution!! I had the same issue: after enabling MIT Kerberos in the CDH 5.16.2 cluster, ZooKeeper wouldn't start, with the message above: javax.security.auth.login.LoginException: Message stream modified (41). I was using openjdk version "1.8.0_272". As per your solution, I commented out this line in /etc/krb5.conf on all servers:

#renew_lifetime = 604800

After that, restarting all cluster services worked except the Hue Kerberos Ticket Renewer, which gives the error: Couldn't renew kerberos ticket in order to work around Kerberos 1.8.1 issue. Please check that the ticket for 'hue/fqdn@KRBREALM' is still renewable. The Kerberos Ticket Renewer is a separate issue, and we needed to run the following on the MIT KDC server:

kadmin.local: modprinc -maxrenewlife 90day krbtgt/KRBREALM
kadmin.local: modprinc -maxrenewlife 90day +allow_renewable hue/fqdn@KRBREALM

(the second command for every Hue server fqdn). After that, the Hue Kerberos Ticket Renewer restarted successfully.