Member since: 05-20-2020
Posts: 125
Kudos Received: 4
Solutions: 6
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 693 | 10-25-2021 01:17 AM |
| | 968 | 10-18-2021 07:13 AM |
| | 1012 | 10-03-2021 11:54 PM |
| | 361 | 07-11-2021 01:21 PM |
| | 6365 | 04-19-2021 02:50 AM |
01-09-2022
02:23 AM
You can rename the alias, but you will have to clean up the internal topics to avoid confusion. It will behave like a new replication (it won't work as a rename that re-initiates from the same point). Please refer to the article below: https://community.cloudera.com/t5/Customer/SRM-Internal-Topics/ta-p/328811
12-14-2021
04:08 PM
1 Kudo
A couple of possibilities for these WARN messages are: 1) a GC issue on the DataNode (this type of WARN message is typically seen then), 2) a disk issue, or 3) network latency/slowness between the application, the Kafka node, and the DataNode.
12-02-2021
05:11 AM
Hello, These timeout exceptions relate to the CM Metrics Store (firehose) being overloaded. Please review the article below: https://community.cloudera.com/t5/Customer/How-to-enable-the-entity-summary-servlet-in-Cloudera-Manager/ta-p/294697 Check KAFKA_PRODUCER and KAFKA_CONSUMER: if there are too many entities (millions), SMON may request a lot of memory to process the metrics, causing timeout exceptions in the SMM server. To avoid a huge number of entities that will cause issues for services like SMON, set a client.id in your producers and, on the consumer side, a group.id, so that random ids are not created every time a client is executed. Alternatively, resetting/deleting the firehose LevelDB storage can be an option to recover from this. If the SMM server is getting timeout exceptions, also check the SMM heap size; depending on the number of resources being monitored it is recommended to increase it, and acceptable values for production environments are around 8-16 GB for SMM.
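For the client.id/group.id point above, a minimal sketch of the client configuration (the names below are placeholders; what matters is that they stay stable across restarts instead of being auto-generated):
# producer.properties
client.id=orders-service-producer
# consumer.properties
group.id=orders-service-consumers
client.id=orders-service-consumer-1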
10-26-2021
09:46 AM
It depends on whether your broker version supports it, as the error says. This relates to KIP-133 (implemented in Kafka 0.11, according to the JIRA).
10-25-2021
01:17 AM
1 Kudo
Hello @roshanbi JSON is one of the best data-exchange formats: lightweight and compact. CSV/DSV can be used when large datasets have to be sent under bandwidth limits, but DSV cannot be used where the data is complex and unstructured; that is where JSON comes in. So, as you said, you are seeing better performance with DSV because it does not carry the attribute-value pairs that JSON does (which is what makes JSON structured). To fine-tune your application, you can use compression.
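For illustration only (the values are examples to be tuned for your workload and client version), compression is enabled on the producer side:
# also gzip, lz4, or zstd depending on the Kafka version
compression.type=snappy
# a small linger and a larger batch usually improve the compression ratio
linger.ms=20
batch.size=65536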
10-18-2021
07:13 AM
Yes, there has to be a corresponding user in Ranger to authorize; it cannot be just a certificate. You can use Kafka SSL authentication by setting it up for two-way TLS: https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/kafka-securing/topics/kafka-secure-auth-tls-broker.html and, if you want authorization (via Ranger), add the user: https://docs.cloudera.com/runtime/7.2.8/security-ranger-authorization/topics/security-ranger-users-groups-permissions.html The same is also discussed here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-371%3A+Add+a+configuration+to+build+custom+SSL+principal+name Hope this helps.
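For reference, outside of Cloudera Manager the equivalent raw broker settings for two-way TLS look roughly like the sketch below (the port, paths, and passwords are placeholders; on CDP these are managed through the security configs in the linked doc):
listeners=SSL://0.0.0.0:9093
ssl.client.auth=required
ssl.keystore.location=/path/to/kafka.server.keystore.jks
ssl.keystore.password=<keystore_password>
ssl.key.password=<key_password>
ssl.truststore.location=/path/to/kafka.server.truststore.jks
ssl.truststore.password=<truststore_password>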
10-18-2021
04:05 AM
Hello, Is this during Kafka start or Kafka MirrorMaker start? What if you use the "Kafka Broker Advanced Configuration Snippet (Safety Valve) for ssl.properties" to pass the properties (ssl.keystore.password, ssl.key.password, and ssl.truststore.password, i.e., without .generator)? Kafka MirrorMaker has a known issue where .generator properties do not work for producer configs. This affects CDK 3.x (CDH 6.x contains a fix for this issue from 6.2) - CDH-73094. Which version are you using? Please specify both the CM and CDH versions; incompatible versions can cause issues too. Is this an upgraded cluster or a new one?
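i.e., something like the following lines in that safety valve (the values are placeholders):
ssl.keystore.password=<keystore_password>
ssl.key.password=<key_password>
ssl.truststore.password=<truststore_password>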
10-05-2021
09:29 AM
I believe the user "some_user" exists, but the principal 'OU=Dept,O=Company,C=DE,ST=Germany,CN=some_user' does not. You should configure ssl.principal.mapping.rules to avoid these warnings.
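A sketch of such a rule, assuming the DN always ends in CN=<user> as in your example (the regex must be adjusted to your actual certificate layout):
ssl.principal.mapping.rules=RULE:^.*CN=(.*)$/$1/,DEFAULT
With a rule like this, the broker resolves the principal to some_user, which is also the name Ranger (or ACLs) would authorize against.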
10-03-2021
11:54 PM
Hello @DA-Ka The data might be skewed on one of the disks because some heavily used topics/partitions live on that particular disk. You may want to profile the data on disk (sort the directories under the Kafka log.dirs by size) and reassign the partitions to the other disk using the partition reassignment tool kafka-reassign-partitions.sh, or manually: du -sh /kafka-logs/*
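As a rough sketch of the reassignment flow (assuming a ZooKeeper-based cluster; the host, broker ids, and file names are placeholders), describe the topics to move in a topics.json file such as {"version":1,"topics":[{"topic":"my-topic"}]}, generate a candidate plan, review it, then execute and verify:
kafka-reassign-partitions --zookeeper <zk_host>:2181 --topics-to-move-json-file topics.json --broker-list "0,1,2" --generate
# save the proposed assignment printed by --generate into plan.json, then:
kafka-reassign-partitions --zookeeper <zk_host>:2181 --reassignment-json-file plan.json --execute
kafka-reassign-partitions --zookeeper <zk_host>:2181 --reassignment-json-file plan.json --verify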
08-11-2021
10:18 PM
https://www.youtube.com/watch?v=UrX2RWM2vQQ
07-11-2021
01:41 PM
Hello, Every service should have the KRB5_CONFIG variable exported in its own safety valve. As per the doc: https://community.cloudera.com/t5/Support-Questions/Kerberos-How-to-setup-Cloudera-Manager-in-order-to-use-a/td-p/81138 Go to CM UI -> Schema Registry -> Configuration -> Service Environment Advanced Configuration Snippet (Safety Valve), add key: KRB5_CONFIG, value: /etc/custom-krb5.conf, then save and restart. Thanks & Regards, Nandini
07-11-2021
01:21 PM
Hello, Please refer to the document below to help you with SSL authentication enabled: https://registry-project.readthedocs.io/en/latest/security.html?highlight=ssl Thanks & Regards, Nandini
07-11-2021
02:26 AM
It is quite late to reply to this post, but I was wondering whether you found something and are using it. Did you want something like ZooNavigator? https://zoonavigator.elkozmon.com/en/docs-pre-v1/
07-06-2021
12:05 AM
I believe this is what you are looking for https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/srm-using/topics/srm-migrating-consumer-groups.html
06-11-2021
06:56 AM
Hello Abdul, Can you please check and share the error seen in broker logs during this time frame on bigdatalite.localdomain?
05-06-2021
10:34 PM
Please try NiFi - Kafka: https://community.cloudera.com/t5/Community-Articles/Apache-NiFi-1-10-Support-for-Parquet-RecordReader/ta-p/282390#:~:text=With%20the%20release%20of%20Apache,data%20as%20a%20single%20unit
05-04-2021
12:08 AM
Hello @BGabor The error "TrustManager is not specified" can be thrown due to missing cert files or missing/wrong values for the configs below. Make sure the following properties are set: xasecure.policymgr.clientssl.keystore.credential.file=jceks://file/{{credential_file}}
xasecure.policymgr.clientssl.truststore.credential.file=jceks://file/{{credential_file}}
xasecure.policymgr.clientssl.truststore=/path/to/truststore I also came across these Ranger JIRAs, which indicate that the truststore info is not specified in cacerts or that cacerts needs to be manually configured. Note: they are fixed in Ranger 2.0.1, so you may also want to check the Ranger version. https://issues.apache.org/jira/browse/RANGER-2611 https://issues.apache.org/jira/browse/RANGER-2907
04-28-2021
02:34 PM
Please try Kafka Connect then; that seems to be the best-suited option.
04-26-2021
09:58 PM
Hello @sriven Found this - https://community.cloudera.com/t5/Support-Questions/How-to-insert-parquet-file-to-Kafka-and-pass-them-to-HDFS/td-p/178340 Please let me know if it helps. Thanks & Regards, Nandini
04-21-2021
05:01 AM
Hello, What is the file format? Why do you say the HDFS Source connector is also not a solution when the files are written by a Spark program? Spark -> HDFS -> Kafka is your entire flow, correct? Spark to HDFS is already done, and now you are looking at HDFS -> Kafka. If you can help me understand the file format that Spark saves in, I can check whether the HDFS Source connector can help your use case.
04-20-2021
03:23 AM
Hello @sriven,
- As @Daming Xue mentioned, Kafka Connect is one of the good options; the doc https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/kafka-connect/kafka-connect.pdf shares an example of HDFS as a sink connector: https://docs.cloudera.com/cdp-private-cloud-base/7.1.5/kafka-connect/topics/kafka-connect-connector-hdfs-example.html
- Flume (CDH): https://docs.cloudera.com/documentation/kafka/latest/topics/kafka_flume.html#concept_rsb_tyb_kv__section_iwb_tyb_kv
- NiFi: https://blog.cloudera.com/adding-nifi-and-kafka-to-cloudera-data-platform/ and https://community.cloudera.com/t5/Community-Articles/Integrating-Apache-NiFi-and-Apache-Kafka/ta-p/247433
- Kafka-Hive integration: https://docs.cloudera.com/cdp-private-cloud-base/7.1.5/integrating-hive-and-bi/topics/hive-kafka-integration.html
- Custom Java app: https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/kafka-developing-applications/topics/kafka-develop-example-producer.html
- To try it out quickly (for testing), you can use the console producer: hadoop fs -cat file.txt | kafka-console-producer --broker-list <host:port> --topic <topic> (https://docs.cloudera.com/cdp-private-cloud-base/7.1.5/kafka-managing/topics/kafka-manage-cli-producer.html)
- Spark (which you do not want): https://docs.cloudera.com/cdp-private-cloud-base/7.1.6/developing-spark-applications/topics/spark-using-spark-streaming.html
These are some I could quickly think of; there must be many more options. Thanks & Regards, Nandini P.S. If you found this answer useful, please upvote/accept.
04-19-2021
02:50 AM
1 Kudo
Hello @PabloMO, As the error in the spoiler you posted shows, the versions do not match. You can check the version in use by running "which python". You can override the two configs below in /opt/cloudera/parcels/CDH-<version>/lib/spark/conf/spark-env.sh and restart pyspark: export PYSPARK_PYTHON=<same version of python>
export PYSPARK_DRIVER_PYTHON=<same version of python> Hope it helps. Thanks & Regards, Nandini
04-19-2021
01:18 AM
Hello @Christ, Yes, it is possible. You can write a custom script or Java code (Atlas client) that extracts the metadata from your source file and uses the REST APIs below to insert it into Atlas: http://atlas.incubator.apache.org/api/v2/index.html For example, you can see a Spark Atlas connector here: https://github.com/hortonworks-spark/spark-atlas-connector/blob/master/README.md#to-use-it-in-secure-environment Hope these documents help: https://docs.cloudera.com/runtime/7.2.7/atlas-leveraging-business-metadata/topics/atlas-business-metadata-bulk-import.html https://docs.cloudera.com/cdp-private-cloud-base/7.1.5/atlas-import-utility/topics/atlas-importing-hive-metadata-utility.html Thanks, Nandini P.S. As always, if you find this post helpful, don't forget to upvote/accept the answer.
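As an addendum, a hedged sketch of such a call against the v2 entity API (host, credentials, and attribute values are placeholders, and hdfs_path is just one example entity type):
curl -u <user>:<password> -H 'Content-Type: application/json' -X POST 'http://<atlas_host>:21000/api/atlas/v2/entity' -d '{"entity":{"typeName":"hdfs_path","attributes":{"qualifiedName":"hdfs://<namespace>/data/myfile@<cluster>","name":"myfile","path":"hdfs://<namespace>/data/myfile"}}}'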
03-29-2021
06:43 PM
As you mentioned, certain CLI tools are marked as not supported; for your requirement those are the kafka-server-start and kafka-server-stop scripts. It is not guaranteed that a manual setup initializes the Kafka context the same way Cloudera Manager would. While it may work for the restart, there are environment scripts and other process management that Cloudera Manager handles, so when you later restart using Cloudera Manager there would be uncertainty about the state of the process. Since you said you have problems accessing the Cloudera Manager UI, you can also use the Cloudera Manager API for your requirement: https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_intro_api.html
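For example, a hedged sketch of restarting the Kafka service through the API (host, credentials, cluster and service names are placeholders; the API version depends on your CM release and can be discovered with the first call):
curl -u <admin_user>:<password> 'http://<cm_host>:7180/api/version'
curl -u <admin_user>:<password> -X POST 'http://<cm_host>:7180/api/<api_version>/clusters/<cluster_name>/services/<kafka_service_name>/commands/restart'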
03-17-2021
05:21 PM
1 Kudo
This error shows up if you have selected Sentry/Ranger as a dependency but have not set the config below to true (i.e., did not enable Kerberos): kerberos.auth.enable
03-16-2021
01:28 AM
Hello @d33play I checked the documents again and found an internal JIRA, CDH-72683, which tracked Kafka Storage Handler support for CDH versions. The Kafka Storage Handler is part of Hive 3.x, but CDH 6.1.1 ships with Apache Hive 2.1.1, so it is not supported in CDH versions and is available only with CDP. https://docs.cloudera.com/runtime/7.2.0/release-notes/topics/rt-runtime-component-versions.html https://docs.cloudera.com/cdp-private-cloud-base/7.1.5/runtime-release-notes/topics/rt-pvc-runtime-component-versions.html So, the KafkaStorageHandler package will not work with Hive 2.1.1. Hope this information helps. Thanks and Regards, Nandini
03-12-2021
09:00 AM
I believe this is what you are looking for: https://docs.cloudera.com/cdp-private-cloud-base/7.1.5/integrating-hive-and-bi/topics/hive-kafka-integration.html https://blog.cloudera.com/introducing-hive-kafka-sql/ And a more comprehensive doc: https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.5.1/kafka-hive-integration/hdf-kafka-hive-integration.pdf You would not have to worry about the versions, as the components are tested for cross-compatibility.
03-10-2021
03:25 AM
Hello @bhara, Do you have Ranger enabled? What is the Data Center version? (I see a bug, OPSAPS-58584, on 7.1.5 that was fixed in 7.2.4 and 7.3.0.)
03-08-2021
12:18 AM
Hello, Have you included the jar in the spark-submit/spark-shell command as below (comma-separated for multiple jars)? $ bin/spark-submit --jars <spark-streaming-kafka-0-8-assembly.jar>
10-21-2020
07:38 PM
We have two filter options: external.kafka.metrics.exclude.prefix and external.kafka.metrics.include.prefix. https://github.com/hortonworks/ambari/blob/AMBARI-2.7.0-maint/ambari-metrics/ambari-metrics-kafka-sink/src/main/java/org/apache/hadoop/metrics2/sink/kafka/KafkaTimelineMetricsReporter.java#L417
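As an illustrative sketch only (the prefix values are examples), these are set in the kafka-broker configuration; as I read the linked reporter code, a metric matching an exclude prefix is dropped unless it also matches an include prefix:
external.kafka.metrics.exclude.prefix=kafka.network.RequestMetrics,kafka.server.DelayedOperationPurgatory
external.kafka.metrics.include.prefix=kafka.network.RequestMetrics.ResponseQueueTimeMs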