Member since
11-12-2018
218
Posts
179
Kudos Received
35
Solutions
My Accepted Solutions
| Views | Posted |
|---|---|
| 1265 | 08-08-2025 04:22 PM |
| 1649 | 07-11-2025 08:48 PM |
| 2595 | 07-09-2025 09:33 PM |
| 1543 | 04-26-2024 02:20 AM |
| 2160 | 04-18-2024 12:35 PM |
07-05-2022
05:31 PM
Hi @Jessica_cisco, this looks like a conflict between dependency versions/classes. Kindly locate the duplicate dependencies and then do a clean rebuild.
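One way to locate duplicates is to check whether the same class is packaged in more than one jar on the classpath. The following is only a minimal sketch, assuming all the jars have been collected into a single directory; the function name `find_duplicate_classes` and the directory layout are illustrative and not part of any Cloudera or Gradle tooling.

```python
import zipfile
from collections import defaultdict
from pathlib import Path


def find_duplicate_classes(jar_dir):
    """Return {class entry: [jar names]} for classes found in more than one jar.

    jar_dir is a hypothetical directory holding the jars to compare.
    """
    owners = defaultdict(list)
    for jar in sorted(Path(jar_dir).glob("*.jar")):
        with zipfile.ZipFile(jar) as zf:
            # A set guards against the same entry being listed twice in one jar.
            for name in sorted(set(zf.namelist())):
                if name.endswith(".class"):
                    owners[name].append(jar.name)
    return {cls: jars for cls, jars in owners.items() if len(jars) > 1}


if __name__ == "__main__":
    import sys

    # Usage: python find_dupes.py <directory-with-jars>
    for cls, jars in find_duplicate_classes(sys.argv[1]).items():
        print(cls, "->", ", ".join(jars))
```

Any class reported with two or more jars is a candidate for the version conflict; which jar to keep still depends on your build.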
07-04-2022
05:03 PM
Hi @mamoune, you can ingest multiple data source types concurrently into the Cloudera CDP platform, but make sure an inbound connection is configured from the source to the destination CDP cluster. Various components/connectors are available for both moving and transforming data from source systems. Ingesting, storing, and processing new data sources typically requires a considerable amount of planning, which is one of the challenges of data pipeline integration. For example, Cloudera Morphlines is an open-source framework that reduces the time and skills required to build or change Search indexing applications. A morphline is a rich configuration file that simplifies defining an ETL transformation chain. Use these chains to consume any kind of data from any data source, process the data, and load the results into Cloudera Search. Because morphlines execute in a small, embeddable Java runtime system, they can be used for near real-time applications as well as batch processing applications.
07-04-2022
04:37 PM
Hi @Jessica_cisco, can you try guava version 14.0.1 from the group com.google.guava?
compile group: 'com.google.guava', name: 'guava', version: '14.0.1'
Can you add the above dependency to your build.gradle and try again?
07-04-2022
04:03 PM
Hi @naymar, to grant Livy the ability to impersonate the originating user, add the following properties to <HADOOP_HOME>/etc/hadoop/core-site.xml:
<property>
<name>hadoop.proxyuser.livy.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.livy.hosts</name>
<value>*</value>
</property>
Ref:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.7/configuration-properties/topics/cm_props_cdh710_coreconfiguration.html
https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.5.3/bk_command-line-installation/content/grant_livy_impersonate.html
06-30-2022
07:27 AM
Hi @suri789, these two are different values, so I didn't see any duplicate between them:
so plainfield
s plainfiled
Also, from the output, I didn't see any duplicate values; all of them are distinct.
+----------------+
|           value|
+----------------+
|    s plaindield|
|    n plainfield|
|  west home land|
|         newyork|
|   so plainfield|
|north plainfield|
+----------------+
Please note: "n plainfield" and "north plainfield", or "s plainfield" and "so plainfield", are different values, because we didn't write any custom logic like 'n' means 'north' or 's' means 'so'.
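If such values should be treated as duplicates, the custom logic has to be written explicitly. Below is a minimal plain-Python sketch of that idea, not the code from the original question: the `ABBREVIATIONS` map (including the guess that 's' and 'so' both mean 'south') and the function names are assumptions made for illustration only.

```python
# Hypothetical abbreviation map; adjust it to the real meaning of your data.
ABBREVIATIONS = {"n": "north", "s": "south", "so": "south"}


def normalize(value):
    """Lowercase, collapse whitespace, and expand a leading abbreviation."""
    words = value.strip().lower().split()
    if words and words[0] in ABBREVIATIONS:
        words[0] = ABBREVIATIONS[words[0]]
    return " ".join(words)


def distinct_normalized(values):
    """Return the normalized values with duplicates removed, keeping first-seen order."""
    seen = set()
    result = []
    for value in values:
        key = normalize(value)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


values = [
    "s plaindield",
    "n plainfield",
    "west home land",
    "newyork",
    "so plainfield",
    "north plainfield",
]
print(distinct_normalized(values))
```

With this map, "n plainfield" and "north plainfield" collapse into one value, while "s plaindield" stays separate from "so plainfield" because of its different spelling; handling misspellings would need additional logic.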
06-28-2022
04:11 PM
Hi @ajaybabum, yes, we can run Spark in local mode against a Kerberized cluster. For a quick test, can you open spark-shell directly, read the CSV file from the HDFS location, and show its contents? That will verify whether the issue is in the cluster/Spark configuration or in your application code.
>> Will it possible in local mode without run kinit command before spark-submit.
-- If you pass the --keytab and --principal details in your spark-submit, you don't need to run the kinit command before spark-submit.
Thanks
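As a rough illustration of the --keytab and --principal suggestion, the sketch below assembles such a spark-submit command line in Python. The helper name `build_spark_submit`, the principal, the keytab path, and the application file are all placeholders assumed for this example, not values from the original thread.

```python
import shlex


def build_spark_submit(app, principal, keytab, master="local[*]"):
    """Assemble a spark-submit argument list that logs in from a keytab.

    All argument values are caller-supplied placeholders.
    """
    return [
        "spark-submit",
        "--master", master,
        "--principal", principal,
        "--keytab", keytab,
        app,
    ]


# Placeholder values; replace them with your own principal, keytab, and application.
cmd = build_spark_submit("my_app.py", "user@EXAMPLE.COM", "/path/to/user.keytab")
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` on a host where spark-submit is on the PATH.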
06-28-2022
04:05 PM
Hi @NaniSK, could you please reach out to the Cloudera Certification Team at certification@cloudera.com regarding any feedback and/or concerns about your certificate and license? Thanks.
06-28-2022
03:31 PM
Hi @dfdf, I tried this in my cluster with both Spark2 and Spark3, on the same versions that you used, and I was able to get the results without any issues.
Spark2: 2.4.7.7.1.7.1000-141
Spark3: 3.2.1.3.2.7171000.1-1
Are you still seeing this issue? If so, could you please share the steps to reproduce it so that I can try them in my cluster? Thanks
06-27-2022
03:16 PM
Hi @sss123, this seems to be a bug. Please refer to https://issues.cloudera.org/browse/LIVY-3. Kindly note that Spark Notebook is not currently supported. Also please review the discussion in https://github.com/cloudera/hue/issues/254
06-27-2022
07:46 AM
Hi @ds_explorer, it seems the edit log is too big for the NameNode to read completely within the default/configured timeout.
2022-06-25 08:32:24,872 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 554705629. Expected transaction ID was 60366342312
Recent opcode offsets: 554704754 554705115 554705361 554705629
.....
Caused by: java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:203)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$LengthPrefixedReader.decodeOpFrame(FSEditLogOp.java:4488)
To fix this, can you add the parameter and value below (if you already have it, kindly increase the value)?
HDFS > Configuration > JournalNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml
hadoop.http.idle_timeout.ms=180000
Then restart the required services.