Member since
11-12-2018
192
Posts
177
Kudos Received
32
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 596 | 04-26-2024 02:20 AM |
| | 789 | 04-18-2024 12:35 PM |
| | 3453 | 08-05-2022 10:44 PM |
| | 3184 | 07-30-2022 04:37 PM |
| | 6922 | 07-29-2022 07:50 PM |
07-03-2022
10:22 PM
@dfdf, has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. If you are still experiencing the issue, can you provide the information @jagadeesan has requested?
06-30-2022
07:27 AM
Hi @suri789, these are different values; I don't see any duplicates among them. "so plainfield" and "s plainfield" are distinct strings, and from the output all values are distinct:

+----------------+
|           value|
+----------------+
|    s plaindield|
|    n plainfield|
|  west home land|
|         newyork|
|   so plainfield|
|north plainfield|
+----------------+

Please note: "n plainfield & north plainfield" or "s plainfield & so plainfield" are different values, because we didn't write any custom logic like 'n' means 'north' or 's' means 'so'.
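If such values should be treated as duplicates, a small normalization step can be added before deduplicating. The sketch below illustrates the idea in plain Python; the abbreviation mapping (e.g. 's' and 'so' both meaning 'south') is an assumption for illustration, not something the original data defines.

```python
# Hypothetical mapping: which abbreviations expand to which words is an
# assumption for illustration, not part of the original data.
ABBREVIATIONS = {"n": "north", "s": "south", "so": "south", "e": "east", "w": "west"}

def normalize_place(value: str) -> str:
    """Expand a leading direction abbreviation, if present."""
    parts = value.strip().split()
    if parts and parts[0].lower() in ABBREVIATIONS:
        parts[0] = ABBREVIATIONS[parts[0].lower()]
    return " ".join(parts)

values = ["n plainfield", "north plainfield", "s plainfield", "so plainfield"]
# After normalization the four raw values collapse into two distinct places.
print(sorted({normalize_place(v) for v in values}))
```

Without a mapping like this, string comparison alone will always treat abbreviated and spelled-out forms as distinct values.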
06-29-2022
11:21 AM
1 Kudo
Thank you for providing the info, @jagadeesan. I emailed the Cloudera Certification group. Best, Sruthi Kumar
06-28-2022
05:07 PM
Hi @Ane, could you give a small example with sample input and output, for a better understanding of the problem statement? Thanks
06-28-2022
04:11 PM
Hi @ajaybabum, yes, you can run Spark in local mode against a Kerberized cluster. For a quick test, open spark-shell directly, read the CSV file from the HDFS location, and show its contents, to verify whether the issue is in the cluster/Spark configuration or in your application code.

>> Will it be possible in local mode without running the kinit command before spark-submit?

By passing --keytab and --principal in your spark-submit, you don't need to run kinit before spark-submit. Thanks
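As a minimal sketch, a spark-submit invocation of that form could look like the following; the keytab path, principal, class, and jar names are placeholders, not details from the original question.

```shell
# Sketch only: keytab path, principal, class, and jar are hypothetical.
spark-submit \
  --master local[2] \
  --keytab /etc/security/keytabs/myuser.keytab \
  --principal myuser@EXAMPLE.COM \
  --class com.example.MyApp \
  my-app.jar
```

With --keytab and --principal supplied, Spark obtains and renews the Kerberos ticket itself, so no prior kinit is needed.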
06-28-2022
02:07 PM
1 Kudo
Hi @NicolasMarcos, thank you for expressing your interest in the Cloudera QuickStart VM. Unfortunately, the Cloudera QuickStart VM has been discontinued. You can try the Cloudera docker image available publicly at https://hub.docker.com/r/cloudera/quickstart, or simply run the command below to download it on a docker-enabled system:

docker pull cloudera/quickstart

Please note that Cloudera doesn't officially support the QuickStart VM and it is deprecated. The up-to-date product is Cloudera Data Platform, and you can download a trial version to install on-premises here.
06-27-2022
09:33 AM
@haze5736 Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks
06-27-2022
07:46 AM
Hi @ds_explorer, it seems the edit log is too big and cannot be read completely by the NameNode within the default/configured timeout:

2022-06-25 08:32:24,872 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 554705629. Expected transaction ID was 60366342312
Recent opcode offsets: 554704754 554705115 554705361 554705629
.....
Caused by: java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:203)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$LengthPrefixedReader.decodeOpFrame(FSEditLogOp.java:4488)

To fix this, add the parameter and value below (if you already have it, increase the value):

HDFS > Configuration > JournalNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml
hadoop.http.idle_timeout.ms=180000

Then restart the required services.
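If you manage hdfs-site.xml by hand rather than through a Cloudera Manager safety valve, the same setting would be expressed as a property element like this sketch (the 180000 ms value is the one suggested above):

```xml
<!-- hdfs-site.xml sketch: raise the HTTP idle timeout so a large edit log
     can be streamed from the JournalNodes without a premature EOF. -->
<property>
  <name>hadoop.http.idle_timeout.ms</name>
  <value>180000</value>
</property>
```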
04-10-2021
12:53 AM
1 Kudo
Hi @ryu, then you might need to build some customized in-house monitoring scripts using the YARN APIs, or use other tools like Prometheus or Grafana for your use case. Please also refer to the links below for more insights: https://www.programmersought.com/article/61565532790/ http://rokroskar.github.io/monitoring-spark-on-hadoop-with-prometheus-and-grafana.html https://www.linkedin.com/pulse/how-monitor-yarn-application-via-restful-api-wayne-zhu/
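As a minimal sketch of the YARN-API route, the ResourceManager exposes a REST endpoint (/ws/v1/cluster/apps) that a script can poll; the host/port below are assumptions for your cluster.

```python
# Sketch: summarize running YARN applications via the ResourceManager
# REST API (/ws/v1/cluster/apps). Host and port below are assumptions.
import json
from urllib.request import urlopen

RM_APPS_URL = "http://resourcemanager:8088/ws/v1/cluster/apps?states=RUNNING"

def running_apps(payload):
    """Return (id, name, state) tuples from an apps-endpoint response."""
    apps = (payload.get("apps") or {}).get("app") or []
    return [(a["id"], a["name"], a["state"]) for a in apps]

# Offline demo with a trimmed response of the documented shape:
sample = {"apps": {"app": [
    {"id": "application_1656000000000_0001",
     "name": "my-spark-job", "state": "RUNNING"}]}}
print(running_apps(sample))

# Live usage (requires a reachable ResourceManager):
#   with urlopen(RM_APPS_URL) as resp:
#       print(running_apps(json.load(resp)))
```

A cron job or Prometheus exporter built around a function like this can then alert on stuck or failed applications.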
02-23-2021
01:44 AM
Thanks, @adrijand, for sharing your updates; it's highly appreciated.