Member since: 11-12-2018
Posts: 218
Kudos Received: 179
Solutions: 35
My Accepted Solutions
Views | Posted
---|---
357 | 08-08-2025 04:22 PM
438 | 07-11-2025 08:48 PM
658 | 07-09-2025 09:33 PM
1135 | 04-26-2024 02:20 AM
1501 | 04-18-2024 12:35 PM
04-09-2020
01:56 AM
1 Kudo
Hi @drgenious

Are you getting an error similar to the one reported in KUDU-2633? It seems this is an open JIRA reported in the community:

ERROR core.JobRunShell: Job DEFAULT.EventKpisConsumer threw an unhandled Exception:
org.apache.spark.SparkException: Job aborted due to stage failure: Aborting TaskSet 109.0 because task 3 (partition 3) cannot run anywhere due to node and executor blacklist. Blacklisting behavior can be configured via spark.blacklist.*.

If you have the data in HDFS in csv/avro/parquet format, you can use the command below to import the files into a Kudu table.

Prerequisite: a Kudu jar with a compatible version (1.6 or higher).

spark2-submit --master yarn/local --class org.apache.kudu.spark.tools.ImportExportFiles <path of kudu jar>/kudu-spark2-tools_2.11-1.6.0.jar --operation=import --format=<parquet/avro/csv> --master-addrs=<kudu master host>:<port number> --path=<hdfs path for data> --table-name=impala::<table name>

Hope this helps. Please accept the answer and vote up if it did.
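As a minimal sketch, a filled-in invocation might look like the following; the jar location, master address, HDFS path, and table name are placeholder assumptions, not values from the original question:

# All host names, paths, and the table name below are hypothetical examples
spark2-submit --master yarn \
  --class org.apache.kudu.spark.tools.ImportExportFiles \
  /opt/kudu/kudu-spark2-tools_2.11-1.6.0.jar \
  --operation=import \
  --format=parquet \
  --master-addrs=kudu-master-1.example.com:7051 \
  --path=/data/events/parquet \
  --table-name=impala::default.events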
04-03-2020
12:11 AM
2 Kudos
Hi @Mondi

The important differences between parcels and packages are:

- Parcels are self-contained and installed in a versioned directory, which means that multiple versions of a given parcel can be installed side by side. You can then designate one of the installed versions as the active one. With packages, only one version can be installed at a time, so there is no distinction between what is installed and what is active.
- You can install parcels at any location in the filesystem; by default they are installed in /opt/cloudera/parcels. In contrast, packages are installed in /usr/lib.
- When you install from the Parcels page, Cloudera Manager automatically downloads, distributes, and activates the correct parcel for the operating system running on each host in the cluster.

Note: You cannot install software using both parcels and packages in the same cluster.

Because of these properties, parcels offer several advantages over packages; for more details, please refer to the Cloudera Manager documentation on parcels.

Hope this helps. Please accept the answer and vote up if it did. Regards,
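To illustrate the side-by-side versioning point, the parcel directory on a host might look something like this (the parcel names and versions are made-up examples, not taken from any specific cluster):

ls -l /opt/cloudera/parcels
# CDH-5.16.2-1.cdh5.16.2.p0.8          <- installed but currently inactive version
# CDH-6.3.2-1.cdh6.3.2.p0.1605554      <- installed version designated as active
# CDH -> CDH-6.3.2-1.cdh6.3.2.p0.1605554   (symlink Cloudera Manager points at the active parcel)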
03-31-2020
01:35 AM
Hi @sppandita85BLR

Currently, there is no documented procedure to migrate from HDP. In these cases, it's best to engage your local Cloudera account representative and professional services. They can help you with a runbook for the migration or assess other feasible approaches.

Hope this helps. Please accept the answer and vote up if it did. Regards,
03-27-2020
11:11 AM
2 Kudos
Hi @rajisridhar

You can use a command like the one below to get the start and end times, then store them wherever you wish or configure email notifications according to your requirements. Example:

$ oozie job -oozie http://localhost:11000/oozie -info 14-20090525161321-oozie-joe
------------------------------------------------------------------------------------------------------------
Workflow Name : map-reduce-wf
App Path      : hdfs://localhost:8020/user/joe/workflows/map-reduce
Status        : SUCCEEDED
Run           : 0
User          : joe
Group         : users
Created       : 2009-05-26 05:01 +0000
Started       : 2009-05-26 05:01 +0000
Ended         : 2009-05-26 05:01 +0000
Actions
------------------------------------------------------------------------------------------------------------
Action Name   Type        Status   Transition   External Id             External Status   Error Code   Start                    End
------------------------------------------------------------------------------------------------------------
hadoop1       map-reduce  OK       end          job_200904281535_0254   SUCCEEDED         -            2009-05-26 05:01 +0000   2009-05-26 05:01 +0000
------------------------------------------------------------------------------------------------------------

For detailed information, see: https://oozie.apache.org/docs/3.3.2/DG_CommandLineTool.html#Jobs_Operations

Hope this helps. Please accept the answer and vote up if it did.
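To automate storing or mailing those times, a rough sketch could look like this; it assumes the Oozie URL and job ID from the example above and that a mail command is available on the host, and the recipient address is a placeholder:

# Capture the job info, keep only the Started/Ended lines, and mail them
oozie job -oozie http://localhost:11000/oozie -info 14-20090525161321-oozie-joe > /tmp/oozie_job_info.txt
grep -E '^(Started|Ended)' /tmp/oozie_job_info.txt > /tmp/oozie_job_times.txt
mail -s "Oozie workflow start/end times" user@example.com < /tmp/oozie_job_times.txt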
03-27-2020
01:52 AM
Hi @JasmineD,

We might need to consider backing up the following:

- flow.xml.gz
- users.xml
- authorizations.xml
- All config files in the NiFi conf directory
- NiFi local state from each node
- NiFi cluster state stored in ZooKeeper

Please make sure that you have stored the configuration passwords safely. NiFi relies on the sensitive.props.key password to decrypt sensitive property values in the flow.xml.gz file. If the sensitive props key is not known, you would need to manually clear all encoded values from flow.xml.gz; this clears all passwords in all components on the canvas, and they would all have to be re-entered once NiFi is recovered. Also, if there are any local files required by the dataflows, those need to be backed up as well (e.g., custom processor jars, user-built scripts, externally referenced config/jar files used by some processors, etc.).

Note: All the repositories in NiFi are backed up by default. Here is a good article on how NiFi's content repository archiving works: https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Archiving-works/ta-p/249418

Hope this helps. Please accept the answer and vote up if it did.
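As a minimal sketch of how such a backup might be scripted, assuming a NiFi install under /opt/nifi and a backup destination of /backup/nifi (both paths are assumptions; adjust them for your environment):

#!/bin/bash
# Hypothetical paths; adjust for your installation
NIFI_HOME=/opt/nifi
BACKUP_DIR=/backup/nifi/$(date +%Y%m%d)
mkdir -p "$BACKUP_DIR"
# conf/ holds flow.xml.gz, users.xml, authorizations.xml, and the other config files
tar -czf "$BACKUP_DIR/nifi-conf.tar.gz" -C "$NIFI_HOME" conf
# Local state for this node (repeat on every node in the cluster)
tar -czf "$BACKUP_DIR/nifi-local-state.tar.gz" -C "$NIFI_HOME" state/local
# Cluster state kept in ZooKeeper is not covered here and needs its own backup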
11-24-2019
09:52 PM
Hi @anshuman

Yes, we have node label support in the new CDP. For more details, you can check the CDP documentation:

https://docs.cloudera.com/ -> Cloudera Data Platform -> Runtime -> Cloudera Runtime
https://docs.cloudera.com/runtime/7.0.2/yarn-allocate-resources/topics/yarn-configuring-node-labels.html

FYI: Cloudera Runtime is the core open-source software distribution within Cloudera Data Platform (CDP) that is maintained, supported, versioned, and packaged as a single entity by Cloudera. Cloudera Runtime includes approximately 50 open source projects that comprise the core distribution of data management tools within CDP, including Cloudera Manager, which is used to configure and monitor clusters managed in CDP.
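For reference, once node-label support is enabled per the documentation above, labels are typically managed with the standard YARN CLI. A small sketch, where the label name gpu and the host name worker-1.example.com are placeholders:

# Register a cluster-level node label
yarn rmadmin -addToClusterNodeLabels "gpu(exclusive=true)"
# Assign the label to a specific NodeManager host
yarn rmadmin -replaceLabelsOnNode "worker-1.example.com=gpu"
# Verify the labels known to the cluster
yarn cluster --list-node-labels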
01-15-2019
12:03 PM
1 Kudo
@Michael Bronson

Decommissioning is a process that supports removing components and their hosts from the cluster. You must decommission a master or slave component running on a host before removing it or its host from service. Decommissioning helps you prevent potential loss of data or disruption of service. The HDP documentation for Ambari 2.6.1 below helps you decommission a DataNode. When the DataNode decommissioning process is finished, the status display changes to Decommissioned.

https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.1.0/bk_ambari-operations/content/how_to_decommission_a_component.html

I hope that the above answers your questions.
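If you also want to confirm the DataNode state outside the Ambari UI, a small sketch like the following may help (run as the HDFS superuser; the exact report fields can vary slightly between versions):

# Print each DataNode's hostname together with its decommission status
hdfs dfsadmin -report | grep -E 'Hostname|Decommission Status'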
01-14-2019
06:16 PM
1 Kudo
@Michael Bronson

The article below helps you replace faulty disks on DataNode hosts:

https://community.hortonworks.com/articles/3131/replacing-disk-on-datanode-hosts.html

Please accept the answer you found most useful.
01-02-2019
11:18 AM
1 Kudo
@Suraj Singh

This seems similar to what we discussed in the thread below:

https://community.hortonworks.com/questions/232093/yarn-jobs-are-getting-stuck-in-accepted-state.html

Do the jobs succeed if you resubmit them? As discussed earlier, this is an open bug that is fixed in later releases. If you need to apply a patch, please involve Hortonworks support. If you are a customer, HWX can release a patch for you if it's technically possible based on the specifics of the JIRAs. If you don't have support, you can certainly apply the patch yourself, but test it first in dev/test and confirm it resolves your problem.
01-01-2019
11:03 AM
1 Kudo
@Michael Bronson
Please can you check the ZooKeeper logs (/var/log/zookeeper) of master1.sys89.com? This can happen if there are too many open connections. Check whether there are any warning messages starting with "Too many connections from {IP address of master1.sys89.com}". You can also verify with the netstat command:
netstat -no | grep :2181 | wc -l
To fix this issue, kindly clear up all stale connections manually or try increasing the maxClientCnxns setting in /etc/zookeeper/2.6.4.0-91/0/zoo.cfg. From your zoo.cfg file I can see the value is maxClientCnxns=60, which is the default. You can increase it by setting maxClientCnxns=4096 and restarting the respective affected services.
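If you want to see which clients are holding those connections before raising the limit, something along these lines may help (2181 is assumed to be the ZooKeeper client port, as in the check above):

# Count established connections to the ZooKeeper client port, grouped by client IP
netstat -an | grep ':2181' | grep ESTABLISHED | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -rn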