Member since: 11-24-2015
223 Posts
10 Kudos Received
0 Solutions
04-15-2021
05:56 PM
Hi, we are trying to back up a Kudu table as below:

spark2-submit --principal <user> --keytab <keytab> --master yarn --deploy-mode cluster --queue <queue> --executor-memory 12G --executor-cores 4 --driver-memory 4G --driver-cores 1 --class org.apache.kudu.backup.KuduBackup kudu-backup2_2.11-1.13.0.7.1.5.0-257.jar --kuduMasterAddresses $KUDU_MASTERS --rootPath hdfs:///backups --forceFull true impala::<table>

It is extremely slow. Any suggestions on how to make it run faster? Appreciate the feedback.
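One thing that may be worth trying first (a sketch, not a confirmed fix - it assumes the YARN queue has spare capacity and that the job currently runs with the default executor count): ask for more executors explicitly so the backup scans run with more parallelism. --num-executors is a standard spark-submit option; the value 20 below is only an illustration.

spark2-submit --principal <user> --keytab <keytab> \
  --master yarn --deploy-mode cluster --queue <queue> \
  --num-executors 20 --executor-memory 12G --executor-cores 4 \
  --driver-memory 4G --driver-cores 1 \
  --class org.apache.kudu.backup.KuduBackup kudu-backup2_2.11-1.13.0.7.1.5.0-257.jar \
  --kuduMasterAddresses $KUDU_MASTERS --rootPath hdfs:///backups \
  --forceFull true impala::<table>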
Labels:
- Apache Kudu
04-06-2021
10:49 AM
BTW, files with the ".tmp" extension could be under any subdirectory of "/backups". Thanks.
04-06-2021
10:45 AM
Hi, in BDR HDFS replication I want to exclude all files that end with ".tmp" under the directory "/backups/".
I would appreciate it if somebody could give the expression to add in the BDR "Add Exclusion" field.
Thanks
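For reference, a pattern worth trying - this assumes (my assumption, not something I have confirmed) that the BDR exclusion filter is a regular expression matched against the full file path, so it would also catch files in subdirectories:

.*\.tmp$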
Labels:
- Cloudera Manager
- HDFS
04-23-2020
06:53 AM
Funnily enough, nothing turned up in the log. But we got this alert: "Content: The health test result for HTTPFS_SCM_HEALTH has become bad: This role's process exited. This role is supposed to be started." We are using Cloudera 5.15.1 on RHEL.
04-22-2020
06:53 AM
We are experiencing out-of-memory errors with HttpFS. This happens when users use Hue to access a particular large folder in HDFS. We increased "Java Heap Size of HttpFS" to 1 GB, but we are still facing the issue. There is also a "Java Client Heap Size" parameter - will increasing that help in our case? Appreciate the insights.
Labels:
- HDFS
04-16-2020
05:50 AM
We are trying to implement alerting in our cluster, and alerting is set up in Cloudera Manager, so when I stop a service in Cloudera Manager an alert is sent to my email. However, I hear that stopping a service from CM is not the same as the service crashing on its own, especially with regard to canary alerts, which we will not get if we stop a service through Cloudera Manager. So will I not get canary alerts for a service if the service is stopped through Cloudera Manager? I would also like to know how to stop a service through the Cloudera Manager API. I would appreciate it if a forum member could give the command to stop, say, Oozie or HBase through the Cloudera Manager API. Appreciate the help.
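For the API part, a minimal sketch, assuming Cloudera Manager 5.x (API version v19), an admin user, a cluster named "Cluster 1", and a service named "oozie" - substitute your own host, credentials, cluster name, and service name:

curl -u admin:admin -X POST \
  "http://<cm-host>:7180/api/v19/clusters/Cluster%201/services/oozie/commands/stop"

The same endpoint with /commands/start should start the service again.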
Labels:
- Cloudera Manager
03-11-2019
02:02 PM
@Kuldeep Kulkarni, there are many lines with "input data" on the page you referred to - I am not sure which ones to ignore. Should I ignore the sections for datasets/input-events/output-events? That would leave only the workflow section - is that right? Can't I use the coordinator from your shell action example? But in that one I don't see "<app-path>${workflowAppUri}</app-path>". Appreciate the clarification.
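For what it's worth, a minimal daily coordinator sketch with only the workflow section (no datasets or input/output events); it assumes startTime, endTime, and workflowAppUri are defined in job.properties:

<coordinator-app name="daily-coord" frequency="${coord:days(1)}"
                 start="${startTime}" end="${endTime}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
  <action>
    <workflow>
      <app-path>${workflowAppUri}</app-path>
    </workflow>
  </action>
</coordinator-app>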
03-08-2019
07:07 PM
@Kuldeep Kulkarni, does your example (https://community.hortonworks.com/articles/27497/oozie-coordinator-and-based-on-input-data-events.html) set the job to run once a day? If not, can you please let me know how to do that? I want to run a job once daily. Thanks.
03-08-2019
06:58 PM
Hi Kuldeep, thanks so much for the clarification. I will try to follow your instructions and let you know how it goes. Thanks again.
03-08-2019
02:34 PM
@Kuldeep Kulkarni, I created a Python action based on https://community.hortonworks.com/content/supportkb/151119/how-to-run-a-python-script-using-oozie-shell-actio.html. But how do I integrate coordinator.xml with that? I tried creating the file, but the job is not executing according to it. Is there somewhere in job.properties or workflow.xml where you reference coordinator.xml? Appreciate the feedback.
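A job.properties sketch along those lines - the coordinator is not referenced from workflow.xml; instead, oozie.coord.application.path points at the HDFS directory that holds coordinator.xml. The hosts and dates below are placeholders to adapt, and the path matches the one used elsewhere in this thread:

nameNode=hdfs://<namenode-host>:8020
jobTracker=<resourcemanager-host>:8032
queueName=default
oozie.use.system.libpath=true
oozie.coord.application.path=${nameNode}/user/root/apps/shell
workflowAppUri=${nameNode}/user/root/apps/shell
startTime=2019-03-09T00:00Z
endTime=2020-03-09T00:00Z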
03-06-2019
07:21 PM
Yes, I checked the nodes and found the output on one of them. I reran to make sure. So the files are not really needed on the local Linux box - job.properties, the shell script, workflow.xml, coordinator.xml - they need to be only in HDFS? Next, how do I execute Python code from Oozie? I also want it to run daily. Appreciate the insights.
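A workflow.xml sketch for running a Python script through the Oozie shell action - myscript.py is a hypothetical name for a script uploaded to the same HDFS application directory, and the daily schedule comes from the coordinator rather than the workflow:

<workflow-app name="python-shell-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="python-node"/>
  <action name="python-node">
    <shell xmlns="uri:oozie:shell-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <exec>myscript.py</exec>
      <file>${workflowAppUri}/myscript.py#myscript.py</file>
      <capture-output/>
    </shell>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>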
03-06-2019
07:10 PM
>this file should be created locally on the node manager where your shell script was run
I have the file locally on a node; it is from this node that I execute the oozie command. Are you saying that the shell script should be in HDFS, and that YARN will execute the shell script on whichever NodeManager the job runs on, rather than on the host from which I run the oozie command? Appreciate the feedback.
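A sketch of the upload-and-submit steps, assuming the application directory used elsewhere in this thread (/user/root/apps/shell) and a hypothetical script name test.sh; job.properties stays local on the host where the oozie CLI is run:

hdfs dfs -put -f workflow.xml coordinator.xml test.sh /user/root/apps/shell/
oozie job -oozie http://<oozie-host>:11000/oozie -config job.properties -run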
03-05-2019
06:18 PM
In the YARN log for the job I can see:

=================================================================
>>> Invoking Shell command line now >>

Exit code of the Shell command 0

<<< Invocation of Shell command completed <<<

<<< Invocation of Main class completed <<<

Oozie Launcher, capturing output data:
=======================
03-05-2019
02:54 PM
I created a similar job as in https://community.hortonworks.com/content/supportkb/48985/how-to-setup-oozie-shell-action.html. On execution the YARN log shows success, but I can't see the /tmp/output file being created anywhere - I checked on the local Linux host as well as in HDFS. One question I have about job.properties:

oozie.coord.application.path=${nameNode}/user/root/apps/shell

Should the configuration files (job.properties, coordinator.xml, etc.) reside in the above directory? I have them there. Not sure what is happening. Appreciate the insights.
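One way to track down /tmp/output (a guess at the cause, not a confirmed diagnosis): the shell action runs inside a YARN container, so the file is created on whichever NodeManager host ran that container, not on the submission host. The aggregated logs name that host; the application id placeholder below needs to be filled in from the Oozie launcher or the YARN UI:

yarn logs -applicationId <application_id> | grep "Container: "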
Tags:
- Oozie
Labels:
- Apache Oozie
12-28-2018
11:22 AM
So we can use regex with Flume/HBase only for delimited or fixed-length data? We cannot use regex with data that has no delimiters or has variable-length fields? Appreciate the clarification.
Labels:
- Apache Flume
- Apache HBase
12-20-2018
07:24 PM
This is the HBase sink part of the Flume configuration:

r_hbase.sinks.sink1.channel = channel1
r_hbase.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
r_hbase.sinks.sink1.table = mev:rtable
r_hbase.sinks.sink1.columnFamily = me_data
r_hbase.sinks.sink1.serializer = org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
r_hbase.sinks.sink1.serializer.schemaURL = hdfs://host.com:8020/tmp/avroschemas-new/rtable.json
r_hbase.sinks.sink1.serializer.columns = col1,col2,col3,col4,col5,col6,col7,col8,col9,col10
12-20-2018
07:15 PM
We have data coming into HDFS through Flume. This data is serially encoded, and we use an Avro schema to decode it. Now we want to put this data from Flume into an HBase table instead of HDFS. We use the parameter below to do the decoding based on the Avro schema:

ragent.sinks.sink1.serializer.schemaURL = hdfs://host.com:8020/tmp/avroschemas-new/rtable.json

Will this same argument work when the target is HBase instead of HDFS, along with the HBase table, column family, etc.? For some reason the data is not going into HBase properly, and I am not sure whether the Avro schema is the issue. Appreciate the insights.
Labels:
- Apache Flume
- Apache HBase
12-19-2018
06:28 PM
@Naveen my HBase table is created with only a single column family:

create 'mbev:hb_test' , 'me_data'

Then data is fed into the HBase table through Flume. This data was originally sent to a Hive table with 10 columns, but because of a small-files issue I am redirecting the same data to the HBase table above. My Flume config has the following lines:

r_hbase.sinks.sink1.table = mbev:hb_test
r_hbase.sinks.sink1.columnFamily = me_data
r_hbase.sinks.sink1.serializer.columns = col1,col2,col3,col4,col5,col6,col7,col8,col9,col10

So I can see data coming into the HBase table. It looks like this:

hbase(main):008:0> scan "mbev:hb_test"
ROW                   COLUMN+CELL
 default0998c6b9-2fa9 column=me_data:pCol, timestamp=1545242255268, value=.z3knt
 -122e-1536-0o42ef7fb dErc90GqYg5a3n-zTQ\a04NQA32018-12-19-12.57.30.000123\x09FY
 a3e                  YY\q1V10006002079317\x01M\x00\x09KEF\x00\x00

So how do I create the Hive table to see the data in the HBase table? I tried as we did above initially with only ":key,me_data:id", but I don't see the data in the Hive table. I also tried:

create external table tmp_test4 (col1 string, col2 string, col3 string, col4 string, col5 string, col6 string, col7 string, col8 string, col9 string, col10 string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,me_data:col1,me_data:col2,me_data:col3,me_data:col4,me_data:col5,me_data:col6,me_data:col7,me_data:col8,me_data:col9,me_data:col10")
TBLPROPERTIES("hbase.table.name" = "mbev:hb_test");

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.hbase.HBaseSerDe: columns has 10 elements while hbase.columns.mapping has 11 elements (counting the key if implicit))

Appreciate your help.
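The error itself is about the column count: with ":key" included, hbase.columns.mapping has 11 entries, so the Hive table needs 11 columns, one of them for the row key. A sketch of that fix ("rowkey" is just an illustrative name; note also that, per the scan above, Flume is currently writing everything into the single qualifier me_data:pCol, so the per-column mapping will only show data once the serializer actually writes col1 through col10):

create external table tmp_test4 (rowkey string, col1 string, col2 string, col3 string, col4 string, col5 string, col6 string, col7 string, col8 string, col9 string, col10 string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,me_data:col1,me_data:col2,me_data:col3,me_data:col4,me_data:col5,me_data:col6,me_data:col7,me_data:col8,me_data:col9,me_data:col10")
TBLPROPERTIES ("hbase.table.name" = "mbev:hb_test");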
12-18-2018
08:49 PM
BTW, the create table statement shows two columns: create external table hv_test (id string, idate string). So how can this mapping show only one column: WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,me_data:id")? Don't we likewise need to map idate?
12-18-2018
04:51 PM
In my HBase table, data for 10 columns actually lands from Flume. In the Flume configuration I have defined: host1.sinks.sink1.serializer.columns = col1, col2 ... In HBase the table is only defined as above: create 'mbev:hb_test' , 'me_data'. So how do I define the Hive table for the data in those 10 columns?
12-17-2018
06:07 PM
I created a table in HBase, created a table in Hive mapping it to the HBase table, and inserted a row into the HBase table. But the Hive table doesn't show any rows. Any idea what I am doing wrong? Appreciate the feedback.

hbase(main):019:0> create 'mbev:hb_test' , 'me_data'
0 row(s) in 1.2670 seconds

hbase(main):006:0> put "mbev:hb_test",'1',"me_data:id",'1'
0 row(s) in 0.1700 seconds

hbase(main):007:0> scan "mbev:hb_test"
ROW    COLUMN+CELL
 1     column=me_data:id, timestamp=1545064141017, value=1
1 row(s) in 0.0590 seconds

hive> create external table hv_test (id string, idate string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,me_data:idate")
TBLPROPERTIES("hbase.table.name" = "mbev:hb_test");

hive> select * from hv_test;
OK
Time taken: 0.23 seconds
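One difference that stands out (a guess, not a confirmed diagnosis): the put writes to the qualifier me_data:id, but the mapping only references me_data:idate, so the one populated qualifier is never mapped. A sketch that maps the qualifier actually written - the first Hive column always maps to :key, i.e. the row key:

drop table hv_test;
create external table hv_test (rk string, id string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,me_data:id")
TBLPROPERTIES ("hbase.table.name" = "mbev:hb_test");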
Labels:
- Apache HBase
- Apache Hive
11-30-2018
04:28 PM
I tried the below but it didn't work:

hive> create view testview as select * from test1 where id = "{$hiveconf:id}";
OK
Time taken: 0.13 seconds
hive> set id=1;
hive> select * from testview;

The query above did not return any rows.
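Two details that may matter here (offered as a guess): the substitution syntax is ${hiveconf:id}, with the dollar sign outside the braces, and the value is substituted when the statement is parsed, so whatever id happens to be at view-creation time gets frozen into the view definition. A sketch that parameterizes the query itself instead of the view:

set id=1;
select * from test1 where id = '${hiveconf:id}';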
11-30-2018
04:22 PM
I have two tables, test1 and test2:

hive> select * from test1;
OK
id  year
1   2017
2   2017

hive> select * from test2;
OK
no  year
2   2017
3   2017

Query:

select id, year from test1 where id > 1
union all
select no, year from test2 where no > 1

Question 1: if I put the above query in a view, can I pass a parameter to it to use in the where clause (for id and no)?
Question 2: can I frame the above query without the union all?
Appreciate the feedback.
Tags:
- Data Processing
- Hive
Labels:
- Apache Hive
11-16-2018
06:18 PM
OK, this can be done simply as: partitioned by (yr string, mth string). Thanks.
11-16-2018
06:11 PM
I tried this but it wouldn't work:

create table test_part_bkt_tbl (id string, cd string, dttm string)
partitioned by (yr string)
clustered by (month(dttm)) into 12 buckets;
11-16-2018
05:54 PM
If I partition a table by year, can I further bucket it by month? The idea is that year would be the top level and the months would be at a level beneath it, so the directory structure would be: 2018 -> 1, 2, 3 ... 12 and 2019 -> 1, 2, 3 ... 12. Is this what bucketing is about, or should I be doing this with partitions themselves? Appreciate the insights.
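For the layout described (year directories with month subdirectories), a sketch using two partition columns; bucketing, by contrast, hashes rows into a fixed number of files inside each partition directory rather than creating subdirectories, so it would not produce this structure. The table name below is only illustrative:

create table test_part_tbl (id string, cd string, dttm string)
partitioned by (yr string, mth string);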