Created on 05-08-2017 09:06 AM - edited 08-17-2019 01:06 PM
In this article I will explain how to create an Oozie custom action node that clones a git repository to a required file path.
The git repository location and the file path are taken as input when running the workflow.
We will use Ambari's Workflow Manager view to create a workflow using the newly created action node. If you are new to Workflow Manager, this would be a good starting point.
Details about custom action nodes can be found in the Oozie docs.
We need to complete some prerequisites before we jump to the Workflow Manager view and start using custom action nodes.
Step 1 : Implement the Oozie custom action handler : For this article, I will create a custom action which clones a git repository. The implementation extends the ActionExecutor class (provided by Oozie) and overrides the required methods, following the Oozie documentation. The action is synchronous: start() performs the clone and records the result, so check() is not expected to be called.
import java.io.File;

import org.apache.oozie.ErrorCode;
import org.apache.oozie.action.ActionExecutor;
import org.apache.oozie.action.ActionExecutorException;
import org.apache.oozie.action.ActionExecutorException.ErrorType;
import org.apache.oozie.client.WorkflowAction;
import org.apache.oozie.util.XmlUtils;
import org.eclipse.jgit.api.Git;
import org.jdom.Element;
import org.jdom.Namespace;

public class GitActionExecutor extends ActionExecutor {

    private static final String NODENAME = "git";
    private static final String SUCCEEDED = "OK";
    private static final String FAILED = "FAIL";
    private static final String KILLED = "KILLED";

    public GitActionExecutor() {
        super(NODENAME);
    }

    @Override
    public void check(Context context, WorkflowAction action) throws ActionExecutorException {
        // Should not be called for a synchronous action
        throw new UnsupportedOperationException();
    }

    @Override
    public void end(Context context, WorkflowAction action) throws ActionExecutorException {
        // Compare from the constant side so a missing external status maps to ERROR
        String externalStatus = action.getExternalStatus();
        WorkflowAction.Status status = SUCCEEDED.equals(externalStatus)
                ? WorkflowAction.Status.OK
                : WorkflowAction.Status.ERROR;
        context.setEndData(status, getActionSignal(status));
    }

    @Override
    public boolean isCompleted(String externalStatus) {
        // The clone runs inside start(), so the action is complete once start() returns
        return true;
    }

    @Override
    public void kill(Context context, WorkflowAction action) throws ActionExecutorException {
        context.setExternalStatus(KILLED);
        context.setExecutionData(KILLED, null);
    }

    @Override
    public void start(Context context, WorkflowAction action) throws ActionExecutorException {
        try {
            // Get parameters from the node configuration
            Element actionXml = XmlUtils.parseXml(action.getConf());
            Namespace ns = Namespace.getNamespace("uri:custom:git-action:0.1");
            String repository = actionXml.getChildTextTrim("repository", ns);
            // The element name must match the schema, which defines 'filePath'
            File filePath = new File(actionXml.getChildTextTrim("filePath", ns));
            cloneRepo(repository, filePath);
            context.setExecutionData(SUCCEEDED, null);
        } catch (Exception e) {
            context.setExecutionData(FAILED, null);
            throw new ActionExecutorException(ErrorType.FAILED, ErrorCode.E0000.toString(), e.getMessage());
        }
    }

    // Clone the git repository into the given directory
    public void cloneRepo(String repository, File filePath) throws Exception {
        Git.cloneRepository()
                .setURI(repository)
                .setDirectory(filePath)
                .call();
    }
}
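The namespace-qualified lookup in start() is easy to get wrong, so here is a minimal sketch of the same idea that runs without Oozie or JDOM on the classpath. It uses the JDK's DOM parser instead of XmlUtils, and the class name GitActionConfigSketch and the helper childText are hypothetical names introduced only for illustration. It shows that an element name that is not present in the action XML yields no value, so the name read in code has to match the name used in the workflow.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original project: it mimics the
// namespace-qualified lookup done in start(), using only JDK classes.
class GitActionConfigSketch {

    static final String NS = "uri:custom:git-action:0.1";

    // Returns the trimmed text of the first element called 'name' in the
    // git-action namespace, or null when no such element exists.
    static String childText(String actionXml, String name) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(actionXml)));
        NodeList nodes = doc.getDocumentElement().getElementsByTagNameNS(NS, name);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<git xmlns=\"uri:custom:git-action:0.1\">"
                + "<repository> https://github.com/cartershanklin/structor.git </repository>"
                + "<filePath>/tmp/newDir/</filePath>"
                + "</git>";
        System.out.println("repository = " + childText(xml, "repository"));
        System.out.println("filePath   = " + childText(xml, "filePath"));
        // 'hdfsPath' is not an element of this action, so the lookup finds nothing
        System.out.println("hdfsPath   = " + childText(xml, "hdfsPath"));
    }
}
```

Passing a missing value on to new File(...) would fail with a NullPointerException, which the executor would then report as a failed action.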
Step 2 : Define the XML schema for the newly created git action :
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:git="uri:custom:git-action:0.1"
           elementFormDefault="qualified"
           targetNamespace="uri:custom:git-action:0.1">
  <xs:complexType name="GIT">
    <xs:sequence>
      <xs:element name="repository" type="xs:string" />
      <xs:element name="filePath" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
  <xs:element name="git" type="git:GIT"></xs:element>
</xs:schema>
The schema makes 'repository' and 'filePath' mandatory elements, in that order, so both must be supplied before the workflow can run.
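To see what "mandatory" means in practice, the following is a minimal sketch that validates a git action element against the schema with the JDK's javax.xml.validation API. This is not how Oozie itself loads the schema (that happens through the configuration in the later steps); the class name GitSchemaCheckSketch, the method isValid and the inline copy of the schema are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical check, not part of the original project: validates a <git>
// element against an inline copy of the schema from Step 2.
class GitSchemaCheckSketch {

    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\""
            + " xmlns:git=\"uri:custom:git-action:0.1\""
            + " elementFormDefault=\"qualified\""
            + " targetNamespace=\"uri:custom:git-action:0.1\">"
            + "<xs:complexType name=\"GIT\"><xs:sequence>"
            + "<xs:element name=\"repository\" type=\"xs:string\"/>"
            + "<xs:element name=\"filePath\" type=\"xs:string\"/>"
            + "</xs:sequence></xs:complexType>"
            + "<xs:element name=\"git\" type=\"git:GIT\"/>"
            + "</xs:schema>";

    // Returns true when the action XML conforms to the schema, false otherwise.
    static boolean isValid(String actionXml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(actionXml)));
            return true;
        } catch (SAXException e) {
            // Thrown for schema violations such as a missing or misnamed element
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String complete = "<git xmlns=\"uri:custom:git-action:0.1\">"
                + "<repository>https://github.com/cartershanklin/structor.git</repository>"
                + "<filePath>/tmp/newDir/</filePath></git>";
        String missingPath = "<git xmlns=\"uri:custom:git-action:0.1\">"
                + "<repository>https://github.com/cartershanklin/structor.git</repository></git>";
        System.out.println("complete action valid:    " + isValid(complete));
        System.out.println("missing filePath valid:   " + isValid(missingPath));
    }
}
```

Running a check like this before packaging can catch a mismatch between the schema and the XML you plan to paste into the workflow.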
Step 3 : Register the custom executor with the Oozie runtime. This is done by adding the following property to oozie-site.xml :
Add oozie.service.ActionService.executor.ext.classes=GitActionExecutor
Step 4 : Register the XML schema for the new action, also in oozie-site.xml :
Add oozie.service.SchemaService.wf.ext.schemas=gitAction.xsd
Step 5 : Package the action code and the XML schema into a single jar file, upload the jar to '/usr/hdp/current/oozie-server/libext' and restart the Oozie server.
Step 6 : Now navigate to the Workflow Manager view and open a new workflow window.
Step 7 : Select the new custom action node from the action node list.
Step 8 : Define the custom XML with the git repository URL and the file path where the repository should be cloned.
<git xmlns="uri:custom:git-action:0.1">
  <repository>https://github.com/cartershanklin/structor.git</repository>
  <filePath>/tmp/newDir/</filePath>
</git>
Step 9 : Preview the workflow XML and confirm that the workflow contains the custom action node 'git'.
Step 10 : As everything is set up, we can now run this workflow. On successful completion of the workflow run, the git repository will be cloned to the given input path /tmp/newDir/
This project can be downloaded from https://github.com/ssharma555/oozie-git-clone.git
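One quick way to confirm the result is to look for the .git directory that a regular (non-bare) clone creates inside the target directory. The sketch below does that check; the class name CloneCheckSketch and the method looksLikeClone are hypothetical, and it assumes you run it on the host where the clone was written, which should be the Oozie server's local filesystem since the executor uses java.io.File.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical check, not part of the original project: reports whether a
// directory looks like the working tree of a non-bare git clone.
class CloneCheckSketch {

    // A non-bare clone keeps its repository metadata in a '.git' subdirectory.
    static boolean looksLikeClone(Path dir) {
        return Files.isDirectory(dir.resolve(".git"));
    }

    public static void main(String[] args) {
        // Defaults to the path used in this article; pass another path as the first argument
        Path target = Paths.get(args.length > 0 ? args[0] : "/tmp/newDir/");
        if (looksLikeClone(target)) {
            System.out.println(target + " contains a .git directory");
        } else {
            System.out.println(target + " does not contain a .git directory");
        }
    }
}
```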
Reference : https://www.infoq.com/articles/ExtendingOozie