Member since
07-31-2013
1924
Posts
462
Kudos Received
311
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1969 | 07-09-2019 12:53 AM |
| | 11879 | 06-23-2019 08:37 PM |
| | 9143 | 06-18-2019 11:28 PM |
| | 10127 | 05-23-2019 08:46 PM |
| | 4577 | 05-20-2019 01:14 AM |
07-27-2015
03:03 AM
1 Kudo
Yes, that will work if you use a map-reduce action type to define that configuration property. If you are using a java action type instead, you will also need to load the configuration in the driver explicitly: http://archive.cloudera.com/cdh5/cdh/5/oozie/WorkflowFunctionalSpec.html#a3.2.7_Java_Action

"""
A java action can create a Hadoop configuration for interacting with a cluster (e.g. launching a map-reduce job). Oozie prepares a Hadoop configuration file which includes the environments site configuration files (e.g. hdfs-site.xml, mapred-site.xml, etc) plus the properties added to the <configuration> section of the java action. The Hadoop configuration file is made available as a local file to the Java application in its running directory. It can be added to the java actions Hadoop configuration by referencing the system property: oozie.action.conf.xml. For example:

// loading action conf prepared by Oozie
Configuration actionConf = new Configuration(false);
actionConf.addResource(new Path("file:///", System.getProperty("oozie.action.conf.xml")));

If oozie.action.conf.xml is not added then the job will pick up the mapred-default properties and this may result in unexpected behaviour. For repeated configuration properties later values override earlier ones.
"""
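To make the last sentence of the quoted passage concrete, here is a minimal Python sketch of "later values override earlier ones" when Hadoop-style configuration XML documents are merged in order. It only mimics the merge order for illustration: it is not Hadoop's Configuration class, it ignores properties marked final, and the `merge_configs` helper and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_properties(xml_text):
    """Return (name, value) pairs from a Hadoop-style <configuration> document."""
    root = ET.fromstring(xml_text)
    return [
        (prop.findtext("name"), prop.findtext("value"))
        for prop in root.findall("property")
    ]


def merge_configs(*xml_documents):
    """Merge documents in order; a later value replaces an earlier one."""
    merged = {}
    for document in xml_documents:
        for name, value in parse_properties(document):
            merged[name] = value
    return merged


# Hypothetical stand-ins for the default settings and the Oozie action conf.
DEFAULTS = """<configuration>
  <property><name>mapreduce.job.reduces</name><value>1</value></property>
  <property><name>mapreduce.job.queuename</name><value>default</value></property>
</configuration>"""

ACTION_CONF = """<configuration>
  <property><name>mapreduce.job.reduces</name><value>4</value></property>
</configuration>"""

merged = merge_configs(DEFAULTS, ACTION_CONF)
print(merged)
```

If the action conf is never loaded, only the first document applies, which is the "picks up the defaults" situation the quote warns about.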
07-24-2015
06:34 PM
I've posted a reply on the other thread you opened for this topic: http://community.cloudera.com/t5/Batch-Processing-and-Workflow/Cloudera-5-4-x-Oozie-Custom-Action-python-to-configure-classes/m-p/29952 Let's carry on there, and mark this thread resolved (for the benefit of others looking for the same thing)?
07-24-2015
12:54 AM
The option is useful to assert that the target table does not already exist, so that data is never appended into an existing table. Some workflows that load staging tables require this, for example.
07-24-2015
12:15 AM
Check out the Sqoop User Guide documentation at http://archive.cloudera.com/cdh5/cdh/5/sqoop/SqoopUserGuide.html:

| --create-hive-table | If set, then the job will fail if the target hive table exists. By default this property is false. |
07-23-2015
10:55 PM
Use Sqoop if you have to import or export data from/to an RDBMS such as MySQL or Oracle. It will allow you to grab data from there and load it into HDFS. If you already have your data in HDFS, then Sqoop may not be what you are looking for (unless you want to write data back to an RDBMS).
07-23-2015
10:56 AM
2 Kudos
Please restart the failed services. Note that HBase depends on ZooKeeper (ZK) session timeouts for liveness, so it may sometimes not survive VM pauses (such as when you hibernate your machine) and would then need to be restarted before use. This isn't a problem on actual clusters, which don't experience such pauses/gaps in machine availability.
07-23-2015
05:58 AM
It should be possible to perform SecureBulkLoad without enabling Kerberos, although I have not personally tested this mix. The config steps may involve just configuring the SecureBulkLoad endpoint and the staging directory configs on the RegionServers (RSes) and clients.
07-23-2015
05:54 AM
We carry a page in our regular documentation that maps every field you see in CM to its CM API name. This page can be found, for the Oozie service for example, at http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cm_props_cdh540_oozie.html

For your specific two properties, the name mapping can be found by looking at the above page as:

1. Oozie ActionService Executor Extension Classes => oozie_executor_extension_classes
2. Oozie SchemaService Workflow Extension Schemas => oozie_workflow_extension_schemas

So, following the updating config example at http://cloudera.github.io/cm_api/docs/python-client/#configuring-services-and-roles, but applying it to your Oozie properties, the code would roughly look like the below:

# Get a handle to the API client
from cm_api.api_client import ApiResource

cm_host = "cm-host"
api = ApiResource(cm_host, username="admin", password="admin")

# Get the cluster (assumes this CM manages a CDH5 cluster)
cdh = None
for c in api.get_all_clusters():
    print(c.name)
    if c.version == "CDH5":
        cdh = c

# Get the Oozie service
oozie = None
for s in cdh.get_all_services():
    print(s)
    if s.type == "OOZIE":
        oozie = s

# Update the two Oozie properties, using their CM API names
oozie.update_config({
    'oozie_executor_extension_classes': 'com.mycompany.MyClass',
    'oozie_workflow_extension_schemas': 'my-class.xsd',
})

Does this help?
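The two lookup loops above can be folded into a small helper. This is only a sketch: `find_by_attr` is a hypothetical helper name (it is not part of cm_api), and it relies only on the attributes already used above (`version` on clusters, `type` on services). The stand-in objects exist purely so the snippet runs without a Cloudera Manager instance.

```python
from collections import namedtuple


def find_by_attr(items, attr, wanted):
    """Return the first item whose attribute `attr` equals `wanted`, else None."""
    for item in items:
        if getattr(item, attr, None) == wanted:
            return item
    return None


# Stand-ins for the objects returned by cm_api (hypothetical, for illustration).
FakeCluster = namedtuple("FakeCluster", ["name", "version"])
FakeService = namedtuple("FakeService", ["name", "type"])

clusters = [FakeCluster("staging", "CDH4"), FakeCluster("prod", "CDH5")]
services = [FakeService("hdfs1", "HDFS"), FakeService("oozie1", "OOZIE")]

cdh = find_by_attr(clusters, "version", "CDH5")
oozie = find_by_attr(services, "type", "OOZIE")
missing = find_by_attr(services, "type", "HBASE")

print(cdh.name, oozie.name, missing)
```

With a real client you would pass `api.get_all_clusters()` and `cdh.get_all_services()` instead of the stand-in lists, and check the result for `None` before calling `update_config` on it.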
07-23-2015
05:45 AM
You have a couple of options at least:

1. Import from Sqoop directly into the required file format and/or table, instead of just delimited text. Sqoop supports Text, Sequence, Avro and Parquet formats.
2. Import into a new (temporary) table that's created with the delimited format specifiers. Load the data into this table via a LOAD DATA LOCAL INPATH statement, then use an INSERT INTO statement to move the data into the original table that uses a different file format.
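For the first option, the file format is chosen with one of Sqoop's `--as-*` import flags. Below is a minimal Python sketch that assembles such a command line. The JDBC URL, table name and target directory are placeholders, and `build_sqoop_import` is a hypothetical helper, not part of Sqoop.

```python
import shlex

# Sqoop import flags for the four file formats mentioned above.
FORMAT_FLAGS = {
    "text": "--as-textfile",
    "sequence": "--as-sequencefile",
    "avro": "--as-avrodatafile",
    "parquet": "--as-parquetfile",
}


def build_sqoop_import(connect, table, target_dir, file_format="text"):
    """Return the argv list for a `sqoop import` in the given file format."""
    if file_format not in FORMAT_FLAGS:
        raise ValueError("unsupported format: %s" % file_format)
    return [
        "sqoop", "import",
        "--connect", connect,
        "--table", table,
        "--target-dir", target_dir,
        FORMAT_FLAGS[file_format],
    ]


# Placeholder connection details, for illustration only.
argv = build_sqoop_import(
    "jdbc:mysql://db.example.com/sales", "orders", "/user/me/orders", "parquet"
)
print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the command as a list keeps each argument separate, so it can be handed to `subprocess.run(argv)` without shell quoting issues.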