Member since: 07-31-2013
Posts: 98
Kudos Received: 54
Solutions: 19
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2986 | 11-23-2016 07:37 AM |
| | 3105 | 05-18-2015 02:04 PM |
| | 5263 | 05-13-2015 07:33 AM |
| | 4028 | 05-12-2015 05:36 AM |
| | 4361 | 04-06-2015 06:05 AM |
02-18-2014
07:23 AM
Thanks, can you follow the steps I provided for CM and see if that helps?
02-18-2014
07:13 AM
1 Kudo
Hey,

A couple of things:

1. This section is not necessary. It doesn't hurt anything, but you don't need it:

```
<configuration>
  <property>
    <name>oozie.hive.defaults</name>
    <value>/user/someuser/hive-default.xml</value>
  </property>
</configuration>
```

2. What version of CDH are you using?

3. Are you using CM? It really looks like you still don't have the credentials configured in oozie-site.xml. If you are using CM, that might be why. Can you check the instructions below and configure oozie-site.xml for credentials?

For the Oozie configuration:

1. If using CM, add an Oozie proxy user to the core-site.xml for the Hive metastore server. Go to "HDFS Service->Configuration->Service-Wide->Advanced->Cluster-wide Configuration Safety Valve for core-site.xml".

2. Add:

```
<property>
  <name>hadoop.proxyuser.oozie.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.oozie.groups</name>
  <value>*</value>
</property>
```

3. Restart the Hive metastore server.

4. Add the HCat credentials class to oozie-site.xml. Edit the file and add:

```
<property>
  <name>oozie.credentials.credentialclasses</name>
  <value>hcat=org.apache.oozie.action.hadoop.HCatCredentials</value>
</property>
```

5. If using CM, add the same property via the safety valve instead: go to "Oozie service->Configuration->Oozie Server(default)->Advanced->Oozie Server Configuration Safety Valve for oozie-site.xml".

6. Add:

```
<property>
  <name>oozie.credentials.credentialclasses</name>
  <value>hcat=org.apache.oozie.action.hadoop.HCatCredentials</value>
</property>
```

7. Restart Oozie.
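For reference, once the credentials class is registered, the workflow itself has to declare and reference the credential. A minimal sketch (the workflow name, action name, metastore URI, principal, and script name are placeholders for your environment, not values from this thread):

```
<workflow-app xmlns="uri:oozie:workflow:0.4" name="hive-cred-example">
  <credentials>
    <!-- "hcat_cred" is an arbitrary name; type "hcat" maps to the
         HCatCredentials class registered in oozie-site.xml above -->
    <credential name="hcat_cred" type="hcat">
      <property>
        <name>hcat.metastore.uri</name>
        <value>thrift://metastore-host:9083</value>
      </property>
      <property>
        <name>hcat.metastore.principal</name>
        <value>hive/_HOST@EXAMPLE.COM</value>
      </property>
    </credential>
  </credentials>
  <start to="hive-node"/>
  <!-- cred= attaches the credential to this action -->
  <action name="hive-node" cred="hcat_cred">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>script.q</script>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Hive action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```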
02-04-2014
01:57 PM
2 Kudos
Hello,

Here is an example if you are using a Driver class. Steps:

1. Pull the source code for the new PiEstimator and compile with Maven. Requires Git, Maven and Java:

```
git clone https://github.com/cmconner156/oozie_pi_load_test.git
cd oozie_pi_load_test/PiEstimatorKrbSrc
vi pom.xml   # set hadoop-core and hadoop-client to match your version
mvn clean install
```

2. Copy oozie_pi_load_test/PiEstimatorKrbSrc/target/PiEstimatorKrb-1.0.jar to some location in HDFS. Make sure it's readable by whichever Hue user will run the workflow.
3. In the Hue browser, go to the Oozie app.
4. Go to the Workflows tab.
5. Click "Create".
6. Enter a name and description, then click Save.
7. Drag "Java" from the actions above to the slot between "start" and "end".
8. Give it a name and description.
9. For the Jar name, click the browse button and find the PiEstimatorKrb-1.0.jar file you put in HDFS.
10. For "Main Class" enter "com.test.PiEstimatorKrb".
11. For "Arguments" enter "<tempdir> <nMaps> <nSamples>", replacing those with correct values, for example "/user/cconner/pi_temp 4 1000". Base nMaps and nSamples on what you would normally use for the Pi example.
12. Click "add path" next to "Files" and search for PiEstimatorKrb-1.0.jar in HDFS.
13. Click Done, then click Save.
14. Click Submit on the left.

Here is an example not using a driver class. Steps:

1. Put /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar somewhere in HDFS (I put it in /user/oozie) and make it readable by everyone.
2. Create a directory in HDFS for the job. I did "hadoop fs -mkdir teragen_oozie".
3. Create an empty input directory in HDFS for the job. I did "hadoop fs -mkdir teragen_oozie/input".
4. Go into Hue->Oozie and click Create workflow.
5. Enter Name, Description, and "HDFS deployment directory", setting it to the location above.
6. Click Save.
7. Click the + button for Mapreduce.
8. Enter a name for the MR task.
9. For Jar name, browse to the location where you put hadoop-mapreduce-examples.jar above.
10. Click "Add Property" for Job Properties and add the following:
    - mapred.input.dir = hdfs://cdh412-1.test.com:8020/user/admin/teragen_oozie/input
    - mapred.output.dir = hdfs://cdh412-1.test.com:8020/user/admin/teragen_oozie/output
    - mapred.mapper.class = org.apache.hadoop.examples.terasort.TeraGen$SortGenMapper
    - terasort.num-rows = 500
11. Click "Add delete" for Prepare and specify "hdfs://cdh412-1.test.com:8020/user/admin/teragen_oozie/output" as the location.
12. Click Save.
13. Now run the workflow and it should succeed.

NOTE: change cdh412-1.test.com:8020 to the correct NN for your environment.

Hope this helps!
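For reference, the Hue steps in the first example produce a workflow.xml roughly like the following sketch. The workflow name, HDFS jar path, and the ${jobTracker}/${nameNode} parameters are placeholders; the main class and arguments come from the steps above:

```
<workflow-app xmlns="uri:oozie:workflow:0.4" name="pi-estimator-example">
  <start to="pi-java"/>
  <action name="pi-java">
    <java>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <main-class>com.test.PiEstimatorKrb</main-class>
      <!-- <tempdir> <nMaps> <nSamples>, as in the steps above -->
      <arg>/user/cconner/pi_temp</arg>
      <arg>4</arg>
      <arg>1000</arg>
      <!-- ships the jar with the action; path is a placeholder -->
      <file>/user/cconner/PiEstimatorKrb-1.0.jar</file>
    </java>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Java action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>
```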
02-04-2014
10:48 AM
1 Kudo
No problem! The workflow and coordinator definitions are in the Hue DB, and their XML is in HDFS; the only thing missing is the status history, which Hue pulls from Oozie. Everything you need to resubmit the coordinators and workflows is in the Hue DB and HDFS, so clearing the Oozie DB won't cause any problems there. Thanks Chris
02-04-2014
08:40 AM
1 Kudo
These steps with RazorSQL will get you the historical data, but you will still need to resubmit the jobs:

1. Stop Oozie.
2. Back up the original Derby DB file somewhere: copy /var/lib/oozie/data/ to a backup location.
3. Gather the info for connecting to the Derby DB. On the Oozie server, run the following grep command. It will give you all the DB connect info:

```
grep -A1 JPA "/var/run/cloudera-scm-agent/process/`ls -alrt /var/run/cloudera-scm-agent/process/ | grep OOZIE | tail -1 | awk '{print $9}'`/oozie-site.xml"
<name>oozie.service.JPAService.jdbc.driver</name>
<value>org.apache.derby.jdbc.EmbeddedDriver</value>
<name>oozie.service.JPAService.jdbc.username</name>
<value>sa</value>
<name>oozie.service.JPAService.jdbc.password</name>
<value></value>
<name>oozie.service.JPAService.jdbc.url</name>
<value>jdbc:derby:/var/lib/oozie/data;create=true</value>
```

4. Install RazorSQL.
5. Make a copy of the Derby DB files for Oozie from the path mentioned in step 2 above and copy them to the host with RazorSQL.
6. From RazorSQL, do the following:
   a. Connections -> Add Connection Profile
   b. Derby -> (continue)
   c. Profile Name: "Oozie DB"
   d. Database Directory: point to the directory with the Oozie DB copy mentioned in step 5
   e. Click Connect
   f. DB Tools -> Export Data
   g. Check "Multiple tables"
   h. Enter schema as "SA"
   i. Generate SQL statements
   j. Generate SQL INSERT statements
   k. Do not export the DDLs!
   l. Export to a single file, "\" to escape single quotes, semicolon as the SQL statement separator
   m. Select a filename (e.g. oozie.sql), then Save
7. Edit the resulting oozie.sql and replace "SA." with the name of the DB in your new database (see the sketch below).
8. Verify the oozie.sql looks good.
9. In CM, reconfigure Oozie to point to the new DB.
10. In CM, run "Create Database" from the Actions drop-down within the Oozie service.
11. Import the oozie.sql or oozie-processed.sql from above.
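For step 7, a one-liner along these lines can do the substitution. A minimal sketch, assuming the target schema is named OOZIE (an assumed placeholder; substitute your actual schema name):

```
# Swap the exported "SA." schema prefix for the new schema name.
# "OOZIE" is an assumed placeholder -- use your own schema name.
# Review the output by hand afterwards: a bare s/SA\./.../ could
# also touch string data that happens to contain "SA.".
sed 's/SA\./OOZIE./g' oozie.sql > oozie-processed.sql
```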
02-04-2014
08:38 AM
Hey, Unfortunately this is something I've never been able to get working. The problem is that Oozie stores the workflows in the DB as blobs, which makes migration very complex. The data in Oozie is the status of all past workflows that have not yet been purged by the purge process, plus info on all the currently running Oozie jobs. So if you resubmit the jobs, you will only lose status information about old jobs. If you can afford to lose the historical data and resubmit the jobs, I would strongly recommend going that route. If not, you can take a look at RazorSQL. It might be able to do it; it's not free, but there is a free trial you can use for a trial run. Hope this helps.
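As an aside, how much of that history accumulates in the DB is governed by Oozie's purge settings in oozie-site.xml. A hedged sketch of the relevant knobs (values shown are illustrative, not recommendations; defaults vary by Oozie version):

```
<!-- Illustrative values only -->
<property>
  <name>oozie.service.PurgeService.older.than</name>
  <!-- Completed workflow jobs older than this many days get purged -->
  <value>30</value>
</property>
<property>
  <name>oozie.service.PurgeService.purge.interval</name>
  <!-- How often the purge service runs, in seconds -->
  <value>3600</value>
</property>
```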
01-16-2014
07:04 AM
18 Kudos
Hey, The /user/ directory is owned by "hdfs" with 755 permissions, so only hdfs can write to that directory. Unlike Unix/Linux, where root is the superuser, in HDFS the superuser is "hdfs". So you would need to do this (with <username> as a placeholder for the user in question):

```
sudo -u hdfs hadoop fs -mkdir /user/<username>
sudo -u hdfs hadoop fs -put myfile.txt /user/<username>/
```

If you want to create a home directory for root so you can store files in his directory, do:

```
sudo -u hdfs hadoop fs -mkdir /user/root
sudo -u hdfs hadoop fs -chown root /user/root
```

Then as root you can do "hadoop fs -put file /user/root/". Hope this helps. Chris
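To double-check the ownership after the chown, a quick listing helps (the output line below is illustrative, not copied from a real cluster):

```
hadoop fs -ls /user
# Expect a line like:
# drwxr-xr-x   - root supergroup          0 2014-01-16 07:00 /user/root
```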
01-13-2014
06:34 AM
When you added the hadoop.proxyuser.oozie.* values, did you restart MapReduce and HBase? They have to be restarted to notice that change. Also, are you using CM? It doesn't look like it, but I wanted to confirm because that would change things. Thanks Chris
11-04-2013
06:30 AM
Hello, The Hive version you have installed is correct for CDH 4.3. I'm not sure why that doc says HS2 was introduced in Hive version 0.11; HiveServer2 has been available in CDH since CDH 4.1.2, and HS2 in CDH 4.3 should work fine. We definitely do not recommend running a different version of Hive with any version of CDH: all the versions included in a CDH release have been thoroughly tested together and confirmed to work together. Have you opened tickets for the issues you are seeing? Thanks Chris
10-27-2013
08:38 AM
Hey Andrey, Somehow I thought this forum was strictly for Oozie :-). Sorry for the confusion; this is the right place for MR, so maybe one of the MR folks can chime in on the MR side of this question. I will still take a look around and see if I can come up with something. Thanks Chris