Member since: 07-08-2013
Posts: 35
Kudos Received: 19
Solutions: 5

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 7967 | 03-14-2016 09:34 PM
 | 2696 | 02-05-2016 12:11 PM
 | 6807 | 08-12-2015 11:44 AM
 | 2073 | 08-14-2014 10:14 AM
 | 4502 | 10-14-2013 11:53 AM
12-19-2017
10:09 AM
1 Kudo
Try running curl http://oozie_host:oozie_port/oozie/v2/admin/status
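A healthy Oozie server answers the v2 admin status endpoint with its system mode; for example (host and port are placeholders, and the exact JSON may vary by version):

```shell
curl http://oozie_host:oozie_port/oozie/v2/admin/status
# a healthy server returns JSON along the lines of: {"systemMode":"NORMAL"}
```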
03-14-2016
09:34 PM
1 Kudo
Hi, You can use the SSH Action to execute shell scripts on specific nodes. See http://archive.cloudera.com/cdh5/cdh/5/oozie/DG_SshActionExtension.html And here's a very detailed example of using the SSH Action: http://hadooped.blogspot.com/2013/10/apache-oozie-part-13-oozie-ssh-action_30.html
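Here's a minimal SSH action sketch, following the schema from the docs linked above; the host, script path, and transition names are placeholders:

```xml
<action name="run-remote-script">
    <ssh xmlns="uri:oozie:ssh-action:0.1">
        <host>someuser@worker-node-1.example.com</host>
        <command>/home/someuser/scripts/myscript.sh</command>
        <args>arg1</args>
        <capture-output/>
    </ssh>
    <ok to="next-action"/>
    <error to="kill-node"/>
</action>
```

Note that the user the Oozie server runs as needs passwordless SSH access to the target host for this to work.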
02-25-2016
12:53 PM
I'm not sure why that's not working. I can see that guava 16 is being passed to Spark and guava 14 isn't there (FYI: you also replaced guava 11 from Hadoop with 16, which may cause problems for Hadoop). Can you try yarn-client or yarn-cluster mode instead of local? My understanding is that local mode doesn't always work right.
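If you switch to yarn-client or yarn-cluster, that's controlled by the <master> element of the Spark action; a sketch (class, jar, and names are placeholders):

```xml
<spark xmlns="uri:oozie:spark-action:0.1">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <master>yarn-cluster</master>
    <name>my-spark-job</name>
    <class>com.example.MyApp</class>
    <jar>${nameNode}/apps/myapp/myapp.jar</jar>
</spark>
```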
02-25-2016
12:08 PM
As I said, you should look at the stdout of the Launcher Job. Can you post the stdout, stderr, and syslogs from the Launcher Job somewhere and link to them here? It contains a lot of useful information and might help narrow down your classpath problem.
02-25-2016
11:55 AM
As I said, you should look at the stdout of the Launcher Job. Can you post the stdout, stderr, and syslogs from the Launcher Job somewhere and link to them here?
02-25-2016
10:51 AM
It sounds like you have the guava 16 jar in the classpath then. If you look at the stdout from the Launcher Job, you should see that it's listed there and that it's being passed to Spark.

> 1. built spark-assembly-1.5.3-hadoop2.6.0.jar with guava 16.0.1 by myself
> 2. renamed it as spark-assembly-1.5.0-cdh5.5.0-hadoop2.6.0-cdh5.5.0.jar under /opt/cloudera/parcels/CDH-5.5.0-1.cdh5.5.0.p0.8/jars

That's going to result in all kinds of problems. You must use CDH Spark with CDH. I can't speak to /etc/spark/conf/classpath.txt (you'll have to ask on the Spark forum), though when run from Oozie, I don't think Spark uses that file. Keep in mind that what you're trying to do here (replacing jars we're shipping) isn't really supported or tested, so it might not even be possible to do what you want.
02-24-2016
09:07 PM
Hi, I think you need to re-read those 4 ways a little more carefully. #1 clearly states:

> There is no need to ever point [oozie.libpath] at the ShareLib location. (I see that in a lot of workflows.) Oozie knows where the ShareLib is and will include it automatically if you set oozie.use.system.libpath=true in job.properties.

which is exactly what you tried. Any of the 4 methods should work (except for #3: we're currently aware of a known issue where the Spark Action does not allow <file> or <archive> tags; we're planning on fixing that in a later release). Keep in mind that because you're trying to replace a jar in the ShareLib, you need to go with #4 anyway and replace the jar in the Spark ShareLib subdirectory (which you did already). Remember that this means any Spark Action that anyone runs will pick up this modification. If you want to protect other users and workflows from your changes, you can create a new directory in the ShareLib, say "spark_guava_16", and set oozie.action.sharelib.for.spark to "spark_guava_16". This is also described in the blog post in the "Overriding the ShareLib" section. If that's not a concern, then you don't need to bother.

Please check the following:
1. Run the oozie admin -shareliblist spark command. It prints out the list of jars from the Spark ShareLib directory that Oozie is currently aware of and using. If you replaced the guava 14 jar with the 16 jar there, it should show up in that output. If not, you need to restart the Oozie server or run the oozie admin -sharelibupdate command. Also pay attention to the lib_<timestamp> directory in the output; perhaps you're changing an old directory.
2. When you run the job, look at the stdout from the Launcher Job. It prints out a lot of useful information, including the classpath. Do you see the guava 16 or 14 jar there?

However, as I said in the email thread on the Oozie mailing list, Spark is expecting guava 14. Guava tends not to be very compatible across major versions, so you may encounter other problems if you force it to use guava 16.
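As a hedged sketch of the "spark_guava_16" override (the HDFS paths, lib_<timestamp> directory, and jar filenames here are illustrative; locate the active ones with oozie admin -shareliblist):

```shell
# Copy the stock Spark sharelib into a new directory
hdfs dfs -cp /user/oozie/share/lib/lib_20160101000000/spark \
             /user/oozie/share/lib/lib_20160101000000/spark_guava_16
# Swap the guava jar in the copy
hdfs dfs -rm  /user/oozie/share/lib/lib_20160101000000/spark_guava_16/guava-14.0.1.jar
hdfs dfs -put guava-16.0.1.jar /user/oozie/share/lib/lib_20160101000000/spark_guava_16/
# Make the Oozie server pick up the change
oozie admin -sharelibupdate
```

Then set oozie.action.sharelib.for.spark=spark_guava_16 in the workflow's job.properties.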
02-05-2016
12:11 PM
We're aware of this issue and there's a fix coming in CDH 5.5.2. In the meantime, you can add the following to your oozie-site.xml (or the safety-valve in CM's Oozie configuration) as a workaround:

<property>
<name>oozie.email.smtp.password</name>
<value>aaa</value>
</property>

The value can be any string (it won't actually be used if oozie.email.smtp.auth is false, the default). For those interested, this is fixed by OOZIE-2365.
10-19-2015
03:34 PM
1 Kudo
Hi, Oozie only uses ZooKeeper when configured for HA (High Availability). CM only emits oozie.zookeeper.connection.string when HA is enabled, even though the ZooKeeper Service option always shows up on Oozie's Configuration page in CM. When CM doesn't emit this, Oozie falls back to the default, which is localhost; but Oozie isn't actually using it or talking to ZooKeeper. To investigate your slower job, you should look through the Oozie Server logs and the action output in YARN to figure out what's actually happening.
08-12-2015
11:44 AM
1 Kudo
What you can do as a workaround is split up your long-running Coordinators. For example, instead of making your Coordinator run for years (or forever), make it run for, say, 6 months, and have an identical Coordinator scheduled to start exactly when that one ends. This allows Oozie to clean up the old child Workflows from each Coordinator every 6 months. Otherwise, you can schedule a cron job to manually delete old jobs from the database. However, please be careful with this: when deleting a workflow job from the WF_JOBS table, you'll also need to delete the workflow actions that belong to it from the WF_ACTIONS table, as well as the coordinator action it belongs to from the COORD_ACTIONS table. If you miss something, it will likely cause problems.
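The split can be sketched as two identical coordinators whose windows abut; the dates, names, and frequency here are illustrative:

```xml
<!-- first 6-month window -->
<coordinator-app name="my-coord-2015H2" frequency="${coord:days(1)}"
                 start="2015-07-01T00:00Z" end="2016-01-01T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
    <!-- same datasets/action in both -->
</coordinator-app>

<!-- second window starts exactly when the first ends -->
<coordinator-app name="my-coord-2016H1" frequency="${coord:days(1)}"
                 start="2016-01-01T00:00Z" end="2016-07-01T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
    <!-- same datasets/action in both -->
</coordinator-app>
```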
08-11-2015
10:44 AM
1 Kudo
Hi, By default, Oozie will not purge child jobs if the parent is not eligible to be purged. In your case, because the Coordinator job is still running, none of the child Workflow jobs will be purged. Which version of CDH are you using? Starting with CDH 5.2.0, you can change it so that Oozie will delete the child jobs even if the parent job is still running. To do that, you can set oozie.service.PurgeService.purge.old.coord.action=true in oozie-site. Also, starting with CM 5.4, the Oozie Configuration page has controls for these configs, so you don't need the safety-valve anymore here.
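For reference, the oozie-site snippet for the purge setting mentioned above (this is the safety-valve form for pre-CM-5.4 setups):

```xml
<property>
    <name>oozie.service.PurgeService.purge.old.coord.action</name>
    <value>true</value>
</property>
```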
06-03-2015
04:54 PM
1 Kudo
By the way, OOZIE-2159 will fix the 'oozie validate' command not supporting custom actions by moving the check from the Oozie client to the Oozie server, where the custom action is configured.
05-27-2015
10:38 AM
Hi, The Launcher Job runs as a map task in an MR job and runs the command from there (Sqoop in this case). Sqoop can then launch additional jobs. Because this Launcher Job sits around waiting for all of the other jobs to finish, it's actually possible to deadlock the cluster or queue, depending on the size of your cluster and/or your scheduler settings. A good sign that this happened is if you see the "heart beat" message over and over again in the Launcher Job, and one or more ACCEPTED jobs in the RM that are not starting because there are not enough resources. It sounds like that might be happening here. I'm not that familiar with the Capacity Scheduler (we typically recommend the Fair Scheduler), so I can't really advise you on that specifically.
05-21-2015
10:24 AM
The "timezone" attribute in the Coordinator is a little misleading. You still have to specify your dates in UTC (hence why they end with 'Z'). The "timezone" attribute is only used for daylight saving time calculations.
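For example (times and names illustrative): a coordinator meant to fire daily at 08:00 Los Angeles time still writes its instants in UTC, and the timezone attribute only governs the DST adjustment:

```xml
<coordinator-app name="daily-8am-pacific" frequency="${coord:days(1)}"
                 start="2015-05-21T15:00Z" end="2015-12-31T15:00Z"
                 timezone="America/Los_Angeles"
                 xmlns="uri:oozie:coordinator:0.4">
    <!-- 15:00Z is 08:00 Pacific Daylight Time; the timezone attribute
         keeps the local hour stable across the DST change -->
</coordinator-app>
```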
05-21-2015
10:15 AM
1 Kudo
Usually if there is a delay between actions, especially if it's 10 minutes, it means that Oozie didn't receive the callback saying that the job finished. To avoid having to poll the remote server frequently, Oozie only checks once every 10 minutes (by default). Obviously we want the Oozie server to find out sooner, so the SSH action is configured to issue an HTTP callback to the Oozie server once it's finished to let Oozie know. If that callback gets blocked somehow, then Oozie will take 10 minutes to notice. Can you make sure that the target machine is able to send a GET request to the Oozie server? curl also needs to be installed (the SSH action uses curl to make the GET request).
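To check the callback path from the target machine (hostname and port are placeholders; 11000 is Oozie's default port):

```shell
# On the SSH action's target host: confirm curl is installed and
# that it can reach the Oozie server over HTTP
which curl
curl http://oozie-server.example.com:11000/oozie/v2/admin/status
```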
05-21-2015
10:09 AM
1 Kudo
Oozie can work with both MR1 and MR2 (YARN), though not at the same time. The VM is configured for MR2 out of the box, but the examples that ship with Oozie target MR1. You can point the example at MR2 by simply changing the "jobTracker" parameter from jobTracker=localhost:8021 (the JT) to jobTracker=localhost:8032 (the RM). Even though it's called "jobTracker", Oozie will use MR2 if you point it at an RM.
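The change in the example's job.properties (default ports shown):

```properties
# MR1: point at the JobTracker
#jobTracker=localhost:8021
# MR2: point at the ResourceManager instead
jobTracker=localhost:8032
```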
05-21-2015
10:00 AM
Usually if there is a delay between actions, it means that Oozie didn't receive the callback from Hadoop. To avoid spamming the JT/RM with job-status requests, Oozie normally only queries it once every ten minutes (by default). The reason you normally don't have to wait 10 minutes is that Oozie configures Hadoop to send a message back to Oozie over HTTP once the job has finished. If that callback doesn't go through for some reason, then you'll typically see a 10-minute delay between actions. Can you check the launcher job logs for any error messages? If it couldn't send the callback, it would probably say something there.
05-21-2015
09:55 AM
That blog post is a little outdated at this point. It all depends on your YARN configuration:
- DefaultContainerExecutor: runs as 'yarn'
- LinuxContainerExecutor:
  - With yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users=false (default), it runs as yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user (default is 'nobody')
  - With yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users=true, it runs as the user submitting the workflow

I'd encourage you to use any of the other actions (e.g. the Java action) if possible; they will all run as the user who submitted the workflow.
05-21-2015
09:46 AM
1 Kudo
Hi, To test a custom action, I'd recommend simply running it in an Oozie server and seeing if you run into any problems. You can also attach a debugger to Oozie to help there. To save on turn-around time, you can also write a unit test that starts up LocalOozie and tries to submit your action.

To attach a debugger to Oozie, add the following to the "Oozie Service Environment Advanced Configuration Snippet (Safety Valve)" in Cloudera Manager (if not using CM, you'd export this in oozie-env.sh):

CATALINA_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"

Besides using the debugger I mentioned above, looking at the launcher output (assuming you added print statements) can also be helpful; you can attach a debugger to the launcher job too, but that's a bit harder because you need to attach it to the MR container instead of the Oozie server.

Oozie is pretty easy to run anywhere: in a VM, on a cluster, or even on your Mac. When developing for Oozie, I typically run it on my Mac against a pseudo-distributed cluster, also on my Mac. You can use oozie.service.HadoopAccessorService.hadoop.configurations to point Oozie at any Hadoop cluster, or even multiple ones. Nothing special; just make sure that your custom action's jar is in the Oozie server's classpath and that, if your custom action has a sharelib, it's deployed in HDFS and Oozie sees it.
09-23-2014
05:37 PM
You need to copy or symlink the following jars from /opt/cloudera/parcels/CDH/lib/hbase to /opt/cloudera/parcels/CDH/lib/oozie/libserver:

hbase-common.jar
hbase-client.jar
hbase-server.jar
hbase-protocol.jar
lib/netty-*Final.jar
lib/htrace-core.jar

and restart Oozie. If using packages, the above paths would be /usr/lib/hbase and /usr/lib/oozie/libserver instead. We're planning on doing this for users out-of-the-box in a future CDH release.
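A sketch of the symlink approach for parcels (the jar list is from above; exact filenames vary by CDH version, so check what's actually in the HBase lib directory):

```shell
cd /opt/cloudera/parcels/CDH/lib/oozie/libserver
for j in hbase-common.jar hbase-client.jar hbase-server.jar hbase-protocol.jar; do
    ln -s /opt/cloudera/parcels/CDH/lib/hbase/"$j" .
done
ln -s /opt/cloudera/parcels/CDH/lib/hbase/lib/netty-*Final.jar .
ln -s /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar .
# then restart the Oozie server
```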
08-14-2014
11:53 AM
2 Kudos
IIRC, for Hue to switch between MR1 and MR2, you have to ensure that a few of the other services are also switched. For example, in CM's Hive configuration page, I believe it has a similar config property to Oozie's that lets you switch between MR1 and MR2. Make sure they are both set to MR2. You should also check if any other services have a similar property. Then restart Hue, of course.
08-14-2014
11:50 AM
1 Kudo
If you look closely, you'll see that your start node is trying to go to "sqoop-node" but your Sqoop node is named "sqoop-node1". These should match.
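In other words, the transition target and the action name have to agree; a sketch:

```xml
<start to="sqoop-node1"/>
<action name="sqoop-node1">
    <!-- sqoop action body -->
</action>
```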
08-14-2014
11:49 AM
3 Kudos
Oozie runs on port 11000; IIRC, 8888 is Hue, and there's no /oozie on Hue 🙂 Try this:

$ oozie admin -sharelibupdate -oozie http://vm-cluster-node1:11000/oozie

Also, instead of having to specify -oozie <url> every time you run an Oozie command, you can set the OOZIE_URL environment variable, e.g.

$ export OOZIE_URL=http://vm-cluster-node1:11000/oozie
08-14-2014
10:14 AM
1 Kudo
Hi, You can definitely do this. I'd have to see the actual exception to be sure, but I'm guessing from "it doesn't find the main method of Sqoop" that you were getting a ClassNotFoundException on SqoopMain? SqoopMain is actually an Oozie class and lives in oozie-sharelib-sqoop-4.0.0-cdh5.0.2.jar in the sqoop sharelib. You'll need to copy this jar into your sqoopPatched sharelib. You may also need some of the other jars from the Sqoop sharelib. The safest thing is to copy the "sqoop" sharelib to "sqoopPatched" and then just replace your custom sqoop-1.4.4-cdh5.0.2.jar in there; this way, you'll be sure to have everything. - Robert
06-09-2014
12:18 PM
Hi, The launcher job sticks around the entire time the sqoop job is running (because it's running the sqoop CLI). So when using Oozie, you need to make sure that your cluster has capacity. In MR1 this was quite common if you didn't have enough map slots. In MR2, I've seen this happen if you haven't configured your Scheduler properly or if you don't have enough memory in your node manager.
02-27-2014
11:06 AM
I don't think that's the same problem described on that other forum; they said that it was OOZIE-1447 that fixed it. However, CDH 4.4.0 has OOZIE-1447 already. Can you post the error? Also keep in mind that using eval isn't recommended/supported.
02-14-2014
05:49 PM
Can you post your workflow.xml? It sounds like you're trying to use the HCat/Hive metastore credentials feature (i.e. a <credentials> section) without adding the Credential classes to oozie-site.xml, i.e.:

<property>
<name>oozie.credentials.credentialclasses</name>
<value>
hcat=org.apache.oozie.action.hadoop.HCatCredentials,
hbase=org.apache.oozie.action.hadoop.HbaseCredentials,
hive2=org.apache.oozie.action.hadoop.Hive2Credentials
</value>
</property>

This page has more details: http://archive.cloudera.com/cdh4/cdh/4/oozie/DG_UnifiedCredentialsModule.html
11-21-2013
09:56 AM
I haven't tested this, but can you try adding the following property to your action's <configuration> section?

dfs.replication=3

You can also try setting the following property in your hdfs-site configuration; you may have to put it in the safety valve for HDFS if it's not exposed in CM:

dfs.replication.max=10

As a general point, I'd strongly recommend you upgrade to a version of CDH 4.x (ideally the latest). We've made some significant improvements in Oozie since CDH 3uX, especially for Coordinators. CDH 3 has also reached end-of-life.
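The action-level override would look like this inside the action definition (property name and value taken from above):

```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
</configuration>
```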
10-14-2013
11:53 AM
1 Kudo
Hi, I don't have a specific answer for you, but I'd guess that because this is using regex-like patterns to match the filenames (or something similar), the bracket characters are "special" characters, and you need to escape them. This is typically done by putting a "\" in front of them, e.g. "[" becomes "\[".
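For example (hypothetical path), the escaped form would look like:

```
/user/me/input/part[001].txt   ->   /user/me/input/part\[001\].txt
```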
09-26-2013
11:25 AM
Hi Jared, What do you mean by the JobID string being empty? OOZIE-1447 fixes an issue caused by the Sqoop action not launching an MR job which, if I understand correctly, should only happen if you're using "eval" (which isn't meant for production use, FYI). Is that what you're doing? In either case, OOZIE-1447 is in CDH 4.4.0 and later, so if this is what you're running into, upgrading to CDH 4.4.0 should fix it.