Created on 09-29-2017 09:47 AM
PROBLEM: Oozie jobs get stuck in the PREP state.
ROOT CAUSE: Possible reasons include:
1. Wrong NameNode host/port in job.properties.
2. Wrong ResourceManager host/port in the configuration.
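A quick way to check both causes is to print what job.properties actually contains and compare it with what the cluster reports. The sketch below is only an illustration: the key names nameNode and jobTracker are the ones commonly used in Oozie example apps and are an assumption here, and get_prop and show_endpoints are hypothetical helper names.

```shell
# Print the value of KEY from a Java-style properties file (last definition wins).
# Usage: get_prop FILE KEY
get_prop() {
  sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Show the endpoints the job will use next to the cluster's default filesystem.
# nameNode and jobTracker are assumed key names; adjust them to your file.
show_endpoints() {
  echo "job.properties nameNode:   $(get_prop "$1" nameNode)"
  echo "job.properties jobTracker: $(get_prop "$1" jobTracker)"
  echo "cluster fs.defaultFS:      $(hdfs getconf -confKey fs.defaultFS)"
}
```

Run it as: show_endpoints job.properties. The jobTracker value should match yarn.resourcemanager.address in yarn-site.xml on the cluster.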
If there are a lot of jobs stuck in the PREP state:
RESOLUTION:
1. Stop the Oozie server from Ambari.
2. Back up the Oozie DB if the cluster is a production cluster.
3. Remove the entries for the stuck jobs from the following tables:
WF_JOBS
COORD_JOBS
WF_ACTIONS
4. Start the Oozie server.
Created on 07-19-2019 05:09 AM
I tried the above steps, but they didn't work.
To run your workflow, use the "-run" option:
oozie job -oozie http://localhost:11000/oozie -config job.properties -run
To submit a coordinator job, use:
oozie job -oozie http://localhost:11000/oozie -config job.properties -submit
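For workflows, the difference matters because -submit only creates the job and leaves it in PREP until it is started with -start, while -run submits and starts it in one step. If a workflow was submitted rather than run, something like the following sketch could start it and show its status. The helper names parse_job_id and submit_and_start are hypothetical, and the URL is the localhost one used above.

```shell
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"

# Extract the job ID from the "job: <id>" line that the oozie CLI prints
# after a successful -submit or -run.
parse_job_id() {
  sed -n 's/^job:[[:space:]]*//p' | head -n 1
}

# Submit a workflow, then start it explicitly so that it leaves PREP.
# Usage: submit_and_start job.properties
submit_and_start() {
  job_id=$(oozie job -oozie "$OOZIE_URL" -config "$1" -submit | parse_job_id)
  [ -n "$job_id" ] || { echo "submit failed" >&2; return 1; }
  oozie job -oozie "$OOZIE_URL" -start "$job_id"
  oozie job -oozie "$OOZIE_URL" -info "$job_id"
}
```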
Created on 09-26-2019 11:15 AM
I wanted to interject that while both of the above are valid possible causes of Oozie jobs getting stuck in the PREP state, there are several other possible causes that may need to be resolved, such as:
1. Issues with the YARN ResourceManager / MR JobTracker, or a lack of resources either for the RM or in the queues of the user running the job.
2. Problems with the Oozie server reaching the Oozie database server, problems with the database server itself, or locks on tables.
3. Lack of resources for Oozie itself, such as callable queue capacity, Java heap, GC thrashing, etc.
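A first pass over these causes can be made from the command line. The following is a rough sketch, not a complete diagnosis: parse_system_mode and oozie_health are hypothetical helper names, the URL is assumed to be the local Oozie server, and database-side problems (locks, connectivity) still have to be checked in the Oozie server log and on the database itself.

```shell
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"

# Pull the mode out of the "System mode: NORMAL" line printed by
# "oozie admin -status".
parse_system_mode() {
  sed -n 's/^System mode:[[:space:]]*//p' | head -n 1
}

oozie_health() {
  mode=$(oozie admin -oozie "$OOZIE_URL" -status | parse_system_mode)
  echo "Oozie system mode: ${mode:-unknown}"
  # First lines of the server's callable queue; a long backlog suggests
  # Oozie itself is short on resources.
  oozie admin -oozie "$OOZIE_URL" -queuedump | head -n 20
  # Applications waiting in ACCEPTED usually point to a lack of queue or
  # cluster resources on the YARN side.
  yarn application -list -appStates ACCEPTED
}
```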
The above is a short list drawn from a review of support cases relating to Oozie jobs stuck in PREP. I want to emphasize that deleting records from the Oozie database should be done ONLY as a last resort, and is only needed if you have a very large number of Oozie workflows that cannot be killed in a timely fashion by an Oozie CLI script. It should be done only at the direction of support, by people knowledgeable in SQL and in the relationships between tables, columns, and rows in the Oozie database, because the schema design lacks referential integrity and constraints. The above post from 2017 also missed one key table, COORD_ACTIONS; if its data is not properly cleaned up, your Oozie purge would break, and other serious problems could follow.
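As an illustration of the kind of Oozie CLI script meant here, the sketch below lists workflows in PREP and kills them one by one. It is only a starting point under some assumptions: extract_wf_ids and kill_prep_workflows are hypothetical helper names, the ID pattern assumes the usual listing where each row starts with a workflow ID ending in -W, and the URL is the local Oozie server. Review the listing by hand before killing anything.

```shell
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"

# Keep only the workflow IDs (first column, ending in -W) from the output
# of "oozie jobs"; header and separator lines are dropped.
extract_wf_ids() {
  grep -Eo '^[0-9]+-[0-9]+-[^[:space:]]+-W'
}

# Kill up to N workflows (default 100) that are currently in PREP.
# Usage: kill_prep_workflows [N]
kill_prep_workflows() {
  oozie jobs -oozie "$OOZIE_URL" -jobtype wf -filter status=PREP -len "${1:-100}" |
    extract_wf_ids |
    while read -r id; do
      echo "killing $id"
      oozie job -oozie "$OOZIE_URL" -kill "$id"
    done
}
```

Running the function repeatedly until the listing comes back empty works through a large backlog in batches without touching the database.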