Reply
Contributor
Posts: 43
Registered: ‎08-20-2013
Accepted Solution

Oozie embedded derby to mysql, what is the best way to go about this?

I knew this could come someday and now it is the day .... Oozie is running very slow and I believe it is due to using the embedded derby db.

 

Anyone has experience of migrating the data that is already in derby to an external mysql? 

 

What do I loose if I just start with a clean mysql and resubmit all the oozie jobs?

Posts: 1,572
Kudos: 295
Solutions: 241
Registered: ‎07-31-2013

Re: Oozie embedded derby to mysql, what is the best way to go about this?

The only thing you'll lose out if you do the fresh approach is the
history, but not the functionality/correctness of the newly added
jobs. I'd pick using the fresh mysql approach if it is an option,
cause delving down into extracting data from Apache Derby DB for
moving it onto MySQL is a tad more painful operation.

Cloudera Employee
Posts: 85
Registered: ‎07-31-2013

Re: Oozie embedded derby to mysql, what is the best way to go about this?

[ Edited ]

Hey,

 

Unfortunately this is something I've never been able to get working.  The problem is Oozie stores the workflows in the DB as blobs, as a result, that makes migration very complex.  The data in Oozie is the status of all the past workflows that have not been purged by the purge process and then info on all the current running oozie jobs.  So if you resubmit the jobs, you will just lose status information about old jobs.

 

If you can afford to lose the historical data and then resubmit the jobs, I would strongly recommend going that route.  If not, you can take a look at Razor SQL.  It might be able to do it, however, it's not free, but there is a free trial for a trial run.

 

Hope this helps.

Cloudera Employee
Posts: 85
Registered: ‎07-31-2013

Re: Oozie embedded derby to mysql, what is the best way to go about this?

These steps with Razor SQL will get you the historical data, but you would still need to resubmit the jobs:

 

1. Stop Oozie. 
2. Backup the original Derby DB file somewhere. 
Copy /var/lib/oozie/data/ to a backup location. 
3. Gather the info for connecting to the Derby DB. On the Oozie server run the following grep command. It will give you all the DB connect info: 
grep -A1 JPA "/var/run/cloudera-scm-agent/process/`ls -alrt /var/run/cloudera-scm-agent/process/ | grep OOZIE | tail -1 | awk '{print $9}'`/oozie-site.xml" 
<name>oozie.service.JPAService.jdbc.driver</name> 
<value>org.apache.derby.jdbc.EmbeddedDriver</value> 
<name>oozie.service.JPAService.jdbc.username</name> 
<value>sa</value> 
<name>oozie.service.JPAService.jdbc.password</name> 
<value></value> 
<name>oozie.service.JPAService.jdbc.url</name> 
<value>jdbc:derby:/var/lib/oozie/data;create=true</value> 
4. Install RazorSQL 
5. Make a copy of your of the Derby DB files for Oozie in the path mentioned in step 2 above and copy to the host with RazorSQL. 
6. From RazorSQL do the following: 
Menu/Option 
Action/Value 
a. Connections 
Add Connection Profile 
b. Derby 
-(continue) 
c. Profile Name: Oozie DB 
d. Database Directory 
Point to the directory with the Oozie DB copy mentioned in step 5. 
e. Click connect 
f. DB Tools 
Export Data 
g. Check Multiple tables 
h. Enter schema as "SA" 
i. Generate SQL Statements 
j. Generate SQL INSERT statements 
k. Do not export the DDLS!!! 
l. Export to single file, \ for escape single quotes, <SEMI-COLON> SQL statement separator 
m. Select a filename (i.e. oozie.sql), then Save 
8. Edit the resulting oozie.sql and replace "SA." with the name of the DB in your new database. 
9. Verify the oozie.sql looks good. 
10. In CM, reconfigure Oozie to point to the new DB. 
11. In CM, run "Create Database" from the Actions drop down within the Oozie service. 
12. Import the oozie.sql or oozie-processed.sql above.

Contributor
Posts: 43
Registered: ‎08-20-2013

Re: Oozie embedded derby to mysql, what is the best way to go about this?

Thanks for all your replies.  I think I will go with a fresh new mysql database, the oozie history isn't that important to me. 

 

Just one more question ...... the workflows and coordinators configured via hue ... this info is stored in the Hue database right?

Cloudera Employee
Posts: 85
Registered: ‎07-31-2013

Re: Oozie embedded derby to mysql, what is the best way to go about this?

No problem!

 

The workflow and coordinator definitions are in the Hue DB and the xml for them are in HDFS, just none of the status history, that Hue pulls from Oozie.  Everything you need to resubmit the coordinators and workflows is in the Hue DB and HDFS and clearing the Oozie DB won't cause any problems there.

 

Thanks

Chris

New Contributor
Posts: 2
Registered: ‎01-16-2018

Re: Oozie embedded derby to mysql, what is the best way to go about this?

Hi @cconner

 

I've connected to the Hue database on Mysql.  I see all the the tables prefixed with oozie_ however I do not see any meaninful data in these tables.  Can you explain where in this DB schema the workflow definitions are stored?

 

Thanks

 

Shak

Announcements