Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Migrating SQL or PLSQL language to Pig and Hive

Solved Go to solution

Migrating SQL or PLSQL language to Pig and Hive

New Contributor

Can someone provide a rough order of magnitude/level-of-effort for converting or migrating PLSQL language (5,000-10,000 lines of code need to be migrated to Pig and Hive)?

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Migrating SQL or PLSQL language to Pig and Hive

@Steve Shilling

You may want to look into this

It's not easy task to estimate this ...I would say 1 month (Just picked up random number)

4 REPLIES 4

Re: Migrating SQL or PLSQL language to Pig and Hive

@Steve Shilling

You may want to look into this

It's not easy task to estimate this ...I would say 1 month (Just picked up random number)

Re: Migrating SQL or PLSQL language to Pig and Hive

There is also HPL/SQL.

http://www.hplsql.org/doc

It was incorporated into the Hive 2.0 release (Feb 16th) and thus will be automatically included in the next release of HDP that includes Hive 2.0.

It can be very easily added to an existing HDP install as well.

It is an implementation of stored procs, control flow logic, temporary variables, error checking, on-the-fly SQL syntax conversion to HiveQL, RDBMS SQL syntax compatibility, etc. It brings almost all the functionality of DB2, SQL Server, Oracle stored procs and SQL syntax into Hive. An added bonus is that you can mix & match the syntax of SQL Server T-SQL, Oracle PLSQL and DB2. It is slick.

For example - here is a sample script that runs in HPL/SQL (for real):

create procedure spLoadSalesFact (LoadMonth char(10)) 
BEGIN
   SET plhql.onerror = seterror; 

   print 'Beginning execution...'; 
   print '*' || LoadMonth || '*'; 

   /* Fail if invalid Input Parameter 
   set LoadMonth = Upper(LoadMonth); 
   if (LoadMonth not in ('LAST', 'CURRENT'))
      BEGIN
      print 'Bad Input Paramter'; 
      RETURN -1;
      END
    */
   /* Load Prior Month */
   IF (LoadMonth = 'LAST') 
      BEGIN
      print 'Inserting Last Month''s Data...'; 
      insert into table saleslineitem 
      select * from saleslineitem 
      where TransactionDate >= '2011-06-01' 
        and transactionDate < '2011-07-01';
      if SQLCODE <> 0 
         RETURN -2;
      END
   /* Load Current Month */
   ELSE IF (LoadMonth = 'CURRENT') 
      BEGIN
      print 'Inserting Current month''s data...'; 
      insert into table saleslineitem 
      select * from saleslineitem 
      where TransactionDate >= '2011-07-01' 
        and transactionDate < '2011-08-01';
      if SQLCODE <> 0 
         RETURN -3;
      END
   PRINT 'Execution Completed...'; 
END
/* Stored Proc invocation */
-- CALL spLoadSalesFact ('GARBAGE'); 
-- CALL spLoadSalesFact ('LAST');
CALL spLoadSalesFact ('CURRENT');   

Re: Migrating SQL or PLSQL language to Pig and Hive

Are there any performance improvements of not creating custom UDF function ? I have currently some text-analytics functions, for which I can either create UDF jar files or use HPLSQL.@bpreachuk Are there any performance issues that you faced on using HPLSQL ?

Re: Migrating SQL or PLSQL language to Pig and Hive

Hi @tanmoy. I would recommend you go the custom route - create UDF jars for Hive. That way you can use them anywhere you use Hive code. HPL/SQL is still not a GA feature, and in my opinion HPL/SQL use a little bit more time to "bake" and become more robust.

Don't have an account?
Coming from Hortonworks? Activate your account here