Support Questions
Find answers, ask questions, and share your expertise

Store Pig output to MSSQL using DBStorage

Store Pig output to MSSQL using DBStorage

Explorer

Hi,

I want to store Pig output to mssql server using DBStorage.I tried with below script,

REGISTER /usr/hdp/2.5.0.0-1245/pig/lib/piggybank.jar;
REGISTER /usr/hdp/2.5.0.0-1245/pig/lib/sqljdbc41.jar;
A= LOAD '/user/Employee.csv' USING PigStorage(',') ;
G = FOREACH A GENERATE $0 as id:int,$1 as name:chararray,$2 as email:chararray;
STORE G INTO 'emp' USING org.apache.pig.piggybank.storage.DBStorage('com.microsoft.sqlserver.jdbc.SQLServerDriver', 'jdbc:sqlserver://xxx.x.xx.xx:1433;databaseName=test', 'username', 'password', 'INSERT INTO emp (id,name,email) VALUES (G.id,G.name,G.email)');

But it throws following error :

Error: Failure while running task:org.apache.pig.backend.executionengine.ExecException: ERROR 2135: Received error from store function.java.lang.RuntimeException: JDBC error at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:148) at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:376) at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:241) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422)
Input(s):
Failed to read data from "/user/Employee.csv"
Output(s):
Failed to produce result in "hdfs://xxxxxxxxxx:8020/user/root/emp"
4 REPLIES 4
Highlighted

Re: Store Pig output to MSSQL using DBStorage

Expert Contributor

Hi @priyal patel,

Can you check whether you able to dump the data which you are reading before storing it?

  1. A= LOAD '/user/Employee.csv' USING PigStorage(',')
  2. dump A;
  3. G = FOREACH A GENERATE $0 as id:int,$1 as name:chararray,$2 as email:chararray;
  4. dump B;

-Shubham

Highlighted

Re: Store Pig output to MSSQL using DBStorage

Explorer

@schhabra

Hi Shubham,

I am able to dump the data successfully but i am getting error with STORE function.

I tried below script :

STORE G INTO 'emp' USING org.apache.pig.piggybank.storage.DBStorage('com.microsoft.sqlserver.jdbc.SQLServerDriver', 'jdbc:sqlserver://xxx.x.xx.xx:1433;databaseName=test', 'username', 'password', 'INSERT INTO emp (id,name,email) VALUES (?,?,?)');

then it store the exact count of my input file with NULL values in mssql but i am getting issue with fetching values from a relation in Pig.

I tried below script but it didn't work.

 STORE G INTO 'emp' USING org.apache.pig.piggybank.storage.DBStorage('com.microsoft.sqlserver.jdbc.SQLServerDriver','jdbc:sqlserver://xxx.x.xx.xx:1433;databaseName=test','username','password','INSERT INTO emp (id,name,email) VALUES (G.id,G.name,G.email)');
Highlighted

Re: Store Pig output to MSSQL using DBStorage

STORE G INTO 'emp' USING org.apache.pig.piggybank.storage.DBStorage('com.microsoft.sqlserver.jdbc.SQLServerDriver', 'jdbc:sqlserver://xxx.x.xx.xx:1433;databaseName=test', 'username', 'password', 'INSERT INTO emp (id,name,email) VALUES (?,?,?)');

This should work.

Can you please check the data type of the columns at MSSQL Server side?

Highlighted

Re: Store Pig output to MSSQL using DBStorage

Explorer
@Rahul Soni

Hi, I already tried the script which you sent but it insert NULL values.I mention that in my previous comment.

Datatype of columns in Pig is (id:int,name:chararray,salary:float) and in MSSQL is (id int, name varchar, salary float).I tried with different datatype also but It insert only null values.I am not able to fetch the values from Pig relation G which I mention on my question.