Pig Incompatable schema


Contributor

My input file is below.

a.txt

aaa.kyl,data,data
bbb.kkk,data,data
cccccc.hj,data,data
qa.dff,data,data

A = LOAD '/pigdata/a.txt' USING PigStorage(',') AS(a1:chararray,a2:chararray,a3:chararray);

How do I resolve the error below, and what causes it?

Error:
C = FOREACH A GENERATE STRSPLIT(a1,'\\u002E') as (a1:chararray, a1of1:chararray),a2,a3;
2017-02-03 00:45:42,803 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1031: Incompatable schema: left is "a1:chararray,a1of1:chararray", right is ":tuple()"

Accepted Solutions

Re: Pig Incompatable schema

Expert Contributor

You need to FLATTEN the result of STRSPLIT before you can project its fields.

C = FOREACH A GENERATE FLATTEN(STRSPLIT(a1,'\\u002E')) as (a1:chararray, a1of1:chararray),a2,a3;   
Re: Pig Incompatable schema

Contributor

Thanks for the input. What is the problem with my relation C?

STRSPLIT generates a tuple as output. Here it consists of two fields.

(a1:chararray, a1of1:chararray) is also a tuple, since it is enclosed in parentheses and also consists of two fields.

Re: Pig Incompatable schema

Contributor

Any input on my question above?

Re: Pig Incompatable schema

Expert Contributor

@vamsi valiveti

The code you wrote produces a schema like this:

((a1),(a1of1)),(a2),(a3)

Your projection won't work against a schema like this, because Pig still treats the first two fields, "((a1),(a1of1))", as a single field. You need to use FLATTEN in this case to turn it into two separate columns.

That's exactly what my code does. I tested your data with my code, and it works perfectly.

Re: Pig Incompatable schema

The output of STRSPLIT is a tuple, so if you want to provide its schema you need to explicitly say for example "t1:tuple", like below, and after that you can refer to it as t1.a1 and t1.a1of1. With FLATTEN you get rid of the tuple. So you can choose which way to declare it.

grunt> b = FOREACH a generate STRSPLIT(a1,'\\u002E') as (t1:tuple(a1:chararray,a1of1:chararray)), a2, a3;
grunt> describe b; 
b: {t1: (a1: chararray,a1of1: chararray),a2: chararray,a3: chararray}
grunt> c = foreach b generate t1.a1, a3;
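
For comparison, here is a minimal sketch that puts the two approaches from this thread side by side. It assumes the sample file from the question is at /pigdata/a.txt; the alias names B1 and C1 are only illustrative and do not come from the original posts.

```pig
-- Assumes the sample file from the question is at /pigdata/a.txt.
A = LOAD '/pigdata/a.txt' USING PigStorage(',')
    AS (a1:chararray, a2:chararray, a3:chararray);

-- Option 1: keep the STRSPLIT result as one tuple-typed field named t1.
B = FOREACH A GENERATE
        STRSPLIT(a1, '\\u002E') AS (t1:tuple(a1:chararray, a1of1:chararray)),
        a2, a3;
-- Fields inside the tuple are reached through the tuple name.
B1 = FOREACH B GENERATE t1.a1, t1.a1of1, a3;

-- Option 2: FLATTEN the tuple so its two fields become top-level columns.
C = FOREACH A GENERATE
        FLATTEN(STRSPLIT(a1, '\\u002E')) AS (a1:chararray, a1of1:chararray),
        a2, a3;
-- After FLATTEN the fields are referenced directly.
C1 = FOREACH C GENERATE a1, a1of1, a3;

-- Compare the two schemas.
DESCRIBE B;
DESCRIBE C;
```

The original statement failed because its AS clause named two top-level fields, while the unflattened STRSPLIT expression yields a single tuple field. Either option above makes the declared schema match the shape of the expression.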

Re: Pig Incompatable schema

Hi @vamsi valiveti

I noticed that you ask a lot of questions but haven't accepted many answers in the last few months. So, if you don't mind, let me say a few words about how this works. Both questions and answers can be up-voted if others find them helpful. Also, for each question one answer can be "accepted", usually by the user who asked the question, if that answer resolved the question or greatly helped the user resolve the issue on their own. Accepted answers help HCC better manage answered questions, and they also tell others who find the question later that the accepted answer works and can be trusted. So it would be great if you accepted one answer to this question, and to some of the other questions you asked recently. Many thanks!

Re: Pig Incompatable schema

Contributor

Thanks for the comments. I will definitely do so, starting with this post.
