Filtering {()} records from a relation

Expert Contributor

Let us say I have the relation x below. I can already filter out the {} records using size==0 or IsEmpty($1).

Urman,{(100)}
Gietz,{()}
LAST_NAME,{}

Clarification:
How do I filter the records containing {()}? I need only Gietz,{()} from relation x.
1 ACCEPTED SOLUTION

Master Mentor

@vamsi valiveti you need to escape the parentheses with double backslashes, because ( and ) are regex metacharacters and the pattern must match the whole field.

grunt> a = load 'data' using PigStorage(',');
grunt> b = filter a by ($1 matches '{\\(\\)}');
2017-02-01 02:45:07,159 [main] WARN  org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s).
grunt> dump b;

output

Output(s):
Successfully stored 1 records (17 bytes) in: "hdfs://sandbox.hortonworks.com:8020/tmp/temp-1129941617/tmp-1428622787"


2017-02-01 02:49:30,801 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2017-02-01 02:49:30,811 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2017-02-01 02:49:30,811 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(Gietz,{()})
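
The IMPLICIT_CAST_TO_CHARARRAY warning in the session above appears because $1 has no declared type, so Pig casts it to chararray before applying matches. A possible variant, as a minimal sketch only: declare the second column as chararray at load time and escape the braces as well as the parentheses. The field names name and val are hypothetical and not from the original post, and this variant has not been run against the poster's data.

```pig
-- Sketch: same filter as above, with an explicit schema.
-- 'name' and 'val' are assumed field names chosen for illustration.
a = LOAD 'data' USING PigStorage(',') AS (name:chararray, val:chararray);

-- matches must match the entire field, so the pattern spells out {()} in full.
-- Each regex metacharacter is escaped with a double backslash inside the Pig string.
b = FILTER a BY val MATCHES '\\{\\(\\)\\}';

DUMP b;
```

Because val is loaded as a plain string here, the comparison is against the literal text {()} in the file, not against an actual Pig bag.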
