<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Pig data load problem in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110534#M46616</link>
    <description>&lt;P&gt;Below is my source in HDFS:
/abc/ &lt;/P&gt;&lt;P&gt;Hadoop is an open source&lt;/P&gt;&lt;P&gt;
MR is  to process data in hadoop. &lt;/P&gt;&lt;P&gt;Hadoop has a good eco system.&lt;/P&gt;&lt;P&gt;I want to run the filter below, but the load command seems to be unsuccessful. Could anybody provide input on the load statement?&lt;/P&gt;&lt;PRE&gt;filter_records = FILTER ya BY $0 MATCHES '.*Hadoop.*';

grunt&amp;gt; ya = load '/abc/' USING TextLoader();
2016-11-17 21:00:14,470 [main] WARN  org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s).
grunt&amp;gt; yab = load '/abc/';
2016-11-17 21:00:50,199 [main] WARN  org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s).
grunt&amp;gt; 
&lt;/PRE&gt;</description>
    <pubDate>Fri, 18 Nov 2016 17:44:05 GMT</pubDate>
    <dc:creator>vamsi123</dc:creator>
    <dc:date>2016-11-18T17:44:05Z</dc:date>
    <item>
      <title>Pig data load problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110534#M46616</link>
      <description>&lt;P&gt;Below is my source in HDFS:
/abc/ &lt;/P&gt;&lt;P&gt;Hadoop is an open source&lt;/P&gt;&lt;P&gt;
MR is  to process data in hadoop. &lt;/P&gt;&lt;P&gt;Hadoop has a good eco system.&lt;/P&gt;&lt;P&gt;I want to run the filter below, but the load command seems to be unsuccessful. Could anybody provide input on the load statement?&lt;/P&gt;&lt;PRE&gt;filter_records = FILTER ya BY $0 MATCHES '.*Hadoop.*';

grunt&amp;gt; ya = load '/abc/' USING TextLoader();
2016-11-17 21:00:14,470 [main] WARN  org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s).
grunt&amp;gt; yab = load '/abc/';
2016-11-17 21:00:50,199 [main] WARN  org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s).
grunt&amp;gt; 
&lt;/PRE&gt;</description>
      <pubDate>Fri, 18 Nov 2016 17:44:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110534#M46616</guid>
      <dc:creator>vamsi123</dc:creator>
      <dc:date>2016-11-18T17:44:05Z</dc:date>
    </item>
    <item>
      <title>Re: Pig data load problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110535#M46617</link>
      <description>&lt;P&gt;From the information given, there is no load problem, just a warning that the loaded data is being implicitly cast to chararray (string) during the filter operation.&lt;/P&gt;&lt;P&gt;A few points:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;If you do not specify a type on load, the default is bytearray. When you filter with MATCHES, you are treating the field as a string (chararray type in Pig), so Pig converts the bytearray to chararray during this operation.&lt;/LI&gt;&lt;LI&gt;TextLoader() loads each line of input as a single field; no delimiters are applied.&lt;/LI&gt;&lt;LI&gt;If you want to load a delimited file (fields, e.g. a CSV), use PigStorage(). You can specify the delimiter, e.g. PigStorage(','); if none is specified, it uses the default, which is tab.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;&lt;A href="http://pig.apache.org/docs/r0.16.0/basic.html#Data+Types+and+More" target="_blank"&gt;http://pig.apache.org/docs/r0.16.0/basic.html#Data+Types+and+More&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;Not sure if that is what you were looking for. If so, let me know by accepting the answer; if not, let me know more specifics.&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 18 Nov 2016 22:16:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110535#M46617</guid>
      <dc:creator>gkeys</dc:creator>
      <dc:date>2016-11-18T22:16:19Z</dc:date>
    </item>
    <item>
      <title>Re: Pig data load problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110536#M46618</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/11288/gkeys.html" nodeid="11288"&gt;@Greg Keys&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Thanks for the input; it is always appreciated. One clarification:&lt;/P&gt;&lt;P&gt;In that case I would expect the warning on the filter statement below, but I got it on the load statement. The load statement does not convert bytearray to chararray, so why does the warning appear there?&lt;/P&gt;&lt;P&gt;filter_records = FILTER ya BY $0 MATCHES '.*Hadoop.*';&lt;/P&gt;</description>
      <pubDate>Sat, 19 Nov 2016 01:20:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110536#M46618</guid>
      <dc:creator>vamsi123</dc:creator>
      <dc:date>2016-11-19T01:20:40Z</dc:date>
    </item>
    <item>
      <title>Re: Pig data load problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110537#M46619</link>
      <description>&lt;P&gt;Yes, I saw that. In my environment I got it only during the filter and not the load.&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;What Pig version are you using?&lt;/LI&gt;&lt;LI&gt;What happens when you load with: USING PigStorage() as (str:chararray);&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;In any case, it is just a warning, there so that the implicit cast does not happen invisibly behind the scenes.&lt;/P&gt;</description>
      <pubDate>Sat, 19 Nov 2016 01:55:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110537#M46619</guid>
      <dc:creator>gkeys</dc:creator>
      <dc:date>2016-11-19T01:55:39Z</dc:date>
    </item>
    <item>
      <title>Re: Pig data load problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110538#M46620</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/11288/gkeys.html" nodeid="11288"&gt;@Greg Keys&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;After loading with USING PigStorage() as (str:chararray); the issue is resolved. Thanks for your valuable time.&lt;/P&gt;</description>
      <pubDate>Mon, 21 Nov 2016 18:29:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110538#M46620</guid>
      <dc:creator>vamsi123</dc:creator>
      <dc:date>2016-11-21T18:29:51Z</dc:date>
    </item>
    <item>
      <title>Re: Pig data load problem</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110539#M46621</link>
      <description>&lt;P&gt;Glad it worked out, Vamsi &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 21 Nov 2016 20:10:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-data-load-problem/m-p/110539#M46621</guid>
      <dc:creator>gkeys</dc:creator>
      <dc:date>2016-11-21T20:10:05Z</dc:date>
    </item>
  </channel>
</rss>

