<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Pig Error : ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172988#M50232</link>
    <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;I'm trying to process a CSV dataset from &lt;A target="_blank" href="https://vincentarelbundock.github.io/Rdatasets/datasets.html"&gt;this link&lt;/A&gt; in Pig, and I'm getting the error below. There seems to be a delimiter problem in the file: if I create the same file manually, the data loads properly.&lt;/P&gt;&lt;P&gt;Pig Script:&lt;/P&gt;&lt;P&gt;A = LOAD 's3a://byr-heor-test/dev1/BJsales.csv' using PigStorage(',') as (Num:Int,time:int,BJsales:int)&lt;/P&gt;&lt;P&gt;Output:&lt;/P&gt;&lt;PRE&gt;..
..
(149,149,262)
(150,150,262)
2016-12-27 09:31:35,632 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered " &amp;lt;PATH&amp;gt; "2 "" at line 3, column 8.
Was expecting one of:


&lt;/PRE&gt;</description>
    <pubDate>Tue, 27 Dec 2016 22:41:55 GMT</pubDate>
    <dc:creator>kumarvaibhav199</dc:creator>
    <dc:date>2016-12-27T22:41:55Z</dc:date>
    <item>
      <title>Pig Error : ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172988#M50232</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;I'm trying to process a CSV dataset from &lt;A target="_blank" href="https://vincentarelbundock.github.io/Rdatasets/datasets.html"&gt;this link&lt;/A&gt; in Pig, and I'm getting the error below. There seems to be a delimiter problem in the file: if I create the same file manually, the data loads properly.&lt;/P&gt;&lt;P&gt;Pig Script:&lt;/P&gt;&lt;P&gt;A = LOAD 's3a://byr-heor-test/dev1/BJsales.csv' using PigStorage(',') as (Num:Int,time:int,BJsales:int)&lt;/P&gt;&lt;P&gt;Output:&lt;/P&gt;&lt;PRE&gt;..
..
(149,149,262)
(150,150,262)
2016-12-27 09:31:35,632 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Encountered " &amp;lt;PATH&amp;gt; "2 "" at line 3, column 8.
Was expecting one of:


&lt;/PRE&gt;</description>
      <pubDate>Tue, 27 Dec 2016 22:41:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172988#M50232</guid>
      <dc:creator>kumarvaibhav199</dc:creator>
      <dc:date>2016-12-27T22:41:55Z</dc:date>
    </item>
    <item>
      <title>Re: Pig Error : ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172989#M50233</link>
      <description>&lt;P&gt;Looking at the BJsales.csv file, it seems the first column is a string type. Make sure to use the proper data types. Also remove any empty rows at the end of the file.&lt;/P&gt;</description>
      <pubDate>Tue, 27 Dec 2016 23:06:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172989#M50233</guid>
      <dc:creator>mpandit</dc:creator>
      <dc:date>2016-12-27T23:06:05Z</dc:date>
    </item>
    <item>
      <title>Re: Pig Error : ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172990#M50234</link>
      <description>&lt;P&gt;Every field here is an integer or a float, so I declared them all as int.&lt;/P&gt;</description>
      <pubDate>Tue, 27 Dec 2016 23:10:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172990#M50234</guid>
      <dc:creator>kumarvaibhav199</dc:creator>
      <dc:date>2016-12-27T23:10:56Z</dc:date>
    </item>
    <item>
      <title>Re: Pig Error : ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172991#M50235</link>
      <description>&lt;P&gt;To add to &lt;A rel="user" href="https://community.cloudera.com/users/9842/mpandit.html" nodeid="9842" target="_blank"&gt;@milind pandit&lt;/A&gt;'s answer: I tried opening the AirPassengers file, and the first column is enclosed in quotes. The same is true of BJsales.csv.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="10833-hcc.png" style="width: 810px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/19980i31F989C66DBB81E6/image-size/medium?v=v2&amp;amp;px=400" role="button" title="10833-hcc.png" alt="10833-hcc.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 10:30:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172991#M50235</guid>
      <dc:creator>arunak</dc:creator>
      <dc:date>2019-08-18T10:30:36Z</dc:date>
    </item>
    <item>
      <title>Re: Pig Error : ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172992#M50236</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11938/kumarvaibhav1992.html" nodeid="11938"&gt;@Vaibhav Kumar&lt;/A&gt;
&lt;/P&gt;&lt;P&gt;The recommendations from my colleagues are valid: your CSV files have strings in their header rows. You can certainly filter the header out by some known value, but there is also a more capable CSV loader for Pig called CSVExcelStorage. It is part of the Piggybank library, which comes bundled with HDP, hence the register command below. You can pass different control parameters to it. The Mortar blog is an excellent source of information on working with Pig: &lt;A href="http://help.mortardata.com/technologies/pig/csv" target="_blank"&gt;http://help.mortardata.com/technologies/pig/csv&lt;/A&gt;.&lt;/P&gt;&lt;PRE&gt;grunt&amp;gt; register /usr/hdp/current/pig-client/piggybank.jar;
grunt&amp;gt; a = load 'BJsales.csv' using org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'NO_MULTILINE', 'NOCHANGE', 'SKIP_INPUT_HEADER') as (Num:Int,time:int,BJsales:float);
grunt&amp;gt; describe a;
a: {Num: int,time: int,BJsales: float}
grunt&amp;gt; b = limit a 5;
grunt&amp;gt; dump b;
&lt;/PRE&gt;&lt;P&gt;Output:&lt;/P&gt;&lt;PRE&gt;(1,1,200.1)
(2,2,199.5)
(3,3,199.4)
(4,4,198.9)
(5,5,199.0)
&lt;/PRE&gt;&lt;P&gt;Notice that I am not filtering any relation. I'm telling the loader to skip the header outright, which saves a few keystrokes and doesn't waste any cycles processing anything extra.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Feb 2017 11:21:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Error-ERROR-org-apache-pig-tools-grunt-Grunt-ERROR-1000/m-p/172992#M50236</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2017-02-01T11:21:19Z</dc:date>
    </item>
  </channel>
</rss>

