PigStorage in mapreduce mode

Re: PigStorage in mapreduce mode

Expert Contributor

This is odd. When I run

  • grunt> b = limit sourceData 5;
  • grunt> dump b;

it works for me as well. But when I don't limit the result set and just execute dump sourceData; I get the same error.

    Re: PigStorage in mapreduce mode

    Mentor

    I think it crashed on me when I dumped the whole dataset; there might be a problem with your dataset further down. @John Smith

    Re: PigStorage in mapreduce mode

    Expert Contributor

    I'm 100% sure there is no problem with the input dataset. I kept only the first 5 records in the file and it's the same issue.

    Re: PigStorage in mapreduce mode

    Mentor

    @John Smith you got me there; as you can see, my attempt with your file worked. Alternatively, take a look at CSVExcelStorage, as it has more capabilities than PigStorage.

    I am not saying this is the case, and I don't know what's wrong, but here's a note. I'm not sure how valid it still is, as it has been around for a while and doesn't mention which version of Pig was used.

    Limitations

    PigStorage is an extremely simple loader that does not handle special cases such as embedded delimiters or escaped control characters; it will split on every instance of the delimiter regardless of context. For this reason, when loading a CSV file it is recommended to use CSVExcelStorage rather than PigStorage with a comma delimiter.
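To make the limitation in that note concrete, here is a minimal sketch in Python (not Pig) of the difference between splitting on every delimiter and quote-aware CSV parsing. The sample record is made up for illustration and is not taken from the dataset in this thread; the sketch only mimics the behaviour the note describes and is not how either loader is implemented.

```python
import csv
import io

# Hypothetical sample record: the second field contains an embedded comma.
line = '1,"Smith, John",42'

# Naive split on every delimiter, regardless of quoting.
# This is the behaviour the note attributes to a simple delimiter-based loader.
naive_fields = line.split(",")

# Quote-aware parsing, similar in spirit to what a CSV-aware loader does.
quoted_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields), naive_fields)
print(len(quoted_fields), quoted_fields)
```

The naive split yields four fields because the quoted name is cut in two, while the quote-aware parser keeps it as one field and returns three.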

    Re: PigStorage in mapreduce mode

    Expert Contributor

    Well, CSVExcelStorage doesn't work either...

    2016-02-05 16:01:28,917 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
    2016-02-05 16:01:29,745 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias sourceData
    Details at logfile: /home/hdfs/pig_1454687855333.log
    grunt>

    I'm confused... what is it?

    Re: PigStorage in mapreduce mode

    Mentor

    @John Smith if you identified another bug, I'm going to buy a lottery ticket.

    Re: PigStorage in mapreduce mode

    New Contributor

    As I commented above, I cannot reproduce the error. The error you posted is too general. Can you go to the Hadoop Web UI and get the detailed message?
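Besides the web UI, the container logs can usually be pulled from the command line with `yarn logs -applicationId <id>`, assuming log aggregation is enabled on the cluster and the application has finished. The sketch below only extracts an application ID from a diagnostics message and builds that command; it does not run it. The sample text, host name, and helper names are hypothetical and used purely for illustration.

```python
import re
from typing import List, Optional

# YARN application IDs look like application_<cluster timestamp>_<sequence>.
APP_ID_PATTERN = re.compile(r"application_\d+_\d+")


def find_application_id(text: str) -> Optional[str]:
    """Return the first YARN application ID found in text, or None."""
    match = APP_ID_PATTERN.search(text)
    return match.group(0) if match else None


def yarn_logs_command(app_id: str) -> List[str]:
    """Build (but do not execute) the command that fetches aggregated logs."""
    return ["yarn", "logs", "-applicationId", app_id]


# Hypothetical diagnostics text. Note that the ID can be fused with the next
# word, which the digit-only pattern handles.
sample = (
    "For more detailed output, check application tracking page:"
    "http://example-host:8088/cluster/app/application_1400000000000_0001Then,"
)

app_id = find_application_id(sample)
if app_id is not None:
    print(" ".join(yarn_logs_command(app_id)))
```

The resulting command can then be run on a node with the YARN client configured, and its output searched for the first exception raised by the failing container.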

    Re: PigStorage in mapreduce mode

    Expert Contributor

    It's strange that you can't reproduce the error. Does it work for you?

    Application application_1454923438220_0007 failed 2 times due to AM Container for appattempt_1454923438220_0007_000002 exited with  exitCode: 1
    For more detailed output, check application tracking page:http://sandbox.hortonworks.com:8088/cluster/app/application_1454923438220_0007Then, click on links to logs of each attempt.
    Diagnostics: Exception from container-launch.
    Container id: container_e10_1454923438220_0007_02_000001
    Exit code: 1
    Stack trace: ExitCodeException exitCode=1:
    	at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
    	at org.apache.hadoop.util.Shell.run(Shell.java:487)
    	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
    	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    	at java.lang.Thread.run(Thread.java:745)

    Container exited with a non-zero exit code 1
    Failing this attempt. Failing the application.