<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Pig ERROR 1066 question in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-ERROR-1066/m-p/109215#M33658</link>
    <description>Pig ERROR 1066 in Archives of Support Questions (Read Only)</description>
    <pubDate>Sat, 02 Jul 2016 03:02:38 GMT</pubDate>
    <dc:creator>doug_mengistu</dc:creator>
    <dc:date>2016-07-02T03:02:38Z</dc:date>
    <item>
      <title>Pig ERROR 1066</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-ERROR-1066/m-p/109215#M33658</link>
      <description>&lt;P&gt;&lt;A href="https://community.cloudera.com/legacyfs/online/attachments/5421-employeeinfo.txt"&gt;employeeinfo.txt&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.cloudera.com/legacyfs/online/attachments/5422-salaryinfo.txt"&gt;salaryinfo.txt&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I don't know why, but Pig fails with ERROR 1066 every time I run this script. The data is attached, and this is the script I'm running.&lt;/P&gt;&lt;P&gt;Can anyone help?&lt;/P&gt;&lt;P&gt;a = load '/pigsample/Salaryinfo.csv' USING PigStorage(','); 
&lt;/P&gt;&lt;P&gt;b = load '/pigsample/Employeeinfo.csv' USING PigStorage(','); &lt;/P&gt;&lt;P&gt;c = filter b by $4 =='Male'; &lt;/P&gt;&lt;P&gt;d = foreach c generate $0 as id:int, $1 as firstname:chararray, $2 as lastname:chararray, $4 as gender:chararray, $6 as city:chararray , $7 as country:chararray, $8 as countrycode:chararray; &lt;/P&gt;&lt;P&gt;e = foreach a generate $0 as iD:int, $1 as firstname:chararray, $2 as lastname:chararray, $3 as salary:double, ToDate($4, 'MM/dd/yyyy') as dateofhire, $5 as company:chararray; &lt;/P&gt;&lt;P&gt;f = join d by id, e by iD; &lt;/P&gt;&lt;P&gt;g = foreach f generate f.d::firstname as firstname; &lt;/P&gt;&lt;P&gt;dump g&lt;/P&gt;&lt;P&gt;--------------------------------------------**********************************************************----------------------------------------------------&lt;/P&gt;&lt;P&gt;this is the input and output I get from the shell with the describe and all&lt;/P&gt;&lt;P&gt;grunt&amp;gt; a = load '/pigsample/Salaryinfo.csv' USING PigStorage(','); &lt;/P&gt;&lt;P&gt;grunt&amp;gt; describe a Schema for a unknown. &lt;/P&gt;&lt;P&gt;grunt&amp;gt; b = load '/pigsample/Employeeinfo.csv' USING PigStorage(','); &lt;/P&gt;&lt;P&gt;grunt&amp;gt; describe b Schema for b unknown. &lt;/P&gt;&lt;P&gt;grunt&amp;gt; c = filter b by $4 =='Male'; &lt;/P&gt;&lt;P&gt;2016-07-01 19:02:16,356 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s). &lt;/P&gt;&lt;P&gt;grunt&amp;gt; describe c &lt;/P&gt;&lt;P&gt;2016-07-01 19:02:21,611 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s). Schema for c unknown. 
&lt;/P&gt;&lt;P&gt;grunt&amp;gt; d = foreach c generate $0 as id:int, $1 as firstname:chararray, $2 as lastname:chararray, $4 as gender:chararray, $6 as city:chararray , $7 as country:chararray, $8 as countrycode:chararray; &lt;/P&gt;&lt;P&gt;2016-07-01 19:02:35,684 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s). &lt;/P&gt;&lt;P&gt;grunt&amp;gt; describe d &lt;/P&gt;&lt;P&gt;2016-07-01 19:02:40,638 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s). &lt;/P&gt;&lt;P&gt;d: {id: int,firstname: chararray,lastname: chararray,gender: chararray,city: chararray,country: chararray,countrycode: chararray} grunt&amp;gt; e = foreach a generate $0 as iD:int, $1 as firstname:chararray, $2 as lastname:chararray, $3 as salary:double, ToDate($4, 'MM/dd/yyyy') as dateofhire, $5 as company:chararray; &lt;/P&gt;&lt;P&gt;2016-07-01 19:44:03,703 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning USING_OVERLOADED_FUNCTION 1 time(s). 2016-07-01 19:44:03,703 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s). &lt;/P&gt;&lt;P&gt;grunt&amp;gt; describe e &lt;/P&gt;&lt;P&gt;2016-07-01 19:44:09,159 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning USING_OVERLOADED_FUNCTION 1 time(s). 2016-07-01 19:44:09,159 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 1 time(s). &lt;/P&gt;&lt;P&gt;e: {iD: int,firstname: chararray,lastname: chararray,salary: double,dateofhire: datetime,company: chararray} grunt&amp;gt; f = join d by id, e by iD; &lt;/P&gt;&lt;P&gt;2016-07-01 19:44:34,194 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning USING_OVERLOADED_FUNCTION 1 time(s). 
&lt;/P&gt;&lt;P&gt;2016-07-01 19:44:34,194 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s). &lt;/P&gt;&lt;P&gt;grunt&amp;gt; describe f &lt;/P&gt;&lt;P&gt;2016-07-01 19:44:38,955 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning USING_OVERLOADED_FUNCTION 1 time(s). 2016-07-01 19:44:38,955 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s). &lt;/P&gt;&lt;P&gt;f: {d::id: int,d::firstname: chararray,d::lastname: chararray,d::gender: chararray,d::city: chararray,d::country: chararray,d::countrycode: chararray,e::iD: int,e::firstname: chararray,e::lastname: chararray,e::salary: double,e::dateofhire: datetime,e::company: chararray} &lt;/P&gt;&lt;P&gt;grunt&amp;gt; g = foreach f generate f.d::firstname as firstname; &lt;/P&gt;&lt;P&gt;2016-07-01 19:45:03,037 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning USING_OVERLOADED_FUNCTION 1 time(s). 2016-07-01 19:45:03,037 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s). &lt;/P&gt;&lt;P&gt;grunt&amp;gt; describe g &lt;/P&gt;&lt;P&gt;2016-07-01 19:45:08,432 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning USING_OVERLOADED_FUNCTION 1 time(s). 2016-07-01 19:45:08,432 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s). &lt;/P&gt;&lt;P&gt;g: {firstname: chararray} &lt;/P&gt;&lt;P&gt;grunt&amp;gt; dump g &lt;/P&gt;&lt;P&gt;2016-07-01 19:45:13,698 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning USING_OVERLOADED_FUNCTION 1 time(s). &lt;/P&gt;&lt;P&gt;2016-07-01 19:45:13,698 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s). 
2016-07-01 19:45:13,725 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,FILTER 2016-07-01 19:45:13,773 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code. 2016-07-01 19:45:13,812 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]} &lt;/P&gt;&lt;P&gt;2016-07-01 19:45:13,950 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false 2016-07-01 19:45:13,965 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - number of input files: -1 2016-07-01 19:45:13,988 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage-&amp;gt;POForEach to POPackage(JoinPackager) 2016-07-01 19:45:13,997 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3 2016-07-01 19:45:13,997 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 1 map-only splittees. 2016-07-01 19:45:13,997 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 1 out of total 3 MR operators. 
2016-07-01 19:45:13,997 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 2 2016-07-01 19:45:14,155 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: &lt;A href="http://hwhdpm" target="_blank"&gt;http://hwhdpm&lt;/A&gt; 2016-07-01 19:45:14,163 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hwhdpmaster02.centralus.cloudapp.azure.com/10.0.1.5:8050 2016-07-01 19:45:14,376 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job 2016-07-01 19:45:14,382 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3 2016-07-01 19:45:14,385 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Reduce phase detected, estimating # of required reducers. 
2016-07-01 19:45:14,386 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator 2016-07-01 19:45:14,395 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=2500 2016-07-01 19:45:14,395 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1 2016-07-01 19:45:14,395 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process 2016-07-01 19:45:14,922 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/hdp/2.3.4.0-3485/pig/pig-0.15.0.2.3.4.0-3485-core-h2.jar to DistributedCache through /tmp/temp1278836613/tmp2008058395/pig-0.15.0.2.3.4.0-3485-core-h2.jar 2016-07-01 19:45:15,064 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/hdp/2.3.4.0-3485/pig/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp1278836613/tmp-244128717/automaton-1.11-8.jar 2016-07-01 19:45:15,193 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/hdp/2.3.4.0-3485/pig/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp1278836613/tmp-1145480432/antlr-runtime-3.4.jar 2016-07-01 19:45:15,339 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/usr/hdp/2.3.4.0-3485/hadoop-mapreduce/joda-time-2.9.1.jar to DistributedCache through /tmp/temp1278836613/tmp530457831/joda-time-2.9.1.jar 2016-07-01 19:45:15,386 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up multi store job 2016-07-01 19:45:15,394 
[main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code. 2016-07-01 19:45:15,395 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche 2016-07-01 19:45:15,395 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize [] 2016-07-01 19:45:15,498 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission. 2016-07-01 19:45:15,624 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: &lt;A href="http://hwhdpm" target="_blank"&gt;http://hwhdpm&lt;/A&gt; 2016-07-01 19:45:15,625 [JobControl] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hwhdpmaster02.centralus.cloudapp.azure.com/10.0.1.5:8050 2016-07-01 19:45:15,934 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String). 
2016-07-01 19:45:16,009 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-07-01 19:45:16,009 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2016-07-01 19:45:16,037 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-07-01 19:45:16,042 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-07-01 19:45:16,042 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2016-07-01 19:45:16,045 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-07-01 19:45:16,419 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:2
2016-07-01 19:45:16,667 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_1467387563416_0004
2016-07-01 19:45:16,839 [JobControl] INFO org.apache.hadoop.mapred.YARNRunner - Job jar is not present. Not adding any jar to the list of resources.
2016-07-01 19:45:17,136 [JobControl] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1467387563416_0004
2016-07-01 19:45:17,181 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: &lt;A href="http://hwhdpmaster02.c" target="_blank"&gt;http://hwhdpmaster02.c&lt;/A&gt;
2016-07-01 19:45:17,182 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1467387563416_0004
2016-07-01 19:45:17,182 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a,b,c,d,e,f
2016-07-01 19:45:17,182 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[1,4],e[5,4],f[6,4],b[2,4],c[3,4],d[4,4],f[6,4] C: R:
2016-07-01 19:45:17,197 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-07-01 19:45:17,198 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1467387563416_0004]
2016-07-01 19:45:46,336 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2016-07-01 19:45:46,336 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1467387563416_0004]
2016-07-01 19:45:47,346 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-07-01 19:45:47,346 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1467387563416_0004 has failed! Stop running all dependent jobs
2016-07-01 19:45:47,346 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-07-01 19:45:47,517 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: &lt;A href="http://hwhdpm" target="_blank"&gt;http://hwhdpm&lt;/A&gt;
2016-07-01 19:45:47,518 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hwhdpmaster02.centralus.cloudapp.azure.com/10.0.1.5:8050
2016-07-01 19:45:47,528 [main] INFO org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2016-07-01 19:45:47,824 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be cast to java.lang.Integer
2016-07-01 19:45:47,824 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2016-07-01 19:45:47,831 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.7.1.2.3.4.0-3485 0.15.0.2.3.4.0-3485 2016-07-01 19:45:14 2016-07-01 19:45:47 HASH_JOIN,FILTER
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_1467387563416_0004 a,b,c,d,e,f HASH_JOIN,MULTI_QUERY Message: Job failed!
Input(s):
Failed to read data from "/pigsample/Employeeinfo.csv"
Failed to read data from "/pigsample/Salaryinfo.csv"
Output(s):
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG: job_1467387563416_0004 -&amp;gt; null, null
2016-07-01 19:45:47,831 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2016-07-01 19:45:47,833 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias g Details at logfile: /home//pig_1467399695184.log grunt&amp;gt;&lt;/P&gt;</description>
      <pubDate>Sat, 02 Jul 2016 03:02:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-ERROR-1066/m-p/109215#M33658</guid>
      <dc:creator>doug_mengistu</dc:creator>
      <dc:date>2016-07-02T03:02:38Z</dc:date>
    </item>
    <item>
      <title>Re: Pig  ERROR 1066</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-ERROR-1066/m-p/109216#M33659</link>
      <description>&lt;P&gt;Here is the solution to your problem &lt;A rel="user" href="https://community.cloudera.com/users/4551/dougmengistu.html" nodeid="4551"&gt;@Dagmawi  Mengistu&lt;/A&gt; &lt;/P&gt;&lt;P&gt;There are two issues over here,&lt;/P&gt;&lt;P&gt;ISSUE 1:&lt;/P&gt;&lt;P&gt;If you check your logs, then after relation "f", you get the "java.lang.ClassCastException".&lt;/P&gt;&lt;P&gt;Please find the updated steps below with explanation of how to resolve this error( Comments are marked with // prefix) - &lt;/P&gt;&lt;P&gt;a = load '/pigsample/Salaryinfo.csv' USING PigStorage(','); &lt;/P&gt;&lt;P&gt;b = load '/pigsample/Employeeinfo.csv' USING PigStorage(','); &lt;/P&gt;&lt;P&gt;c = filter b by $4 =='Male';&lt;/P&gt;&lt;P&gt;// In relation "d", carefully observer that I have type cast the field at index 0 to int, you need to explicitly do type casting like this in order to avoid the "java.lang.ClassCastException". &lt;/P&gt;&lt;P&gt;d = foreach c generate (int)$0 as id:int, $1 as firstname:chararray, $2 as lastname:chararray, $4 as gender:chararray, $6 as city:chararray , $7 as country:chararray, $8 as countrycode:chararray;&lt;/P&gt;&lt;P&gt;// Similarly in relation "e", we have to again explicitly type cast the field iD to int.&lt;/P&gt;&lt;P&gt;e = foreach a generate (int)$0 as iD:int, $1 as firstname:chararray, $2 as lastname:chararray, $3 as salary:double, ToDate($4, 'MM/dd/yyyy') as dateofhire, $5 as company:chararray;&lt;/P&gt;&lt;P&gt;// Relation "f" works perfectly now, doesn't throw any exceptions&lt;/P&gt;&lt;P&gt;f = join d by id, e by iD;&lt;/P&gt;&lt;P&gt;ISSUE 2 - &lt;/P&gt;&lt;P&gt;// In relation "g", you don't need to write f.d::firstname, this will throw org.apache.pig.backend.executionengine.ExecException".&lt;/P&gt;&lt;P&gt;You can directly reference the fields present in relation "f" of relation "d" like this -&lt;/P&gt;&lt;P&gt;g = foreach f generate d::firstname as firstname;&lt;/P&gt;&lt;P&gt;// Print 
output&lt;/P&gt;&lt;P&gt;DUMP g;&lt;/P&gt;&lt;P&gt;OUTPUT - &lt;/P&gt;&lt;P&gt;(Jonathan) &lt;/P&gt;&lt;P&gt;(Gary) &lt;/P&gt;&lt;P&gt;(Roger)&lt;/P&gt;&lt;P&gt;(Jeffrey) &lt;/P&gt;&lt;P&gt;(Steve)&lt;/P&gt;&lt;P&gt;(Lawrence)&lt;/P&gt;&lt;P&gt;(Billy)&lt;/P&gt;&lt;P&gt;(Joseph)&lt;/P&gt;&lt;P&gt;(Aaron)&lt;/P&gt;&lt;P&gt;(Steve)&lt;/P&gt;&lt;P&gt;(Brian)&lt;/P&gt;&lt;P&gt;(Robert)&lt;/P&gt;&lt;P&gt;Hope this helps &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 02 Jul 2016 21:25:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-ERROR-1066/m-p/109216#M33659</guid>
      <dc:creator>gmarya</dc:creator>
      <dc:date>2016-07-02T21:25:09Z</dc:date>
    </item>
  </channel>
</rss>

