Member since: 01-07-2016
Posts: 89
Kudos Received: 20
Solutions: 6
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 9654 | 02-05-2016 02:17 PM |
| | 10917 | 02-05-2016 12:56 AM |
| | 4006 | 01-29-2016 03:24 AM |
| | 1894 | 01-20-2016 03:52 PM |
| | 1664 | 01-20-2016 08:48 AM |
01-29-2016 01:55 PM
One more log: log
01-29-2016 01:52 PM
It is failing to write the output Avro file, but the web log's Application Overview says:
User: hdfs
Name: PigLatin:pigMerger.pig
Application Type: MAPREDUCE
Application Tags:
YarnApplicationState: FINISHED
Queue: default
FinalStatus Reported by AM: SUCCEEDED
Started: Fri Jan 29 12:59:25 +0000 2016
Elapsed: 4mins, 29sec
Tracking URL: History
Log Aggregation Status: SUCCEEDED
Diagnostics:
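One thing worth ruling out (just a guess, the overview above does not show it): MapReduce refuses to write into an output directory that already exists, so a leftover destination from an earlier run would make the STORE fail while the LOADs still succeed. A minimal check from the grunt shell (path taken from the later posts):

-- does a previous run's output still exist?
fs -ls /avro-dest
-- if so, clear it before re-running the script
fs -rm -r /avro-dest/CustomerData-20160128-1501807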
01-29-2016 12:45 PM
@Artem Ervits, here you can find one more output: the sources were read successfully but the output failed. http://paste.debian.net/377433/
01-29-2016 12:42 PM
OK, waiting on your results! Thank you.
01-29-2016 11:04 AM
Here is the full log: log
01-29-2016 10:54 AM
Still failing ;-(

Failed Jobs:
JobId Alias Feature Message Outputs
job_1454023575813_0027 outputSet DISTINCT Message: Job failed! /CustomerData-20160128-1501807,

Input(s):
Successfully read 100 records from: "/CustomerData-20160128-1501807-l.avro"
Successfully read 100 records from: "/CustomerData-20160128-1501807-t.avro"

Output(s):
Failed to produce result in "/avro-dest/CustomerData-20160128-1501807"
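Since the log aggregation succeeded, the actual exception should be in the aggregated task logs rather than in this console summary. A sketch for pulling them, run from the grunt shell with the standard yarn CLI (the application id mirrors the JobId above):

-- fetch the aggregated container logs for the failed job
sh yarn logs -applicationId application_1454023575813_0027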
01-29-2016 10:30 AM
And now it says that it can't read the data, but both files are there, and even the previous run read the source data successfully. I'm getting desperate; this feels like working with a random Turing machine. ;-( How can it fail to read data when I can easily DUMP both relations that load from those input files?
Input(s):
Failed to read data from "hdfs:///CustomerData-20160128-1501807.avro"
Failed to read data from "hdfs:///CustomerData-20160128-1501807.avro"

Output(s):
Failed to produce result in "hdfs:///CustomerData-20160128-1501807"
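For the record, a minimal sketch of how each input can be checked on its own (alias names here are made up; the loader is the piggybank one that worked for reading):

-- load one source in isolation and peek at a few records
leftSide = LOAD '/CustomerData-20160128-1501807-l.avro'
    USING org.apache.pig.piggybank.storage.avro.AvroStorage();
firstFew = LIMIT leftSide 5;
DUMP firstFew;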
01-29-2016 10:16 AM
OK, so STORE works only with org.apache.pig.piggybank.storage.avro.AvroStorage(....). But there are still issues while trying to write the output file:

2016-01-29 10:09:28,406 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1454023575813_0018
2016-01-29 10:09:28,406 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases outputSet
2016-01-29 10:09:28,406 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: C: R: outputSet[19,12]
2016-01-29 10:10:03,931 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-01-29 10:10:03,931 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1454023575813_0018 has failed! Stop running all dependent jobs
2016-01-29 10:10:03,931 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-01-29 10:10:06,256 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2016-01-29 10:10:06,257 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at sandbox.hortonworks.com/10.0.1.47:8050
2016-01-29 10:10:07,417 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2016-01-29 10:10:07,417 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at sandbox.hortonworks.com/10.0.1.47:8050
2016-01-29 10:10:07,577 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2016-01-29 10:10:07,585 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
Failed Jobs:
JobId Alias Feature Message Outputs
job_1454023575813_0018 outputSet DISTINCT Message: Job failed! hdfs:///avro-dest/CustomerData-20160128-1501807,
Output(s):
Failed to produce result in "hdfs:///avro-dest/CustomerData-20160128-1501807"
Well, I really don't understand what's going on here. There's no proper documentation, and with this seemingly random behavior it's really hard to use a tool like this.
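A way to narrow this down (a debugging sketch, not from the original script; the destination path is made up): store the same relation as plain text first. If that succeeds, the DISTINCT is fine and the failure is in the Avro write itself.

-- bisect: write outputSet with PigStorage instead of AvroStorage
STORE outputSet INTO '/tmp/CustomerData-debug-plain' USING PigStorage(',');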
01-29-2016 09:52 AM
I have another issue with STORE now:

STORE outputSet INTO 'hdfs:///avro-dest/-CustomerData-20160128-1501807' USING AvroStorage('no_schema_check', 'schema', '{"type":"record","name":"xxx","fields":[{"name":"name","type":"string","title":"Customer name","description":"non Surrogate Key for joining files on the BDP"}, ....]}');

Error below:

2016-01-29 09:48:42,211 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Pig script failed to parse: <line 20, column 0> pig script failed to validate: java.lang.RuntimeException: could not instantiate 'AvroStorage' with arguments '[no_schema_check, schema, {"type":"record",
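If the bare AvroStorage name here resolves to a different class than the piggybank one, that alone would explain "could not instantiate ... with arguments", since the constructors differ (this is an assumption; the error does not say which class it picked). Qualifying the name keeps the same arguments, which the piggybank version does accept (schema truncated as above):

STORE outputSet INTO 'hdfs:///avro-dest/-CustomerData-20160128-1501807'
    USING org.apache.pig.piggybank.storage.avro.AvroStorage(
        'no_schema_check',
        'schema', '{"type":"record","name":"xxx","fields":[ .... ]}');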
01-29-2016 09:50 AM
Hi, sorry, the "set" was a typo on my part. Here: outSet = LOAD 'hdfs:///CustomerData-20160128-1501807.avro' USING AvroStorage(); This command works, which is odd, because what is the difference between calling it as AvroStorage() and using the full package path org.apache.pig.piggybank.storage.avro.AvroStorage()?
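One plausible explanation (an assumption, since the Pig version is not shown in the thread): recent Pig releases ship a built-in org.apache.pig.builtin.AvroStorage, so the unqualified AvroStorage() resolves to the built-in while the full package path picks the piggybank class, and the two take different constructor arguments. Pinning the class once with DEFINE makes the choice explicit either way:

-- hedged sketch: bind the piggybank implementation to an alias and reuse it
DEFINE PB_AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();
outSet = LOAD 'hdfs:///CustomerData-20160128-1501807.avro' USING PB_AvroStorage;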