Member since 09-24-2015
      
527 Posts | 136 Kudos Received | 19 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2882 | 06-30-2017 03:15 PM |
| | 4357 | 10-14-2016 10:08 AM |
| | 9584 | 09-07-2016 06:04 AM |
| | 11622 | 08-26-2016 11:27 AM |
| | 1910 | 08-23-2016 02:09 PM |
			
    
	
		
		
02-16-2016 07:53 PM
1 Kudo
Hi,

Finally it worked like this:

e = load 'hdfs://localhost:8020/tmp/jofi_pig_temp' using PigStorage(',') AS (codtf:chararray, codnrbeenf:chararray, fechaoprcnf:chararray, codinternouof:chararray, year:chararray, month:chararray, frecuencia:int);

Many thanks.
						
					
02-16-2016 07:47 PM
1 Kudo
Hi,

I am running a job from RStudio and I get this error:

16/02/16 13:01:30 INFO mapreduce.Job:  map 100% reduce 24%
16/02/16 13:12:22 INFO mapreduce.Job:  map 100% reduce 100%
16/02/16 13:12:22 INFO mapreduce.Job: Task Id : attempt_1455198426748_0476_r_000000_0, Status : FAILED
Container [pid=18361,containerID=container_e24_1455198426748_0476_01_000499] is running beyond physical memory limits. Current usage: 7.1 GB of 7 GB physical memory used; 12.9 GB of 14.7 GB virtual memory used. Killing container.
Dump of the process-tree for container_e24_1455198426748_0476_01_000499 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 18377 18361 18361 18361 (java) 3320 733 8102256640 609337 /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_0 26388279067123 
	|- 19618 18377 18361 18361 (R) 96691 1583 5403787264 1249728 /usr/lib64/R/bin/exec/R --slave --no-restore --vanilla --file=./rmr-streaming-combinefd060b81bfd 
	|- 19629 19618 18361 18361 (cat) 0 0 103407616 166 cat 
	|- 18361 18359 18361 18361 (bash) 0 0 108617728 341 /bin/bash -c /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_0 26388279067123 1>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/stdout 2>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_000499/stderr  
	|- 19627 19618 18361 18361 (cat) 1 48 103407616 174 cat 
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
16/02/16 13:12:23 INFO mapreduce.Job:  map 100% reduce 0%
16/02/16 13:12:34 INFO mapreduce.Job:  map 100% reduce 15%
16/02/16 13:12:37 INFO mapreduce.Job:  map 100% reduce 21%
16/02/16 13:12:40 INFO mapreduce.Job:  map 100% reduce 24%
16/02/16 13:28:26 INFO mapreduce.Job: Task Id : attempt_1455198426748_0476_r_000000_1, Status : FAILED
Container [pid=21694,containerID=container_e24_1455198426748_0476_01_001310] is running beyond physical memory limits. Current usage: 7.1 GB of 7 GB physical memory used; 12.6 GB of 14.7 GB virtual memory used. Killing container.
Dump of the process-tree for container_e24_1455198426748_0476_01_001310 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 21694 21692 21694 21694 (bash) 0 0 108617728 341 /bin/bash -c /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_1 26388279067934 1>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/stdout 2>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/stderr  
	|- 21781 21704 21694 21694 (R) 93564 1394 5118803968 1185913 /usr/lib64/R/bin/exec/R --slave --no-restore --vanilla --file=./rmr-streaming-combinefd060b81bfd 
	|- 21807 21781 21694 21694 (cat) 0 43 103407616 173 cat 
	|- 21704 21694 21694 21694 (java) 2526 787 8089718784 664117 /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001310 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_1 26388279067934 
	|- 21810 21781 21694 21694 (cat) 0 0 103407616 166 cat 
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
16/02/16 13:28:27 INFO mapreduce.Job:  map 100% reduce 0%
16/02/16 13:28:38 INFO mapreduce.Job:  map 100% reduce 16%
16/02/16 13:28:41 INFO mapreduce.Job:  map 100% reduce 20%
16/02/16 13:28:44 INFO mapreduce.Job:  map 100% reduce 24%
16/02/16 13:46:02 INFO mapreduce.Job: Task Id : attempt_1455198426748_0476_r_000000_2, Status : FAILED
Container [pid=23643,containerID=container_e24_1455198426748_0476_01_001311] is running beyond physical memory limits. Current usage: 7.1 GB of 7 GB physical memory used; 12.8 GB of 14.7 GB virtual memory used. Killing container.
Dump of the process-tree for container_e24_1455198426748_0476_01_001311 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 23737 23729 23643 23643 (cat) 0 44 103407616 174 cat 
	|- 23738 23729 23643 23643 (cat) 0 0 103407616 166 cat 
	|- 23729 23653 23643 23643 (R) 101777 1652 5376724992 1248882 /usr/lib64/R/bin/exec/R --slave --no-restore --vanilla --file=./rmr-streaming-combinefd060b81bfd 
	|- 23653 23643 23643 23643 (java) 2328 784 8079331328 617129 /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_2 26388279067935 
	|- 23643 23641 23643 23643 (bash) 0 0 108617728 341 /bin/bash -c /usr/jdk64/jdk1.8.0_40/bin/java -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.2.0-2950 -Xmx5734m -Djava.io.tmpdir=/hadoop/yarn/local/usercache/dangulo/appcache/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Dhadoop.root.logfile=syslog -Dyarn.app.mapreduce.shuffle.logger=INFO,shuffleCLA -Dyarn.app.mapreduce.shuffle.logfile=syslog.shuffle -Dyarn.app.mapreduce.shuffle.log.filesize=0 -Dyarn.app.mapreduce.shuffle.log.backups=0 org.apache.hadoop.mapred.YarnChild 10.1.246.16 42940 attempt_1455198426748_0476_r_000000_2 26388279067935 1>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/stdout 2>/hadoop/yarn/log/application_1455198426748_0476/container_e24_1455198426748_0476_01_001311/stderr  
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
16/02/16 13:46:03 INFO mapreduce.Job:  map 100% reduce 0%
16/02/16 13:46:15 INFO mapreduce.Job:  map 100% reduce 17%
16/02/16 13:46:18 INFO mapreduce.Job:  map 100% reduce 22%
16/02/16 13:46:21 INFO mapreduce.Job:  map 100% reduce 24%
16/02/16 13:59:00 INFO mapreduce.Job:  map 100% reduce 100%
16/02/16 13:59:00 INFO mapreduce.Job: Job job_1455198426748_0476 failed with state FAILED due to: Task failed task_1455198426748_0476_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1
16/02/16 13:59:00 INFO mapreduce.Job: Counters: 39
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=2064381938
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=13416462815
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=321
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=0
	Job Counters 
		Failed reduce tasks=4
		Launched map tasks=107
		Launched reduce tasks=4
		Data-local map tasks=107
		Total time spent by all maps in occupied slots (ms)=37720330
		Total time spent by all reduces in occupied slots (ms)=7956034
		Total time spent by all map tasks (ms)=18860165
		Total time spent by all reduce tasks (ms)=3978017
		Total vcore-seconds taken by all map tasks=18860165
		Total vcore-seconds taken by all reduce tasks=3978017
		Total megabyte-seconds taken by all map tasks=77251235840
		Total megabyte-seconds taken by all reduce tasks=28514425856
	Map-Reduce Framework
		Map input records=99256589
		Map output records=321
		Map output bytes=2050220619
		Map output materialized bytes=2050222738
		Input split bytes=12519
		Combine input records=321
		Combine output records=321
		Spilled Records=321
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=151580
		CPU time spent (ms)=4098800
		Physical memory (bytes) snapshot=256365596672
		Virtual memory (bytes) snapshot=538256474112
		Total committed heap usage (bytes)=286838489088
	File Input Format Counters 
		Bytes Read=13416450296
	rmr
		reduce calls=107
16/02/16 13:59:00 ERROR streaming.StreamJob: Job not successful!
I think it is a memory problem, but caused by the R program, because the same job run from Pig worked well. Any suggestions?

Thanks.
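For what it's worth, the usual knobs for this symptom (an assumption on my part; the thread itself does not confirm the fix) are the reducer container size and heap, which would need to be raised above the 7 GB limit shown in the log. The values below are illustrative only:

```
mapreduce.reduce.memory.mb=10240      # YARN container limit for reducers (illustrative value)
mapreduce.reduce.java.opts=-Xmx8192m  # reducer JVM heap, kept below the container limit
```

With rmr2, such properties can reportedly be passed per job through the backend parameters of its mapreduce() call. Note that the R child process in the dump uses roughly 5 GB on top of the JVM, so it is the container limit, not just the Java heap, that has to grow.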
						
					
Labels: Apache YARN, Cloudera Manager

02-15-2016 06:02 PM
Hi,

STORE E INTO 'hdfs://lnxbig05.cajarural.gcr:8020/tmp/journey_pig_temp' using PigStorage(',');
F = load 'hdfs://lnxbig05.cajarural.gcr:8020/tmp/journey_pig_temp/pig_temp.out' using PigStorage(',');
G = foreach A generate $0, $1, $2, $3, $4, $5, $6, $7;
dump G;
STORE G INTO 'default.journey_pig' USING  org.apache.hive.hcatalog.pig.HCatStorer();
This is the hdfs://lnxbig05.cajarural.gcr:8020/tmp/journey_pig_temp file:

BDP00SMU,1491,2015-12-06 00,9901,2015,12,1
BDP00SMU,3113,2015-12-06 00,8004,2015,12,1
BDP00SMU,3187,2015-12-06 00,0913,2015,12,1
BDP00SMU,3190,2015-12-06 00,9992,2015,12,1
BDPPM1GP,3008,2015-12-06 00,9521,2015,12,17
BDPPM1HC,3128,2015-12-06 00,8110,2015,12,32
BDPPM1KK,0198,2015-12-06 00,8002,2015,12,1
BDPPM1KK,3008,2015-12-06 00,9521,2015,12,3
BDPPM1KK,3008,2015-12-06 00,9523,2015,12,6
The dump of G is:

([COD-NRBE-EN-F#9998,NOMBRE-REGLA-F#SAI_TIP_INC_TRN,FECHA-OPRCN-F#2015-12-06 00:00:01,COD-TX-DI-F#TUX,VALOR-IMP-F#0.00,ID-INTERNO-TERM-TN-F#A0299989,COD-NRBE-EN-FSC-F#9998,COD-CSB-OF-F#0001,COD-TX-F#SAI01COU,COD-INTERNO-UO-F#0001,CANAL#01,COD-INTERNO-UO-FSC-F#0001,COD-IDENTIFICACION-F#,IDENTIFICACION-F#,ID-INTERNO-EMPL-EP-F#99999989,FECHA-CTBLE-F#2015-12-07,NUM-SEC-F#764,ID-EMPL-AUT-F#U028765,COD-CENT-UO-F#,NUM-PARTICION-F#001],,,,,,,)
([COD-NRBE-EN-F#9998,NOMBRE-REGLA-F#TR_IMPUTAC_MPAGO_TRN,FECHA-OPRCN-F#2015-12-06 00:00:06,COD-TX-DI-F#TUX,VALOR-IMP-F#0.00,ID-INTERNO-TERM-TN-F#A0299997,COD-NRBE-EN-FSC-F#9998,COD-CSB-OF-F#0001,COD-TX-F#DVI82OOU,COD-INTERNO-UO-F#0001,CANAL#01,COD-INTERNO-UO-FSC-F#0001,COD-IDENTIFICACION-F#,IDENTIFICACION-F#,ID-INTERNO-EMPL-EP-F#99999998,FECHA-CTBLE-F#2015-12-07,NUM-SEC-F#0,ID-EMPL-AUT-F#,COD-CENT-UO-F#,NUM-PARTICION-F#001],,,,,,,)
The error comes when I am going to store G into the Hive table. With the file on HDFS and the G relation shown above, I don't know what the error means:

2016-02-15 18:59:29,529 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1115: Column name for a field is not specified. Please provide the full schema as an argument to HCatStorer.
2016-02-15 18:59:29,529 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.plan.VisitorException: ERROR 1115:
<file store_journey.pig, line 48, column 0> Output Location Validation Failed for: 'default.journey_pig More info to follow:
Column name for a field is not specified. Please provide the full schema as an argument to HCatStorer.
        at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:64)
        at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
        at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
        at org.apache.pig.newplan.logical.relational.LogicalPlan.validate(LogicalPlan.java:212)
        at org.apache.pig.PigServer$Graph.compile(PigServer.java:1767)
        at org.apache.pig.PigServer$Graph.access$300(PigServer.java:1443)
        at org.apache.pig.PigServer.execute(PigServer.java:1356)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
        at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:234)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
        at org.apache.pig.Main.run(Main.java:502)
        at org.apache.pig.Main.main(Main.java:177)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1115: Column name for a field is not specified. Please provide the full schema as an argument to HCatStorer.
        at org.apache.hive.hcatalog.pig.HCatBaseStorer.validateAlias(HCatBaseStorer.java:612)
        at org.apache.hive.hcatalog.pig.HCatBaseStorer.validateSchema(HCatBaseStorer.java:514)
        at org.apache.hive.hcatalog.pig.HCatBaseStorer.doSchemaValidations(HCatBaseStorer.java:495)
        at org.apache.hive.hcatalog.pig.HCatStorer.setStoreLocation(HCatStorer.java:201)
        at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:57)
        ... 24 more
 
						
					
02-15-2016 04:40 PM
Hi,

My problem is that I created E like this:

E = FOREACH D GENERATE
    FLATTEN(group),
    COUNT(C);

and I don't know where in the FLATTEN to put the column name. Many thanks again.
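A sketch of where the names could go (the aliases below are assumptions borrowed from the load schema elsewhere in this thread, not confirmed for this script): in Pig Latin, FLATTEN accepts an AS clause that names each field it produces, and the aggregate can be aliased too.

```pig
-- Hypothetical field names; they must match the target Hive table's columns
E = FOREACH D GENERATE
    FLATTEN(group) AS (codtf:chararray, codnrbeenf:chararray, fechaoprcnf:chararray,
                       codinternouof:chararray, year:chararray, month:chararray),
    COUNT(C) AS frecuencia;
```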
						
					
02-15-2016 04:27 PM
I am sorry, E contains this:

(STR03CON,3190,2015-12-06 00,9992,2015,12,1)
(STS01OON,3081,2015-12-06 00,9154,2015,12,1)
(VAO13MOU,3076,2015-12-06 00,9554,2015,12,1)
(VMP71MOU,9998,2015-12-06 00,0001,2015,12,11)
						
					
02-15-2016 04:25 PM
1 Kudo
Hi,

After getting the data from Hive with Pig, I am now inserting it with this command:

F = STORE E INTO 'journey_pig' USING org.apache.hive.hcatalog.pig.HCatStorer();

E has these records:

(STR03CON,3190,2015-12-06 00,9992,2015,12,1)
(STS01OON,3081,2015-12-06 00,9154,2015,12,1)
(VAO13MOU,3076,2015-12-06 00,9554,2015,12,1)
(VMP71MOU,9998,2015-12-06 00,0001,2015,12,11)
And the error is:

2016-02-15 17:22:42,483 [main] ERROR org.apache.pig.tools.grunt.Grunt - org.apache.pig.impl.plan.VisitorException: ERROR 1115:
<file store_journey.pig, line 36, column 4> Output Location Validation Failed for: 'journey_pig More info to follow:
Column name for a field is not specified. Please provide the full schema as an argument to HCatStorer.
        at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:64)
        at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
        at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
        at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
        at org.apache.pig.newplan.logical.relational.LogicalPlan.validate(LogicalPlan.java:212)
        at org.apache.pig.PigServer$Graph.compile(PigServer.java:1767)
        at org.apache.pig.PigServer$Graph.access$300(PigServer.java:1443)
        at org.apache.pig.PigServer.execute(PigServer.java:1356)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:415)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
        at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:749)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:376)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
        at org.apache.pig.Main.run(Main.java:502)
        at org.apache.pig.Main.main(Main.java:177)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1115: Column name for a field is not specified. Please provide the full schema as an argument to HCatStorer.
        at org.apache.hive.hcatalog.pig.HCatBaseStorer.validateAlias(HCatBaseStorer.java:612)
        at org.apache.hive.hcatalog.pig.HCatBaseStorer.validateSchema(HCatBaseStorer.java:514)
        at org.apache.hive.hcatalog.pig.HCatBaseStorer.doSchemaValidations(HCatBaseStorer.java:495)
        at org.apache.hive.hcatalog.pig.HCatStorer.setStoreLocation(HCatStorer.java:201)
        at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:57)
        ... 29 more
Where do I need to put the column name?

Thanks.
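Judging by the ERROR 1115 message, HCatStorer needs every field of the stored relation to carry a name and type matching a column of the Hive table. A minimal sketch of the usual fix (the aliases are assumed, not confirmed by this thread, and must match the journey_pig table's columns):

```pig
-- Give each positional field a hypothetical alias before storing
E2 = FOREACH E GENERATE
    $0 AS codtf:chararray, $1 AS codnrbeenf:chararray, $2 AS fechaoprcnf:chararray,
    $3 AS codinternouof:chararray, $4 AS year:chararray, $5 AS month:chararray,
    $6 AS frecuencia:int;
STORE E2 INTO 'journey_pig' USING org.apache.hive.hcatalog.pig.HCatStorer();
```

As an aside, STORE is a statement rather than an expression in Pig Latin, so the `F =` assignment in front of it is not needed.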
						
					
Labels: Apache Hive, Apache Pig

02-15-2016 03:20 PM
Hi,

It worked for me like this, many thanks 🙂

register /usr/lib/piggybank/hive-hcatalog-pig-adapter.jar;
register /usr/lib/piggybank/hive-common.jar;
register /usr/lib/piggybank/hive-metastore.jar;
register /usr/lib/piggybank/hive-exec.jar;
register /usr/lib/piggybank/hive-serde.jar;
register /usr/lib/piggybank/hive-shims.jar;
register /usr/lib/piggybank/libfb303.jar;
 
						
					
02-15-2016 12:13 PM
Many thanks. The Pig version is Pig 0.15.0.2.3 and the Hive version is Hive 1.2.1.2. Is this correct?

Thanks.
						
					
02-15-2016 12:04 PM
1 Kudo
Hi,

After doing this:

register /usr/hdp/current/hive-webhcat/share/hcatalog/hive-hcatalog-core-1.2.1.2.3.2.0-2950.jar;
register /usr/lib/piggybank/hcatalog-pig-adapter-0.11.0.jar;
register /usr/lib/piggybank/hcatalog-core-0.11.0.jar;
register /usr/lib/piggybank/hive-shims-0.11.0.jar;
.
.
.
 F = LOAD 'journey_pig' USING org.apache.hive.hcatalog.pig.HCatLoader();
The hive-shims-0.11.0.jar contains this class, with a method like this:

public abstract UserGroupInformation getUGIForConf(Configuration paramConfiguration)
    throws LoginException, IOException;
I receive this error:

2016-02-15 12:54:42,359 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.hadoop.hive.shims.HadoopShims.getUGIForConf(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/security/UserGroupInformation;
2016-02-15 12:54:42,359 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.NoSuchMethodError: org.apache.hadoop.hive.shims.HadoopShims.getUGIForConf(Lorg/apache/hadoop/conf/Configuration;)Lorg/apache/hadoop/security/UserGroupInformation;
        at org.apache.hcatalog.common.HiveClientCache$HiveClientCacheKey.<init>(HiveClientCache.java:201)
        at org.apache.hcatalog.common.HiveClientCache$HiveClientCacheKey.fromHiveConf(HiveClientCache.java:207)
        at org.apache.hcatalog.common.HiveClientCache.get(HiveClientCache.java:138)
        at org.apache.hcatalog.common.HCatUtil.getHiveClient(HCatUtil.java:544)
        at org.apache.hcatalog.pig.PigHCatUtil.getHiveMetaClient(PigHCatUtil.java:147)
        at org.apache.hcatalog.pig.PigHCatUtil.getTable(PigHCatUtil.java:183)
        at org.apache.hcatalog.pig.HCatLoader.getSchema(HCatLoader.java:193)
        at org.apache.pig.newplan.logical.relational.LOLoad.getSchemaFromMetaData(LOLoad.java:175)
        at org.apache.pig.newplan.logical.relational.LOLoad.<init>(LOLoad.java:89)
        at org.apache.pig.parser.LogicalPlanBuilder.buildLoadOp(LogicalPlanBuilder.java:901)
        at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3568)
        at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1625)
        at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102)
        at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560)
        at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421)
        at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1735)
        at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1443)
        at org.apache.pig.PigServer.parseAndBuild(PigServer.java:387)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:412)
        at org.apache.pig.PigServer.executeBatch(PigServer.java:398)
        at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:171)
        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:749)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:376)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
        at org.apache.pig.Main.run(Main.java:502)
        at org.apache.pig.Main.main(Main.java:177)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Any suggestions, please?

Many thanks.
						
					
Labels: Apache HCatalog, Apache Hive, Apache Pig

02-12-2016 12:43 PM
1 Kudo
Hi,

I don't understand why:

"If you have 256MB blocks you need 5 tasks. They will take 10+1 = 11 minutes and will be slower. So 128MB blocks are faster."
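One possible reading of that claim (my own model with assumed numbers, not something stated in the thread): map tasks run in waves over a fixed number of container slots, so wall-clock time is roughly ceil(tasks / slots) × per-task time. Doubling the block size halves the task count but doubles per-task time, and when the tasks no longer fill the last wave you pay for a whole extra wave with idle slots:

```python
import math

def wall_clock_minutes(n_tasks: int, n_slots: int, minutes_per_task: int) -> int:
    """Idealized wave model: tasks run in full waves over the available slots."""
    return math.ceil(n_tasks / n_slots) * minutes_per_task

# Assumed scenario: 1.25 GB of input, 4 map slots, 1 minute per 128 MB of data.
small = wall_clock_minutes(n_tasks=10, n_slots=4, minutes_per_task=1)  # 128 MB blocks
large = wall_clock_minutes(n_tasks=5, n_slots=4, minutes_per_task=2)   # 256 MB blocks

print(small, large)  # prints: 3 4 -- fewer, bigger tasks finish later here
```

Under these assumed numbers the 256 MB layout is slower because its last wave runs a single long task while three slots sit idle; with different slot counts the comparison can flip, so this only illustrates the shape of the argument.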
						
					