Created 09-27-2016 12:54 PM
Hello,
I am stepping through the risk factor part of the HDP 2.5 tutorial.
I have executed this statement in the Hive view in Ambari under maria_dev:
CREATE TABLE riskfactor (driverid string, events bigint, totmiles bigint, riskfactor float) STORED AS ORC;
I have checked that the table is present in the default db, and it is there.
After executing the following Pig script:
a = LOAD 'geolocation' using org.apache.hive.hcatalog.pig.HCatLoader(); -- load the Hive geolocation table via HCatalog
b = filter a by event != 'normal'; -- keep only the non-normal (risky) events
c = foreach b generate driverid, event, (int) '1' as occurance; -- emit a 1 per event so they can be summed
d = group c by driverid;
e = foreach d generate group as driverid, SUM(c.occurance) as t_occ; -- total risky events per driver
g = LOAD 'drivermileage' using org.apache.hive.hcatalog.pig.HCatLoader();
h = join e by driverid, g by driverid; -- join schema: $0..$3 = driverid, t_occ, driverid, totmiles
final_data = foreach h generate $0 as driverid, $1 as events, $3 as totmiles, (float) $3/$1 as riskfactor;
store final_data into 'riskfactor' using org.apache.hive.hcatalog.pig.HCatStorer(); -- write into the Hive riskfactor table
I get the following errors:
ls: cannot access /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/hive.tar.gz/hive/lib/slf4j-api-*.jar: No such file or directory
ls: cannot access /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/hive.tar.gz/hive/hcatalog/lib/*hbase-storage-handler-*.jar: No such file or directory
WARNING: Use "yarn jar" to launch YARN applications.
16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : TEZ_LOCAL
16/09/27 11:51:21 INFO pig.ExecTypeProvider: Trying ExecType : TEZ
16/09/27 11:51:21 INFO pig.ExecTypeProvider: Picked TEZ as the ExecType
2016-09-27 11:51:21,605 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0.2.5.0.0-1245 (rexported) compiled Aug 26 2016, 02:07:35
2016-09-27 11:51:21,605 [main] INFO org.apache.pig.Main - Logging error messages to: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/pig_1474977081603.log
2016-09-27 11:51:23,260 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/yarn/.pigbootup not found
2016-09-27 11:51:23,453 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sandbox.hortonworks.com:8020
2016-09-27 11:51:24,818 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-script.pig-8ca435c7-920a-4f44-953e-454a42973ab8
2016-09-27 11:51:25,478 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2016-09-27 11:51:25,671 [main] INFO org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook
2016-09-27 11:51:27,037 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-09-27 11:51:27,107 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-09-27 11:51:27,170 [main] INFO hive.metastore - Connected to metastore.
2016-09-27 11:51:27,904 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-09-27 11:51:27,906 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-09-27 11:51:27,909 [main] INFO hive.metastore - Connected to metastore.
2016-09-27 11:51:28,140 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_FLOAT 1 time(s).
2016-09-27 11:51:28,237 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-09-27 11:51:28,317 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-09-27 11:51:28,325 [main] INFO hive.metastore - Connected to metastore.
2016-09-27 11:51:28,723 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0: <file script.pig, line 9, column 0> Output Location Validation Failed for: 'riskfactor More info to follow:
Pig 'double' type in column 2(0-based) cannot map to HCat 'BIGINT'type. Target filed must be of HCat type {DOUBLE}
Details at logfile: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/pig_1474977081603.log
2016-09-27 11:51:28,746 [main] INFO org.apache.pig.Main - Pig script completed in 7 seconds and 330 milliseconds (7330 ms)
When I executed the script for the very first time, I did not see any errors, but the riskfactor table, which should have been populated, was still empty.
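For what it is worth, this is how I checked that the table stayed empty (a simple count in the Hive view; any equivalent query would do):
SELECT COUNT(*) FROM riskfactor; -- returns 0, so nothing was stored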
Is there somebody who can help?
Created 09-27-2016 02:34 PM
It looks like there is a data type mismatch according to the error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0: Output Location Validation Failed for: 'riskfactor More info to follow:
Pig 'double' type in column 2(0-based) cannot map to HCat 'BIGINT'type. Target filed must be of HCat type {DOUBLE}
Details at logfile: /hadoop/yarn/local/usercache/maria_dev/appcache/application_1474973150203_0003/container_1474973150203_0003_01_000002/pig_1474977081603.log
2016-09-27 11:51:28,746 [main] INFO org.apache.pig.Main - Pig script completed in 7 seconds and 330 milliseconds (7330 ms)
The log indicates that Pig is trying to store a DOUBLE into a target column defined as BIGINT. It says "in column 2(0-based)", so the problem is with totmiles.
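If you want to verify the mismatch yourself, you can compare the schema of the target table with the schema Pig is about to store (the second command works if you run the script interactively in the Grunt shell; final_data is the alias from your script):
DESCRIBE riskfactor; -- in the Hive view: totmiles shows up as bigint
describe final_data; -- in Grunt: Pig reports totmiles as double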
Created 09-27-2016 03:52 PM
Hi @Robbert Naastepad, as spotted by @Michael Young, you can try changing the data type of the totmiles column to double. Drop the riskfactor table from Hive and create it again with:
drop table riskfactor;
CREATE TABLE riskfactor (driverid string, events bigint, totmiles double, riskfactor float) STORED AS ORC;
Then re-run the same Pig script unchanged, and let us know if this works.
Created 09-28-2016 04:53 AM
This worked for me in resolving the same error. Thank you @mrizvi and @Michael Young
Created 09-28-2016 01:41 PM
No worries!
Created 10-25-2016 06:06 AM
I was getting the error below:
2016-10-25 05:19:47,348 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 0: <file script.pig, line 9, column 0> Output Location Validation Failed for: 'riskfactor More info to follow: Pig 'long' type in column 2(0-based) cannot map to HCat 'DOUBLE'type. Target filed must be of HCat type {BIGINT} Details at logfile:
So I changed line 8 of the script by casting $3 as double. This worked just fine for me:
final_data = foreach h generate $0 as driverid, $1 as events, (double) $3 as totmiles, (float) $3/$1 as riskfactor;
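One thing to note (my understanding of Pig's operator precedence, worth double-checking): the cast binds tighter than the division, so (float) $3/$1 is evaluated as ((float) $3)/$1, which is what we want here. Writing the parentheses out makes that explicit:
final_data = foreach h generate $0 as driverid, $1 as events, (double) $3 as totmiles, ((float) $3)/$1 as riskfactor;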
Created 11-29-2016 03:30 PM
Hi,
It does not work for me. I dropped the RISKFACTOR table and created a new one with TOTMILES as double...
Still the same issue:
ls: cannot access /hadoop/yarn/local/usercache/admin/appcache/application_1480244541051_0024/container_1480244541051_0024_01_000002/hive.tar.gz/hive/lib/slf4j-api-*.jar: No such file or directory
ls: cannot access /hadoop/yarn/local/usercache/admin/appcache/application_1480244541051_0024/container_1480244541051_0024_01_000002/hive.tar.gz/hive/hcatalog/lib/*hbase-storage-handler-*.jar: No such file or directory
WARNING: Use "yarn jar" to launch YARN applications.
16/11/29 15:48:25 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
16/11/29 15:48:25 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
16/11/29 15:48:25 INFO pig.ExecTypeProvider: Trying ExecType : TEZ_LOCAL
16/11/29 15:48:25 INFO pig.ExecTypeProvider: Trying ExecType : TEZ
16/11/29 15:48:25 INFO pig.ExecTypeProvider: Picked TEZ as the ExecType
2016-11-29 15:48:25,281 [main] INFO org.apache.pig.Main - Apache Pig version 0.16.0.2.5.0.0-1245 (rexported) compiled Aug 26 2016, 02:07:35
2016-11-29 15:48:25,281 [main] INFO org.apache.pig.Main - Logging error messages to: /hadoop/yarn/local/usercache/admin/appcache/application_1480244541051_0024/container_1480244541051_0024_01_000002/pig_1480434505279.log
2016-11-29 15:48:26,196 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /home/yarn/.pigbootup not found
2016-11-29 15:48:26,372 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sandbox.hortonworks.com:8020
2016-11-29 15:48:27,388 [main] INFO org.apache.pig.PigServer - Pig Script ID for the session: PIG-script.pig-e1155078-b7bf-4f84-b9e7-b3f427858f9b
2016-11-29 15:48:27,787 [main] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2016-11-29 15:48:27,910 [main] INFO org.apache.pig.backend.hadoop.PigATSClient - Created ATS Hook
2016-11-29 15:48:28,659 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-11-29 15:48:28,709 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-11-29 15:48:28,771 [main] INFO hive.metastore - Connected to metastore.
2016-11-29 15:48:29,451 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-11-29 15:48:29,453 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-11-29 15:48:29,455 [main] INFO hive.metastore - Connected to metastore.
2016-11-29 15:48:29,586 [main] WARN org.apache.pig.newplan.BaseOperatorPlan - Encountered Warning IMPLICIT_CAST_TO_FLOAT 1 time(s).
2016-11-29 15:48:29,664 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-11-29 15:48:29,700 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-11-29 15:48:29,705 [main] INFO hive.metastore - Connected to metastore.
2016-11-29 15:48:30,169 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-11-29 15:48:30,189 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-11-29 15:48:30,337 [main] INFO hive.metastore - Connected to metastore.
2016-11-29 15:48:30,496 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,GROUP_BY,FILTER
2016-11-29 15:48:30,555 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2016-11-29 15:48:30,601 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2016-11-29 15:48:30,675 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 174587904 to monitor. collectionUsageThreshold = 122211528, usageThreshold = 122211528
2016-11-29 15:48:30,746 [main] INFO org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned for a: $0, $3, $4, $5, $6, $7, $8, $9
2016-11-29 15:48:30,889 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher - Tez staging directory is /tmp/yarn/staging and resources directory is /tmp/temp-293241078
2016-11-29 15:48:30,937 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.plan.TezCompiler - File concatenation threshold: 100 optimistic? false
2016-11-29 15:48:30,985 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.CombinerOptimizerUtil - Choosing to move algebraic foreach to combiner
2016-11-29 15:48:31,099 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-11-29 15:48:31,112 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-11-29 15:48:31,114 [main] INFO hive.metastore - Connected to metastore.
2016-11-29 15:48:31,280 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=OrcGetSplits from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
2016-11-29 15:48:31,293 [main] INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat - ORC pushdown predicate: null
2016-11-29 15:48:31,313 [main] INFO org.apache.hadoop.hive.ql.io.orc.OrcUtils - Using schema evolution configuration variables schema.evolution.columns [truckid, driverid, event, latitude, longitude, city, state, velocity, event_ind, idling_ind] / schema.evolution.columns.types [string, string, string, double, double, string, string, int, int, int] (isAcid false)
2016-11-29 15:48:31,775 [main] INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat - FooterCacheHitRatio: 0/2
2016-11-29 15:48:31,775 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - </PERFLOG method=OrcGetSplits start=1480434511280 end=1480434511775 duration=495 from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
2016-11-29 15:48:31,779 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-11-29 15:48:31,894 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-11-29 15:48:31,898 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-11-29 15:48:31,903 [main] INFO hive.metastore - Connected to metastore.
2016-11-29 15:48:31,989 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - <PERFLOG method=OrcGetSplits from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
2016-11-29 15:48:31,989 [main] INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat - ORC pushdown predicate: null
2016-11-29 15:48:31,989 [main] INFO org.apache.hadoop.hive.ql.io.orc.OrcUtils - Using schema evolution configuration variables schema.evolution.columns [driverid, totmiles] / schema.evolution.columns.types [string, double] (isAcid false)
2016-11-29 15:48:32,017 [main] INFO org.apache.hadoop.hive.ql.io.orc.OrcInputFormat - FooterCacheHitRatio: 0/2
2016-11-29 15:48:32,018 [main] INFO org.apache.hadoop.hive.ql.log.PerfLogger - </PERFLOG method=OrcGetSplits start=1480434511989 end=1480434512017 duration=28 from=org.apache.hadoop.hive.ql.io.orc.ReaderImpl>
2016-11-29 15:48:32,018 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-11-29 15:48:33,077 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: hive-metastore-1.2.1000.2.5.0.0-1245.jar
2016-11-29 15:48:33,077 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: jdo-api-3.0.1.jar
2016-11-29 15:48:33,077 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: hive-hcatalog-core-1.2.1000.2.5.0.0-1245.jar
2016-11-29 15:48:33,077 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: hive-hcatalog-pig-adapter-1.2.1000.2.5.0.0-1245.jar
2016-11-29 15:48:33,077 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: pig-0.16.0.2.5.0.0-1245-core-h2.jar
2016-11-29 15:48:33,077 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: libfb303-0.9.3.jar
2016-11-29 15:48:33,078 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: antlr-runtime-3.4.jar
2016-11-29 15:48:33,078 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: hive-exec-1.2.1000.2.5.0.0-1245.jar
2016-11-29 15:48:33,078 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: hive-hbase-handler-1.2.1000.2.5.0.0-1245.jar
2016-11-29 15:48:33,078 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: libthrift-0.9.3.jar
2016-11-29 15:48:33,078 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Local resource: automaton-1.11-8.jar
2016-11-29 15:48:33,520 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - For vertex - scope-52: parallelism=1, memory=256, java opts=-Xmx256m
2016-11-29 15:48:33,521 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Processing aliases: a,b,c,d,e
2016-11-29 15:48:33,521 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Detailed locations: a[1,4],b[2,4],c[3,4],e[5,4],d[4,4]
2016-11-29 15:48:33,521 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Pig features in the vertex:
2016-11-29 15:48:33,628 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Set auto parallelism for vertex scope-53
2016-11-29 15:48:33,628 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - For vertex - scope-53: parallelism=1, memory=256, java opts=-Xmx256m
2016-11-29 15:48:33,628 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Processing aliases: e,h
2016-11-29 15:48:33,628 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Detailed locations: e[5,4],h[7,4]
2016-11-29 15:48:33,628 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Pig features in the vertex: GROUP_BY
2016-11-29 15:48:33,791 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - For vertex - scope-54: parallelism=1, memory=256, java opts=-Xmx256m
2016-11-29 15:48:33,791 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Processing aliases: g,h
2016-11-29 15:48:33,791 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Detailed locations: g[6,4],h[7,4]
2016-11-29 15:48:33,791 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Pig features in the vertex:
2016-11-29 15:48:33,863 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Set auto parallelism for vertex scope-55
2016-11-29 15:48:33,863 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - For vertex - scope-55: parallelism=2, memory=256, java opts=-Xmx256m
2016-11-29 15:48:33,863 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Processing aliases: final_data,h
2016-11-29 15:48:33,863 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Detailed locations: h[7,4],final_data[8,13]
2016-11-29 15:48:33,863 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezDagBuilder - Pig features in the vertex: HASH_JOIN
2016-11-29 15:48:33,973 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.metastore.local does not exist
2016-11-29 15:48:33,987 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-11-29 15:48:33,989 [main] INFO hive.metastore - Connected to metastore.
2016-11-29 15:48:34,040 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJobCompiler - Total estimated parallelism is 5
2016-11-29 15:48:34,121 [PigTezLauncher-0] INFO org.apache.pig.tools.pigstats.tez.TezScriptState - Pig script settings are added to the job
2016-11-29 15:48:34,122 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager - Increasing tez.am.resource.memory.mb from 256 to 1024 as total estimated tasks = 5, total vertices = 4, max outputs = 1
2016-11-29 15:48:34,122 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager - Increasing Tez AM Heap Size from 0M to 512M as total estimated tasks = 5, total vertices = 4, max outputs = 1
2016-11-29 15:48:34,122 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager - Value of tez.am.launch.cmd-opts is now -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC -Xmx512M
2016-11-29 15:48:34,153 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Tez Client Version: [ component=tez-api, version=0.7.0.2.5.0.0-1245, revision=c98dc048175afd3f56a44f05a1c18c6813f0b9a4, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2016-08-26T01:23:50Z ]
2016-11-29 15:48:34,367 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2016-11-29 15:48:34,382 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at sandbox.hortonworks.com/172.17.0.2:8050
2016-11-29 15:48:34,531 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.AHSProxy - Connecting to Application History server at sandbox.hortonworks.com/172.17.0.2:10200
2016-11-29 15:48:34,541 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Using org.apache.tez.dag.history.ats.acls.ATSV15HistoryACLPolicyManager to manage Timeline ACLs
2016-11-29 15:48:34,673 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2016-11-29 15:48:34,679 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Session mode. Starting session.
2016-11-29 15:48:34,682 [PigTezLauncher-0] INFO org.apache.tez.common.security.TokenCache - Merging additional tokens from binary file, binaryFileName=/hadoop/yarn/local/usercache/admin/appcache/application_1480244541051_0024/container_1480244541051_0024_01_000002/container_tokens
2016-11-29 15:48:34,683 [PigTezLauncher-0] INFO org.apache.tez.client.TezClientUtils - Using tez.lib.uris value from configuration: /hdp/apps/2.5.0.0-1245/tez/tez.tar.gz
2016-11-29 15:48:34,754 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Stage directory /tmp/yarn/staging doesn't exist and is created
2016-11-29 15:48:34,780 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Tez system stage directory hdfs://sandbox.hortonworks.com:8020/tmp/yarn/staging/.tez/application_1480244541051_0025 doesn't exist and is created
2016-11-29 15:48:34,821 [PigTezLauncher-0] INFO org.apache.tez.dag.history.ats.acls.ATSV15HistoryACLPolicyManager - Created Timeline Domain for History ACLs, domainId=Tez_ATS_application_1480244541051_0025
2016-11-29 15:48:34,975 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1480244541051_0025
2016-11-29 15:48:34,981 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - The url to track the Tez Session: http://sandbox.hortonworks.com:8088/proxy/application_1480244541051_0025/
2016-11-29 15:48:41,690 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - Submitting DAG PigLatin:script.pig-0_scope-0
2016-11-29 15:48:41,690 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Submitting dag to TezSession, sessionName=PigLatin:script.pig, applicationId=application_1480244541051_0025, dagName=PigLatin:script.pig-0_scope-0, callerContext={ context=PIG, callerType=PIG_SCRIPT_ID, callerId=PIG-script.pig-e1155078-b7bf-4f84-b9e7-b3f427858f9b }
2016-11-29 15:48:42,244 [PigTezLauncher-0] INFO org.apache.tez.client.TezClient - Submitted dag to TezSession, sessionName=PigLatin:script.pig, applicationId=application_1480244541051_0025, dagName=PigLatin:script.pig-0_scope-0
2016-11-29 15:48:42,512 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl - Timeline service address: http://sandbox.hortonworks.com:8188/ws/v1/timeline/
2016-11-29 15:48:42,513 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at sandbox.hortonworks.com/172.17.0.2:8050
2016-11-29 15:48:42,514 [PigTezLauncher-0] INFO org.apache.hadoop.yarn.client.AHSProxy - Connecting to Application History server at sandbox.hortonworks.com/172.17.0.2:10200
2016-11-29 15:48:42,531 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - Submitted DAG PigLatin:script.pig-0_scope-0. Application id: application_1480244541051_0025
2016-11-29 15:48:43,083 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher - HadoopJobId: job_1480244541051_0025
2016-11-29 15:48:43,537 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 5 Succeeded: 0 Running: 0 Failed: 0 Killed: 0, diagnostics=, counters=null
2016-11-29 15:48:53,253 [PigTezLauncher-0] INFO org.apache.tez.common.counters.Limits - Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=3000, COUNTER_NAME_MAX=64, MAX_COUNTERS=10000
2016-11-29 15:48:53,262 [PigTezLauncher-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=SUCCEEDED, progress=TotalTasks: 4 Succeeded: 4 Running: 0 Failed: 0 Killed: 0, diagnostics=, counters=Counters: 174
org.apache.tez.common.counters.DAGCounter NUM_SUCCEEDED_TASKS=4 TOTAL_LAUNCHED_TASKS=4 DATA_LOCAL_TASKS=2 AM_CPU_MILLISECONDS=3370 AM_GC_TIME_MILLIS=31
File System Counters FILE_BYTES_READ=3482 FILE_BYTES_WRITTEN=2396 HDFS_BYTES_READ=26083 HDFS_BYTES_WRITTEN=1532 HDFS_READ_OPS=7 HDFS_WRITE_OPS=2 HDFS_OP_CREATE=1 HDFS_OP_GET_FILE_STATUS=3 HDFS_OP_OPEN=4 HDFS_OP_RENAME=1
org.apache.tez.common.counters.TaskCounter REDUCE_INPUT_GROUPS=298 REDUCE_INPUT_RECORDS=298 COMBINE_INPUT_RECORDS=0 SPILLED_RECORDS=596 NUM_SHUFFLED_INPUTS=5 NUM_SKIPPED_INPUTS=0 NUM_FAILED_SHUFFLE_INPUTS=0 MERGED_MAP_OUTPUTS=5 GC_TIME_MILLIS=285 CPU_MILLISECONDS=5000 PHYSICAL_MEMORY_BYTES=823132160 VIRTUAL_MEMORY_BYTES=3589931008 COMMITTED_HEAP_BYTES=823132160 INPUT_RECORDS_PROCESSED=8100 INPUT_SPLIT_LENGTH_BYTES=52879 OUTPUT_RECORDS=755 OUTPUT_BYTES=7858 OUTPUT_BYTES_WITH_OVERHEAD=4665 OUTPUT_BYTES_PHYSICAL=2252 ADDITIONAL_SPILLS_BYTES_WRITTEN=0 ADDITIONAL_SPILLS_BYTES_READ=2252 ADDITIONAL_SPILL_COUNT=0 SHUFFLE_CHUNK_COUNT=3 SHUFFLE_BYTES=2252 SHUFFLE_BYTES_DECOMPRESSED=4665 SHUFFLE_BYTES_TO_MEM=0 SHUFFLE_BYTES_TO_DISK=0 SHUFFLE_BYTES_DISK_DIRECT=2252 NUM_MEM_TO_DISK_MERGES=0 NUM_DISK_TO_DISK_MERGES=0 SHUFFLE_PHASE_TIME=71 MERGE_PHASE_TIME=104 FIRST_EVENT_RECEIVED=36 LAST_EVENT_RECEIVED=40
Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0
Shuffle Errors_scope_53_INPUT_scope_52 BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0
Shuffle Errors_scope_55_INPUT_scope_53 BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0
Shuffle Errors_scope_55_INPUT_scope_54 BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0
TaskCounter_scope_52_INPUT_scope_2 INPUT_RECORDS_PROCESSED=8000 INPUT_SPLIT_LENGTH_BYTES=52215
TaskCounter_scope_52_OUTPUT_scope_53 ADDITIONAL_SPILLS_BYTES_READ=0 ADDITIONAL_SPILLS_BYTES_WRITTEN=0 ADDITIONAL_SPILL_COUNT=0 OUTPUT_BYTES=4992 OUTPUT_BYTES_PHYSICAL=613 OUTPUT_BYTES_WITH_OVERHEAD=1377 OUTPUT_RECORDS=457 SHUFFLE_CHUNK_COUNT=1 SPILLED_RECORDS=99
TaskCounter_scope_53_INPUT_scope_52 ADDITIONAL_SPILLS_BYTES_READ=613 ADDITIONAL_SPILLS_BYTES_WRITTEN=0 COMBINE_INPUT_RECORDS=0 FIRST_EVENT_RECEIVED=21 LAST_EVENT_RECEIVED=21 MERGED_MAP_OUTPUTS=1 MERGE_PHASE_TIME=50 NUM_DISK_TO_DISK_MERGES=0 NUM_FAILED_SHUFFLE_INPUTS=0 NUM_MEM_TO_DISK_MERGES=0 NUM_SHUFFLED_INPUTS=1 NUM_SKIPPED_INPUTS=0 REDUCE_INPUT_GROUPS=99 REDUCE_INPUT_RECORDS=99 SHUFFLE_BYTES=613 SHUFFLE_BYTES_DECOMPRESSED=1377 SHUFFLE_BYTES_DISK_DIRECT=613 SHUFFLE_BYTES_TO_DISK=0 SHUFFLE_BYTES_TO_MEM=0 SHUFFLE_PHASE_TIME=31 SPILLED_RECORDS=99
TaskCounter_scope_53_OUTPUT_scope_55 ADDITIONAL_SPILLS_BYTES_READ=0 ADDITIONAL_SPILLS_BYTES_WRITTEN=0 ADDITIONAL_SPILL_COUNT=0 OUTPUT_BYTES=1074 OUTPUT_BYTES_PHYSICAL=683 OUTPUT_BYTES_WITH_OVERHEAD=1284 OUTPUT_RECORDS=99 SHUFFLE_CHUNK_COUNT=1 SPILLED_RECORDS=99
TaskCounter_scope_54_INPUT_scope_24 INPUT_RECORDS_PROCESSED=100 INPUT_SPLIT_LENGTH_BYTES=664
TaskCounter_scope_54_OUTPUT_scope_55 ADDITIONAL_SPILLS_BYTES_READ=0 ADDITIONAL_SPILLS_BYTES_WRITTEN=0 ADDITIONAL_SPILL_COUNT=0 OUTPUT_BYTES=1792 OUTPUT_BYTES_PHYSICAL=956 OUTPUT_BYTES_WITH_OVERHEAD=2004 OUTPUT_RECORDS=100 SHUFFLE_CHUNK_COUNT=1 SPILLED_RECORDS=100
TaskCounter_scope_55_INPUT_scope_53 ADDITIONAL_SPILLS_BYTES_READ=683 ADDITIONAL_SPILLS_BYTES_WRITTEN=0 COMBINE_INPUT_RECORDS=0 FIRST_EVENT_RECEIVED=7 LAST_EVENT_RECEIVED=10 MERGED_MAP_OUTPUTS=2 MERGE_PHASE_TIME=24 NUM_DISK_TO_DISK_MERGES=0 NUM_FAILED_SHUFFLE_INPUTS=0 NUM_MEM_TO_DISK_MERGES=0 NUM_SHUFFLED_INPUTS=2 NUM_SKIPPED_INPUTS=0 REDUCE_INPUT_GROUPS=99 REDUCE_INPUT_RECORDS=99 SHUFFLE_BYTES=683 SHUFFLE_BYTES_DECOMPRESSED=1284 SHUFFLE_BYTES_DISK_DIRECT=683 SHUFFLE_BYTES_TO_DISK=0 SHUFFLE_BYTES_TO_MEM=0 SHUFFLE_PHASE_TIME=19 SPILLED_RECORDS=99
TaskCounter_scope_55_INPUT_scope_54 ADDITIONAL_SPILLS_BYTES_READ=956 ADDITIONAL_SPILLS_BYTES_WRITTEN=0 COMBINE_INPUT_RECORDS=0 FIRST_EVENT_RECEIVED=8 LAST_EVENT_RECEIVED=9 MERGED_MAP_OUTPUTS=2 MERGE_PHASE_TIME=30 NUM_DISK_TO_DISK_MERGES=0 NUM_FAILED_SHUFFLE_INPUTS=0 NUM_MEM_TO_DISK_MERGES=0 NUM_SHUFFLED_INPUTS=2 NUM_SKIPPED_INPUTS=0 REDUCE_INPUT_GROUPS=100 REDUCE_INPUT_RECORDS=100 SHUFFLE_BYTES=956 SHUFFLE_BYTES_DECOMPRESSED=2004 SHUFFLE_BYTES_DISK_DIRECT=956 SHUFFLE_BYTES_TO_DISK=0 SHUFFLE_BYTES_TO_MEM=0 SHUFFLE_PHASE_TIME=21 SPILLED_RECORDS=100
TaskCounter_scope_55_OUTPUT_scope_51 OUTPUT_RECORDS=99
org.apache.hadoop.mapreduce.TaskCounter COMBINE_INPUT_RECORDS=99 COMBINE_OUTPUT_RECORDS=457
org.apache.hadoop.mapreduce.TaskCounter_scope_52_OUTPUT_scope_53 COMBINE_INPUT_RECORDS=99 COMBINE_OUTPUT_RECORDS=457
org.apache.hadoop.mapreduce.TaskCounter_scope_53_INPUT_scope_52 COMBINE_INPUT_RECORDS=0 COMBINE_OUTPUT_RECORDS=0
2016-11-29 15:48:54,128 [main] INFO org.apache.pig.tools.pigstats.tez.TezPigScriptStats - Script Statistics:
HadoopVersion: 2.7.3.2.5.0.0-1245
PigVersion: 0.16.0.2.5.0.0-1245
TezVersion: 0.7.0.2.5.0.0-1245
UserId: yarn
FileName: script.pig
StartedAt: 2016-11-29 15:48:31
FinishedAt: 2016-11-29 15:48:54
Features: HASH_JOIN,GROUP_BY,FILTER
Success!
DAG 0:
Name: PigLatin:script.pig-0_scope-0
ApplicationId: job_1480244541051_0025
TotalLaunchedTasks: 4
FileBytesRead: 3482
FileBytesWritten: 2396
HdfsBytesRead: 26083
HdfsBytesWritten: 1532
SpillableMemoryManager spill count: 0
Bags proactively spilled: 0
Records proactively spilled: 0
DAG Plan: Tez vertex scope-52->Tez vertex scope-53, Tez vertex scope-53->Tez vertex scope-55, Tez vertex scope-54->Tez vertex scope-55, Tez vertex scope-55
Vertex Stats:
VertexId Parallelism TotalTasks InputRecords ReduceInputRecords OutputRecords FileBytesRead FileBytesWritten HdfsBytesRead HdfsBytesWritten Alias Feature Outputs
scope-52 1 1 8000 0 457 32 645 24641 0 a,b,c,d,e
scope-53 1 1 0 99 99 701 739 0 0 e,h GROUP_BY
scope-54 1 1 100 0 100 56 1012 1442 0 g,h
scope-55 2 1 0 199 99 2693 0 0 1532 final_data,h HASH_JOIN riskfactor,
Input(s):
Successfully read 100 records (1442 bytes) from: "drivermileage"
Successfully read 8000 records (24641 bytes) from: "geolocation"
Output(s):
Successfully stored 99 records (1532 bytes) in: "riskfactor"
2016-11-29 15:48:54,163 [main] INFO org.apache.pig.Main - Pig script completed in 29 seconds and 51 milliseconds (29051 ms)
2016-11-29 15:48:54,163 [main] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezLauncher - Shutting down thread pool
2016-11-29 15:48:54,183 [pool-1-thread-1] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezSessionManager - Shutting down Tez session org.apache.tez.client.TezClient@df073ec
2016-11-29 15:48:54,206 [pool-1-thread-1] INFO org.apache.tez.client.TezClient - Shutting down Tez Session, sessionName=PigLatin:script.pig, applicationId=application_1480244541051_0025
Any idea? Thanks
Created 11-29-2016 05:04 PM
@Xavier VAN AUSLOOS, where is the issue? I cannot see any errors. Moreover, if you scroll down, you will see this:
Input(s):
Successfully read 100 records (1442 bytes) from: "drivermileage"
Successfully read 8000 records (24641 bytes) from: "geolocation"
Output(s):
Successfully stored 99 records (1532 bytes) in: "riskfactor"
2016-11-29 15:48:54,163 [main] INFO org.apache.pig.Main - Pig script completed in 29 seconds and 51 milliseconds (29051 ms)
Check your riskfactor table, you should see your data.
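For example, a quick sanity check from the Hive view:
SELECT COUNT(*) FROM riskfactor; -- should now return 99, matching "Successfully stored 99 records" in the log above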
Created 11-30-2016 03:15 PM
Thanks.. it finally worked. I do not know why or how... anyway 🙂