Member since: 10-01-2015
Posts: 3933
Kudos Received: 1150
Solutions: 374

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3365 | 05-03-2017 05:13 PM |
| | 2797 | 05-02-2017 08:38 AM |
| | 3076 | 05-02-2017 08:13 AM |
| | 3006 | 04-10-2017 10:51 PM |
| | 1518 | 03-28-2017 02:27 AM |
01-04-2017
12:55 AM
1 Kudo
I am on mobile and can't comment on your workflows right now, but I have examples of Python 2 and Python 3 workflows in my repo: https://github.com/dbist/oozie. Browse to oozie/apps/ and you will see their respective directories. Use them as you wish.
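In the meantime, here is a minimal sketch of the general shape of such a workflow: an Oozie shell action that runs a Python script. All names and paths below are illustrative; see the repo above for the real python2/python3 examples.

```
<workflow-app name="python-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="python-node"/>
    <action name="python-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- script.py is a placeholder; ship it alongside the workflow -->
            <exec>script.py</exec>
            <file>script.py#script.py</file>
            <capture-output/>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```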
12-29-2016
09:00 PM
@Praveen PentaReddy To close the loop on this: it turns out append is the default behavior, and if you read the comments in https://issues.apache.org/jira/browse/HIVE-6897 you will see that it is not advisable to force an overwrite of a table via HCatalog. So, to turn the feature off completely and not promote "bad" behavior, I did the following:

```
grunt> sql alter table codeZ set TBLPROPERTIES ('immutable' = 'true');
2016-12-29 20:49:51,924 [main] INFO org.apache.pig.tools.grunt.GruntParser - Going to run hcat command: alter table codeZ set TBLPROPERTIES ('immutable' = 'true');
OK
Time taken: 2.041 seconds
grunt> a = load 'sample_07' using org.apache.hive.hcatalog.pig.HCatLoader();
2016-12-29 20:51:00,125 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox.hortonworks.com:9083
2016-12-29 20:51:00,163 [main] INFO hive.metastore - Connected to metastore.
grunt> b = load 'sample_08' using org.apache.hive.hcatalog.pig.HCatLoader();
grunt> c = join b by code, a by code;
grunt> d = foreach c generate $0 as code, $1 as description, $2 as total_emp, $3 as salary;
grunt> store d into 'codeZ' using org.apache.hive.hcatalog.pig.HCatStorer();
2016-12-29 20:52:26,894 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 6000:
<line 5, column 0> Output Location Validation Failed for: 'codeZ More info to follow:
org.apache.hive.hcatalog.common.HCatException : 2003 : Non-partitioned table already contains data : default.codez
Details at logfile: /home/guest/pig_1483044536026.log
grunt> quit
2016-12-29 20:56:25,336 [main] INFO org.apache.pig.Main - Pig script completed in 7 minutes, 29 seconds and 397 milliseconds (449397 ms)
[guest@sandbox ~]$ less /home/guest/pig_1483044536026.log
```

And a snippet from the log:

```
ERROR 6000:
<line 5, column 0> Output Location Validation Failed for: 'codeZ More info to follow:
org.apache.hive.hcatalog.common.HCatException : 2003 : Non-partitioned table already contains data : default.codez
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias d
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1778)
at org.apache.pig.PigServer.registerQuery(PigServer.java:707)
at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:1075)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:505)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:231)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:206)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:566)
at org.apache.pig.Main.main(Main.java:178)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: org.apache.pig.impl.plan.VisitorException: ERROR 6000:
<line 5, column 0> Output Location Validation Failed for: 'codeZ More info to follow:
org.apache.hive.hcatalog.common.HCatException : 2003 : Non-partitioned table already contains data : default.codez
at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:95)
at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.newplan.logical.relational.LogicalPlan.validate(LogicalPlan.java:212)
at org.apache.pig.PigServer$Graph.compile(PigServer.java:1851)
at org.apache.pig.PigServer$Graph.access$300(PigServer.java:1527)
at org.apache.pig.PigServer.execute(PigServer.java:1440)
at org.apache.pig.PigServer.access$500(PigServer.java:118)
at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1773)
... 14 more
Caused by: org.apache.hive.hcatalog.common.HCatException : 2003 : Non-partitioned table already contains data : default.codez
at org.apache.hive.hcatalog.mapreduce.FileOutputFormatContainer.handleDuplicatePublish(FileOutputFormatContainer.java:206)
at org.apache.hive.hcatalog.mapreduce.FileOutputFormatContainer.checkOutputSpecs(FileOutputFormatContainer.java:121)
at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.checkOutputSpecs(HCatBaseOutputFormat.java:65)
at org.apache.pig.newplan.logical.visitor.InputOutputFileValidatorVisitor.visit(InputOutputFileValidatorVisitor.java:69)
```

In other words, it is not advisable to overwrite via HCatStorer, since Hive handles the append/overwrite semantics. The only workaround here is to use a temporary table, as I suggested earlier.
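For reference, here is a minimal sketch of that temporary-table workaround. The staging table name codez_staging is an assumption; any table with the same schema as codez works:

```
-- store the join result into a staging table instead of the target
a = LOAD 'sample_07' USING org.apache.hive.hcatalog.pig.HCatLoader();
b = LOAD 'sample_08' USING org.apache.hive.hcatalog.pig.HCatLoader();
c = JOIN b BY code, a BY code;
d = FOREACH c GENERATE $0 AS code, $1 AS description, $2 AS total_emp, $3 AS salary;
STORE d INTO 'codez_staging' USING org.apache.hive.hcatalog.pig.HCatStorer();
```

Then let Hive perform the overwrite, since it owns that behavior:

```
-- run in Hive, not Pig
INSERT OVERWRITE TABLE codez SELECT * FROM codez_staging;
```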
12-29-2016
05:23 PM
@sudarshan kumar Did that answer your question, or do you need further clarification?
12-29-2016
02:42 PM
@rathna mohan I have not tested this, but it may be possible to use the HBase API to achieve what you're asking: https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.html. I am just guessing, and I don't think it will be a trivial effort, as you're going to have to access the Phoenix-managed table through the HBase API. Again, I hesitate to suggest this route because the effort to develop this functionality can be quite involved.
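For what it's worth, a minimal sketch of that filter through the plain HBase Java API. The table name and the limit/offset values are assumptions, and this does not account for how Phoenix encodes its columns and row keys:

```
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;

public class ColumnPageScan {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("MY_TABLE"))) { // hypothetical table
            Scan scan = new Scan();
            // return at most 10 columns per row, skipping the first 20 columns
            scan.setFilter(new ColumnPaginationFilter(10, 20));
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result result : scanner) {
                    System.out.println(result);
                }
            }
        }
    }
}
```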
12-28-2016
01:49 PM
OFFSET functionality is available in Phoenix as of 4.8; there are no plans to backport it to 4.4. What is preventing you from upgrading to the latest version?
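For context, once you are on 4.8+ the OFFSET clause works directly in a query. A minimal JDBC sketch follows; the connection URL and the table/column names are illustrative:

```
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class PhoenixOffsetDemo {
    public static void main(String[] args) throws Exception {
        // illustrative URL; point it at your ZooKeeper quorum
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
             Statement stmt = conn.createStatement();
             // LIMIT/OFFSET paging requires Phoenix 4.8 or later
             ResultSet rs = stmt.executeQuery(
                 "SELECT code, description FROM SAMPLE_07 ORDER BY code LIMIT 10 OFFSET 20")) {
            while (rs.next()) {
                System.out.println(rs.getString("code") + " " + rs.getString("description"));
            }
        }
    }
}
```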
12-26-2016
03:12 PM
Capacity issues indicate you may be sending too much data, or your destination cannot keep up with that much data. That's one of the bigger reasons Apache NiFi is a superior tool: it has backpressure built in, as well as visual cues that show when you have some kind of capacity issue. Try throttling how much data you're sending to the memory channel, or see if you can expire stale data. In NiFi, you can expire data too.
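If you stay on Flume, the memory channel's capacity and transaction size are the knobs to tune. A minimal sketch; the agent and channel names are placeholders:

```
# agent1/mem1 are illustrative names
agent1.channels = mem1
agent1.channels.mem1.type = memory
# maximum number of events held in the channel
agent1.channels.mem1.capacity = 10000
# maximum events per transaction between source/sink and the channel
agent1.channels.mem1.transactionCapacity = 1000
```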
12-26-2016
02:13 PM
2 Kudos
HDC:
1. HDC is backed by S3; that's the primary way it offers HA. There are no options to deploy an HA NameNode, RegionServers, etc., as clusters are meant to be terminated once used.
2. Plans for enterprise support will be announced in Q1.

Cloudbreak:
1. Yes, you can set up alerts to trigger upscaling or downscaling based on utilization.
2. Yes, you can deploy HA across all HA-compatible components.
3. I'd say Cloudbreak makes things easier overall, but one could argue you lose some control versus a manual install. There may also be a short learning curve, but everything is relative: once you learn Cloudbreak, you have the choice of deploying identical infrastructure on another cloud provider. It's all about choice.

I recommend you reach out to a local Hortonworks representative who can pull together the right team of experts to work through your pain points and advise on a solution.
12-25-2016
01:00 PM
3 Kudos
There are no options to install new services with HDC. It serves only prescriptive use cases and is not meant to be extended with additional services. For long-running clusters in AWS with configurable components and options, please consider Cloudbreak; HDC was built on Cloudbreak technology. Finally, you can also stand up HDP on AWS by launching an AMI and doing a manual or blueprint-based install; this option allows for all of the HDP components we offer.
12-23-2016
01:33 AM
Was I able to answer your question or do you need further clarification?
12-22-2016
04:20 PM
@Anbu Eswaran Please select the best answer, as otherwise we don't know which answer helped you most.