About aervits

aervits · ‎03-17-2016

@mike pal Please view my example: https://github.com/dbist/oozie/tree/master/apps/hcatalog

aervits · ‎03-17-2016

My understanding is that you reference a particular version of the Oozie XML language for workflows. The number 0.5 signifies the latest version as of Oozie 4.2.0. With every new version, Oozie actions are added, deprecated, removed, and extended. Take the fs action, for example; here are the notes on it from the docs:

As of schema 0.4, if a name-node element is specified, then it is not necessary for any of the paths to start with the file system URI as it is taken from the name-node element. This is also true if the name-node is specified in the global section (see Global Configurations).

As of schema 0.4, zero or more job-xml elements can be specified; these must refer to Hadoop JobConf job.xml formatted files bundled in the workflow application. They can be used to set additional properties for the FileSystem instance.

As of schema 0.4, if a configuration element is specified, then it will also be used to set additional JobConf properties for the FileSystem instance. Properties specified in the configuration element override properties specified in the files specified by any job-xml elements.

I would not pay too much attention to the schema URI; just try using the latest.
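The two schema 0.4 rules quoted above (paths inheriting the file system URI from name-node, and configuration overriding job-xml) can be pictured with a small Python sketch. This only illustrates the behaviour described in the docs and is not Oozie's own code; the function names, host name, and property values are made up, and the order in which multiple job-xml files are merged is an assumption.

```python
# Hypothetical illustration only; Oozie implements this internally in Java.

def resolve_path(path, name_node=None):
    """Return a full URI, taking the file system URI from name_node if needed."""
    if "://" in path:
        return path  # the path already carries its own file system URI
    if name_node is None:
        raise ValueError("path has no file system URI and no name-node was given")
    return name_node.rstrip("/") + "/" + path.lstrip("/")


def effective_fs_properties(job_xml_files, configuration):
    """Merge properties so the inline configuration overrides job-xml values."""
    merged = {}
    for props in job_xml_files:   # assumption: files are applied in listed order
        merged.update(props)
    merged.update(configuration)  # the configuration element wins, per the docs
    return merged


# Made-up example values.
print(resolve_path("/user/me/out", "hdfs://nn.example.com:8020"))
print(effective_fs_properties([{"dfs.replication": "3"}], {"dfs.replication": "2"}))
```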

aervits · ‎03-17-2016

I just saw the message; it says it cannot decode an ASCII character. Can you make sure your FQDN is typed correctly and there are no extra characters present in the Ambari agent properties, the hosts file, and the web UI?
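One quick way to spot stray characters in a host name is to list every character outside the ASCII range. The sketch below is a generic check, not an Ambari tool; the function name and the sample host name (which hides a zero-width space) are made up for illustration.

```python
def non_ascii_positions(text):
    """Return (index, character) pairs for every non-ASCII character in text."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


# Made-up host name containing an invisible zero-width space (U+200B).
suspect = "node1.exa\u200bmple.com"
for index, ch in non_ascii_positions(suspect):
    print(f"position {index}: U+{ord(ch):04X}")
```

Pasting the FQDN exactly as it appears in each of the three places into such a check makes invisible characters easy to find.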

aervits · ‎03-17-2016

In 3 years of writing MapReduce code, I only needed to use generics once. That's not to say they're not useful; they're pretty powerful when you are dealing with objects that have various types. In my case it was a MapReduce program running against HBase: I would extract the payload and then had to figure out what the value type was.

aervits · ‎03-17-2016

V, can you confirm you set up passwordless SSH to each box? Alternatively, register agents manually: http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.1/bk_ambari_reference_guide/content/ch_amb_ref_installing_ambari_agents_manually.html Also review all pre-checks in our Ambari user guide.

aervits · ‎03-17-2016

Have you looked at this page? https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput There are a bunch of dependencies you need to include.

aervits · ‎03-17-2016

You are using RHEL 7; what is your Python version? We do not support 2.7.9.

aervits · ‎03-17-2016

1) Yes, 100%. 2) If you can get away with running MapReduce programs, Hive queries, or Pig scripts, then you should be fine. 3) Java is the primary language; there is no .NET support, at least not in HDP on Linux.

aervits · ‎03-16-2016

From the Apache Pig documentation: http://pig.apache.org/docs/r0.15.0/perf.html#replicated-joins

Replicated Joins

Fragment replicate join is a special type of join that works well if one or more relations are small enough to fit into main memory. In such cases, Pig can perform a very efficient join because all of the hadoop work is done on the map side. In this type of join the large relation is followed by one or more small relations. The small relations must be small enough to fit into main memory; if they don't, the process fails and an error is generated.

Perform a replicated join with the USING clause (see inner joins and outer joins). In this example, a large relation is joined with two smaller relations. Note that the large relation comes first followed by the smaller relations; and, all small relations together must fit into main memory, otherwise an error is generated.

big = LOAD 'big_data' AS (b1,b2,b3);
tiny = LOAD 'tiny_data' AS (t1,t2,t3);
mini = LOAD 'mini_data' AS (m1,m2,m3);
C = JOIN big BY b1, tiny BY t1, mini BY m1 USING 'replicated';
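To see why this is cheap, here is a rough Python sketch of the idea behind a map-side (replicated) inner join: the small relations are loaded into in-memory lookup tables, and the large relation is streamed past them once. This is a conceptual illustration only, not how Pig is implemented; the function names and sample rows are made up, and the join key is assumed to be the first field of every relation.

```python
from collections import defaultdict


def build_index(rows):
    """Load one small relation into memory, keyed on its first field."""
    index = defaultdict(list)
    for row in rows:
        index[row[0]].append(tuple(row))
    return index


def replicated_join(big_rows, *small_relations):
    """Stream the big relation once and probe the in-memory small relations."""
    indexes = [build_index(rel) for rel in small_relations]
    for big in big_rows:
        partial = [tuple(big)]
        for index in indexes:
            # Inner join: a big row with no match in any small relation is dropped.
            partial = [left + right
                       for left in partial
                       for right in index.get(big[0], [])]
        yield from partial


# Made-up sample data.
big = [(1, "a", "x"), (2, "b", "y"), (3, "c", "z")]
tiny = [(1, "t1"), (2, "t2")]
mini = [(1, "m1"), (1, "m1b")]
for row in replicated_join(big, tiny, mini):
    print(row)
```

Because each worker holds the small relations in memory, no shuffle or reduce phase is needed, which is also why the join fails when the small relations do not fit.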

aervits · ‎03-16-2016

You can use any of the tools in this thread to access Derby and look at the schema: https://community.hortonworks.com/questions/15108/how-to-export-data-from-oozie-derby-database.html Keep in mind Derby is only for test/dev use.

Last Visited	‎08-15-2019 06:35 AM

Member Since	‎10-01-2015 11:46 AM
Posts	3,933
Kudos received	1074

Cloudera Community

Re: Where can I get latest resource_management.c...

Re: How to Kerberize Flume?

Re: Load Hive Table form Pig Output File.

Re: HDP 2.6 Cluster Issues with Hive Metastore

Re: which HDP release will storm 1.1.0 be packaged...

Re: in Sqoop oozie Action hive-site.xml what locat...

Re: What is Schema URI in oozie?

Re: There is a problem when install HDP on the ste...

Re: Use of java generics in hadoop map reduce jobs...

Re: There is a problem when install HDP on the ste...

Re: Mapreduce and HCatalog Integration - HCatOutpu...

Re: There is a problem when install HDP on the ste...

Re: Can I use HortonWorks and Its Technologies to ...

Re: How the Replicated join gives better performan...

Re: oozie derby db