Member since: 10-01-2015
Posts: 3933
Kudos Received: 1150
Solutions: 374
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3360 | 05-03-2017 05:13 PM
 | 2789 | 05-02-2017 08:38 AM
 | 3064 | 05-02-2017 08:13 AM
 | 3002 | 04-10-2017 10:51 PM
 | 1506 | 03-28-2017 02:27 AM
10-26-2015
06:22 PM
6 Kudos
DISCLAIMER: this was tested on HDP 2.3.2 only. There are two blocking JIRAs preventing usage of blob storage as the primary filesystem on HDP 2.3.0. For HBase, you need to use page blobs instead of block blobs.

First things first, install the Azure CLI for Mac or use the Azure portal. The steps below are for the CLI.

```
azure login
# enter username and password when prompted
azure storage account create storageaccountname --type LRS
azure storage account keys list storageaccountname
# note the account keys, you will need them in the next step
azure storage container create storagecontainername --account-name storageaccountname --account-key accountkeystring
# just to validate the container was created
azure storage blob list storagecontainername --account-name storageaccountname --account-key accountkeystring
```

Once the previous steps have been completed, go to the Ambari UI and edit core-site.xml. In addition to these properties, you need to replace the fs.defaultFS property with the wasb path. These properties and their descriptions are discussed in the hadoop-azure documentation. If you choose to install HBase, you also need to edit hbase-site.xml and modify the hbase.rootdir property. Now restart the cluster for the changes to take effect and start using it. For HBase, there are some open JIRAs and your mileage may vary. I encountered the following error when I tried to pre-split and drop/create the same table over and over; the fix is coming in Hadoop 2.8, so until then, beware of acquired-lease messages on HBase.
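For reference, a minimal sketch of the relevant core-site.xml entries, reusing the storageaccountname/storagecontainername placeholders from the CLI steps above (property names per the hadoop-azure documentation; the key value is a placeholder):

```
<property>
  <name>fs.defaultFS</name>
  <value>wasb://storagecontainername@storageaccountname.blob.core.windows.net</value>
</property>
<property>
  <!-- the key printed by `azure storage account keys list` -->
  <name>fs.azure.account.key.storageaccountname.blob.core.windows.net</name>
  <value>accountkeystring</value>
</property>
```

If you install HBase, hbase.rootdir then points to a path under the same wasb URI, e.g. wasb://storagecontainername@storageaccountname.blob.core.windows.net/hbase.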
10-27-2015
03:19 PM
http://community.hortonworks.com/articles/2143/use-wasb-as-hdp-232-filesystem.html I used page blobs, as per the hadoop-azure doc.
02-02-2016
04:35 PM
@Andrew Watson has this been resolved? Can you accept the best answer or provide your own solution?
08-27-2018
10:04 AM
Hi @Artem Ervits, can you reload your solution shell? Thanks.
10-25-2015
02:06 PM
@bsaini@hortonworks.com Node 1 has no storage, so I guess you won't be using it as a DataNode, but it can play the role of a worker node (NodeManager, if you like, and other components). You will be creating a config group for node1. Please see this screenshot: in my case, node4 is the DataNode and I want to customize its data directories. I created a new config group for node4, selected it, and changed the parameters I wanted to change. There is an example in the docs.
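As an aside, if you'd rather script this than click through the UI, config groups can also be created via Ambari's REST API. A rough sketch, with made-up cluster/host/group names and a payload shaped per the Ambari config-groups API:

```
# hypothetical example: create a config group for node4 that overrides hdfs-site
curl -u admin:admin -H "X-Requested-By: ambari" -X POST \
  -d '[{"ConfigGroup": {
        "cluster_name": "mycluster",
        "group_name": "datanode-node4",
        "tag": "HDFS",
        "description": "custom data dirs for node4",
        "hosts": [{"host_name": "node4"}],
        "desired_configs": [{
          "type": "hdfs-site",
          "tag": "node4-v1",
          "properties": {"dfs.datanode.data.dir": "/grid/0/hadoop/hdfs/data"}
        }]}}]' \
  http://ambari-host:8080/api/v1/clusters/mycluster/config_groups
```

The X-Requested-By header is required by Ambari's CSRF protection.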
02-02-2016
05:23 PM
@Benjamin Leonhardi 🙂
02-02-2016
03:50 PM
The answer is that the logic to group puts by regionserver is now built into the HBase client API as of 1.0; it is no longer necessary to leverage any other code to achieve it.
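For illustration, a minimal sketch of the 1.0-style client API (the table name t1 and column family cf are made up): BufferedMutator buffers mutations client-side and groups them by regionserver when it flushes, so the caller no longer has to.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedPuts {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             // BufferedMutator batches puts and routes them to the right
             // regionservers on flush
             BufferedMutator mutator = conn.getBufferedMutator(TableName.valueOf("t1"))) {
            for (int i = 0; i < 1000; i++) {
                Put put = new Put(Bytes.toBytes("row-" + i));
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v" + i));
                mutator.mutate(put);
            }
        } // close() flushes any remaining buffered mutations
    }
}
```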
10-23-2015
02:19 PM
1 Kudo
SOLUTION

Pig introduced an option to run HCatalog commands in grunt and in scripts. There's more info in the pig.properties file at /etc/pig/conf/pig.properties:

```
# In addition to the fs-style commands (rm, ls, etc) Pig can now execute
# SQL-style DDL commands, eg "sql create table pig_test(name string, age int)".
# The only implemented backend is hcat, and luckily that's also the default.
#
# pig.sql.type=hcat

# Path to the hcat executable, for use with pig.sql.type=hcat (default: null)
# hcat.bin=/usr/local/hcat/bin/hcat
```

This is the default on Sandbox 2.3, HDP 2.3, and HDP 2.3.2, so running any pig script with hcat commands without -useHCatalog will fail with the following (usually that will happen through Oozie):

```
Pig Stack Trace
---------------
ERROR 2997: Encountered IOException. /usr/local/hcat/bin/hcat does not exist. Please check your 'hcat.bin' setting in pig.properties.
java.io.IOException: /usr/local/hcat/bin/hcat does not exist. Please check your 'hcat.bin' setting in pig.properties.
at org.apache.pig.tools.grunt.GruntParser.processSQLCommand(GruntParser.java:1286)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:501)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
at org.apache.pig.Main.run(Main.java:631)
at org.apache.pig.Main.main(Main.java:177)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
```

To fix it, there are a few options:

- Change it globally by editing pig.properties in Ambari to point to hcat.bin=/usr/bin/hcat.
- Or copy pig.properties to your own location, override it with the right path (i.e. hcat.bin=/usr/bin/hcat), and execute the script like so: pig -P pig.properties test.pig
- Or override the property on the fly: pig -Dhcat.bin=/usr/bin/hcat test.pig
- Or, the least intrusive way: put this at the beginning of your pig script, as in the sketch below: set hcat.bin /usr/bin/hcat;
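As a sketch of that last option, a hypothetical test.pig (the table definition is borrowed from the pig.properties comment above):

```
-- override hcat.bin in-script, without touching pig.properties or the command line
set hcat.bin /usr/bin/hcat;

-- the sql command is routed to hcat, since pig.sql.type=hcat is the default
sql create table pig_test(name string, age int);
```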
08-17-2017
02:10 AM
It's really helpful, thanks.
10-26-2015
09:46 PM
Hi Artem, are you trying to implement OpenSOC? Ryan is correct. The default stream id is "default".
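For illustration, a minimal sketch with the Storm API of that era (the backtype.storm packages; the bolt and field names are made up) showing where the implicit "default" stream comes in:

```java
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

// hypothetical bolt: declaring and emitting without a stream id
// both use the stream named "default"
public class EchoBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        // no stream id given, so the tuple travels on the "default" stream
        collector.emit(new Values(input.getString(0)));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // equivalent to: declarer.declareStream(Utils.DEFAULT_STREAM_ID, new Fields("word"));
        declarer.declare(new Fields("word"));
    }
}
```

A downstream bolt subscribed with shuffleGrouping("echo") reads from that same "default" stream.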