Member since: 03-16-2016
Posts: 707
Kudos Received: 1753
Solutions: 203
10-06-2017
07:20 PM
6 Kudos
Introduction

This is a continuation of an article I wrote about a year ago: https://community.hortonworks.com/articles/60580/jmeter-setup-for-hive-load-testing-draft.html. See also this guide on Windows authentication with Apache JMeter: https://www.blazemeter.com/blog/windows-authentication-apache-jmeter

Steps

1) Enable Kerberos on your cluster

Perform all steps specified here: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_security/content/configuring_amb_hdp_for_kerberos.html and connect successfully to the Hive service from the command line using your user keytab. That implies a valid ticket.

2) Install JMeter

See the previous article mentioned in the Introduction.

3) Set the Hive user keytab in JMETER_HOME/bin/jaas.conf

Your jaas.conf should look something like this:

JMeter {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=false
doNotPrompt=true
useKeyTab=true
keyTab="/etc/security/keytabs/hive.service.keytab"
principal="hive/server.example.com@EXAMPLE.COM"
debug=true;
};

4) JMeter Setup

There are two files under the /bin folder of the JMeter installation which are used for Kerberos configuration:

krb5.conf - a file in .ini format which contains the Kerberos configuration details
jaas.conf - a file which holds the configuration of the Java Authentication and Authorization Service

These files are not used by default, so you have to tell JMeter where they are via system properties such as:

-Djava.security.krb5.conf=krb5.conf
-Djava.security.auth.login.config=jaas.conf

Alternatively, you can add the next two lines to the system.properties file located in the same /bin folder:

java.security.krb5.conf=krb5.conf
java.security.auth.login.config=jaas.conf

I suggest using full paths to these files.
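On a cluster that is already Kerberized, you can usually point java.security.krb5.conf at the existing /etc/krb5.conf. If you need a standalone copy for the JMeter host, a minimal sketch looks like the following; the realm and KDC host are placeholders that should match the principal in your jaas.conf:

[libdefaults]
  default_realm = EXAMPLE.COM

[realms]
  EXAMPLE.COM = {
    kdc = kdc.example.com
    admin_server = kdc.example.com
  }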
5) Manage Issues

If you encounter any issues:

- enable debug by adding the following to your command:

-Dsun.security.krb5.debug=true
-Djava.security.debug=gssloginconfig,configfile,configparser,logincontext

- check jmeter.log to see whether all properties are set as expected and map to existing file paths.

6) Turn off Subject Credentials

-Djavax.security.auth.useSubjectCredsOnly=false

7) Example of JMeter Command

JVM_ARGS="-Xms1024m
-Xmx1024m" bin/jmeter -Dsun.security.krb5.debug=true
-Djavax.security.auth.useSubjectCredsOnly=false
-Djava.security.debug=gssloginconfig,configfile,configparser,logincontext
-Djava.security.krb5.conf=/path/to/krb5.conf
-Djava.security.auth.login.config=/path/to/jaas.conf -n -t t1.jmx -l results -e
-o output This could be simplified if you add those two lines mentioned earlier to be added to system.properties.
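For example, if the two properties from step 4 are set in bin/system.properties with full paths, the same run (a sketch reusing the test plan name above) reduces to:

JVM_ARGS="-Xms1024m -Xmx1024m" bin/jmeter \
  -Djavax.security.auth.useSubjectCredsOnly=false \
  -n -t t1.jmx -l results -e -o output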
10-06-2017
06:36 PM
5 Kudos
@KUMAR PEDDIBHOTLA

a) I assume that you use a Hive user keytab in jaas.conf and that you have tested Hive access from the command line successfully. That is the must-do first step. Make sure that you have a valid ticket (kinit, klist, etc.). See the documentation on docs.hortonworks.com for Hive and Kerberos.

b) There are two files under the /bin folder of the JMeter installation which are used for Kerberos configuration:

krb5.conf - a file in .ini format which contains the Kerberos configuration details
jaas.conf - a file which holds the configuration of the Java Authentication and Authorization Service

These files are not used by default, so you have to tell JMeter where they are via system properties such as:

-Djava.security.krb5.conf=krb5.conf
-Djava.security.auth.login.config=jaas.conf

Alternatively, you can add the next two lines to the system.properties file located in the same /bin folder:

java.security.krb5.conf=krb5.conf
java.security.auth.login.config=jaas.conf

I suggest using full paths to these files.

c) Enable debug by adding the following to your command:

-Djava.security.debug=gssloginconfig,configfile,configparser,logincontext

d) Check jmeter.log to see whether all properties are set as expected and map to existing file paths.

e) Turn off subject credentials:

-Djavax.security.auth.useSubjectCredsOnly=false
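For step (a), a quick sanity check from the shell might look like the sketch below; the keytab path, principal, and host are placeholders to adapt to your cluster:

kinit -kt /etc/security/keytabs/hive.service.keytab hive/server.example.com@EXAMPLE.COM
klist    # should show a valid ticket for the principal above
beeline -u "jdbc:hive2://server.example.com:10000/default;principal=hive/server.example.com@EXAMPLE.COM" -e "show databases;"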
10-06-2017
06:21 PM
@uday kv See this new article: https://community.hortonworks.com/articles/141035/jmeter-kerberos-setup-for-hive-load-testing.html
09-18-2017
01:51 AM
3 Kudos
https://community.hortonworks.com/content/supportkb/49683/comparing-space-usage-from-hdfs-dfsamin-report-and.html There are many others in our forum.
09-18-2017
01:49 AM
3 Kudos
@Punit kumar Let's assume your HDFS has a single 128 MB block available. If you write a 1 MB file into that block, the block is no longer available for another file's write. So while your used space is under 1%, your space available for new writes is zero blocks and zero bytes. I hope you see the difference. That is what happens when you deal with large blocks: files smaller than the block size can lead to wasted capacity. Instead of using df to show bytes available or used, you should look at blocks used and available, and multiply by the block size. I answered a similar question last year; let me find it.
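A rough way to see this on a live cluster (a sketch; the file name is a placeholder and the exact report layout varies by version):

hdfs dfs -put smallfile.dat /tmp/smallfile.dat
hdfs fsck /tmp/smallfile.dat -files -blocks
# reports something like: /tmp/smallfile.dat 1048576 bytes, 1 block(s)
# i.e. a 1 MB file still claims a whole block
hdfs dfsadmin -report
# compare "DFS Used" and "DFS Remaining" here with what plain df shows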
08-28-2017
02:23 PM
4 Kudos
@Bhuban Sahoo

PerformanceEvaluation

The PerformanceEvaluation utility allows you to run several preconfigured tests on your cluster and reports its performance. To run the PerformanceEvaluation tool, use the command:

bin/hbase pe

In older HDP versions, use the command:

bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation

For usage instructions, run the command with no arguments.

LoadTestTool

The LoadTestTool utility load-tests your cluster by performing writes, updates, or reads on it. To run the LoadTestTool, use the command:

bin/hbase ltt

In older HDP versions, use the command:

bin/hbase org.apache.hadoop.hbase.util.LoadTestTool

To print general usage information, use the -h option. An alternative to LTT is YCSB.
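For illustration, some example invocations; the row counts, value sizes, and thread counts are arbitrary, so run each tool with no arguments or -h for the authoritative usage:

bin/hbase pe --nomapred --rows=100000 sequentialWrite 10   # 10 clients, 100k rows each, no MapReduce
bin/hbase pe --nomapred randomRead 10                      # random-read test with 10 clients
bin/hbase ltt -write 3:1024:10 -num_keys 1000000           # write 1M keys: 3 cols/key, ~1 KB values, 10 threads
bin/hbase ltt -read 100:10 -num_keys 1000000               # read back and verify 100% of the keys, 10 threads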
08-01-2017
01:31 AM
2 Kudos
@mungeol heo Let's clarify a little bit. Do you run two HiveServer2 instances to manage different workloads? If so, it is no surprise that they have different log4j settings.
08-01-2017
01:27 AM
2 Kudos
@David Bon-Salomon Most questions go through immediately. Questions suspected of spam are sent to moderation automatically, questions that are possible duplicates are sent to moderation manually, and questions that do not respect the code of conduct also go to moderation. If none of the above applies and the question seems appropriate, it is possible the spam filter caught it by mistake; it is not perfect. It will probably take hours, or possibly days, for the question to be reviewed and returned to the forum.
07-26-2017
05:47 PM
2 Kudos
Thanks. Setting doAs=true should do the trick.
07-26-2017
05:34 PM
10 Kudos
Introduction

GeoMesa is an Apache-licensed open source suite of tools that enables large-scale geospatial analytics on cloud and distributed computing systems, letting you manage and analyze the huge spatio-temporal datasets that IoT, social media, tracking, and mobile phone applications seek to take advantage of today.

GeoMesa does this by providing spatio-temporal data persistence on top of the Accumulo, HBase, and Cassandra distributed column-oriented databases for massive storage of point, line, and polygon data. It allows rapid access to this data via queries that take full advantage of geographical properties to specify distance and area. GeoMesa also provides support for near real time stream processing of spatio-temporal data by layering spatial semantics on top of the Apache Kafka messaging system.

GeoMesa features include the ability to:

- Store gigabytes to petabytes of spatial data (tens of billions of points or more)
- Serve up tens of millions of points in seconds
- Ingest data faster than 10,000 records per second per node
- Scale horizontally easily (add more servers to add more capacity)
- Support Spark analytics
- Drive a map through GeoServer or other OGC clients

Installation

GeoMesa supports traditional HBase installations as well as HBase running on Amazon's EMR and Hortonworks' Data Platform (HDP).
For instructions on bootstrapping an EMR cluster, please read this tutorial: Bootstrapping GeoMesa HBase on AWS S3.

Tutorial Overview

The code in this tutorial only does a few things:

- Establishes a new (static) SimpleFeatureType
- Prepares the HBase table to store this type of data
- Creates 1000 example SimpleFeatures
- Writes these SimpleFeatures to the HBase table
- Queries for a given geographic rectangle, time range, and attribute filter, and writes out the entries in the result set

Prerequisites
- Java Development Kit 1.8
- Apache Maven
- a GitHub client
- HBase 1.2.x (optional)
- GeoServer 2.9.1 (optional)

An existing HBase 1.1.x installation is helpful but not necessary. The tutorial will work either with an existing HBase server or by downloading the HBase binary distribution and running it in "standalone" mode (described below). GeoServer is only required for visualizing the HBase data; setting up GeoServer is beyond the scope of this tutorial.

Download and Build the Tutorial

Clone the geomesa-tutorials distribution from GitHub:

$ git clone https://github.com/geomesa/geomesa-tutorials.git
$ cd geomesa-tutorials

The pom.xml file contains an explicit list of dependent libraries that will be bundled together into the final tutorial JAR. You should confirm that the versions of HBase and Hadoop match what you are running; if they do not, change the values of the hbase.version and hbase.hadoop.version properties. The version of GeoMesa that this tutorial targets matches the project version of the pom.xml. (Note that this tutorial has been tested with GeoMesa 1.2.2 or later.) The only reason these libraries are bundled into the final JAR is that this is easier for most people than setting the classpath when running the tutorial. If you would rather not bundle these dependencies, mark them as provided in the POM and update your classpath as appropriate.

GeoMesa's HBaseDataStore searches for a file called hbase-site.xml, which among other things configures the ZooKeeper host(s) and port. If this file is not present on the classpath, the hbase-default.xml provided by hbase-common sets the default ZooKeeper quorum to "localhost" and port to 2181, which is what is used by the standalone HBase described in "Setting up HBase in standalone mode" above. If you have an existing HBase installation, you should copy your hbase-site.xml file into geomesa-quickstart-hbase/src/main/resources (or otherwise add it to the classpath when you run the tutorial, as shown in the sketch after the run command below).

To build the tutorial code:

$ cd geomesa-quickstart-hbase
$ mvn clean install

When this is complete, it should have built a JAR file that contains all of the code you need to run the tutorial.

Run the Tutorial

First, make sure that the hbase.table.sanity.check property is set to false in hbase-site.xml. On the command line, run:

$ java -cp target/geomesa-quickstart-hbase-$VERSION.jar com.example.geomesa.hbase.HBaseQuickStart --bigtable_table_name geomesa
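If your hbase-site.xml is not bundled into the JAR, one option (a sketch; /etc/hbase/conf is a placeholder for wherever your configuration lives) is to prepend that directory to the classpath at run time:

$ java -cp /etc/hbase/conf:target/geomesa-quickstart-hbase-$VERSION.jar com.example.geomesa.hbase.HBaseQuickStart --bigtable_table_name geomesa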
The only argument passed is the name of the HBase table where GeoMesa will store the feature type information. It will also create a table called <tablename>_<featuretype>_z3, which will store the Z3-indexed features. In our specific case, that table name will be geomesa_QuickStart_z3.

You should see output similar to the following (not including some of Maven's output and log4j's warnings), which lists the features that match the query specified in the tutorial:

Creating feature-type (schema): QuickStart
Creating new features
Inserting new features
Submitting query
1. Bierce|676|Fri Jul 18 08:22:03 EDT 2014|POINT (-78.08495724535888 37.590866849120395)|null
2. Bierce|190|Sat Jul 26 19:06:19 EDT 2014|POINT (-78.1159944062711 37.64226959044015)|null
3. Bierce|550|Mon Aug 04 08:27:52 EDT 2014|POINT (-78.01884511971093 37.68814732634964)|null
4. Bierce|307|Tue Sep 09 11:23:22 EDT 2014|POINT (-78.18782181976381 37.6444865782879)|null
5. Bierce|781|Wed Sep 10 01:14:16 EDT 2014|POINT (-78.0250604717695 37.58285696304815)|null
To see how the data is stored in HBase, use the HBase shell:

$ /path/to/hbase-1.2.3/bin/hbase shell

The type information is in the geomesa table (or whatever name you specified on the command line):

hbase> scan 'geomesa'
ROW         COLUMN+CELL
 QuickStart column=M:schema, timestamp=1463593804724, value=Who:String,What:Long,When:Date,*Where:Point:srid=4326,Why:String
The features are stored in <tablename>_<featuretype>_z3 (geomesa_QuickStart_z3 in this example):

hbase> scan 'geomesa_QuickStart_z3', { LIMIT => 3 }
ROW                                  COLUMN+CELL
 \x08\xF7\x0F#\x83\x91\xAE\xA2\xA8P  column=D:\x0F#\x83\x91\xAE\xA2\xA8PObservation.452, timestamp=1463593805801, value=\x02\x00\x00\x00@Observation.45\xB2Clemen\xF3...
 \x08\xF8\x06\x03\x19\xDFf\xA3p\x0C  column=D:\x06\x03\x19\xDFf\xA3p\x0CObservation.362, timestamp=1463593805680, value=\x02\x00\x00\x00@Observation.36\xB2Clemen\xF3...
 \x08\xF8\x06\x07\x19S\xD0\xA21>     column=D:\x06\x07\x19S\xD0\xA21>Observation.35, timestamp=1463593805664, value=\x02\x00\x00\x00?Observation.3\xB5Clemen\xF3...

(The row keys and values are Z3-encoded binary data; the values are truncated here for readability.)
Recommendations

There are many reasons that GeoMesa can provide the best solution to your spatio-temporal database needs:

- You have Big Spatial Data sets and are reaching the performance limitations of relational database systems. Perhaps you are looking at sharding strategies and wondering if now is the time to look for a new storage solution.
- You have very high-velocity data and need high read and write speeds.
- Your analytics operate in the cloud, perhaps using Spark, and you want to enable spatial analytics.
- You are looking for a supported, open-source alternative to expensive proprietary solutions.
- You are looking for a Platform as a Service (PaaS) database where you can store Big Spatial Data.
- You want to filter data using the rich Common Query Language (CQL) defined by the OGC (an example filter follows the reference below).

Reference: https://github.com/geomesa/geomesa-tutorials/tree/master/geomesa-quickstart-hbase
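For a taste of CQL, here is a hypothetical filter combining spatial, temporal, and attribute predicates against the Who/When/Where attributes from the schema shown earlier; the bounding box, dates, and name are illustrative only:

BBOX(Where, -78.2, 37.5, -78.0, 37.7) AND When DURING 2014-07-01T00:00:00Z/2014-09-30T23:59:59Z AND Who = 'Bierce'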