Member since 02-21-2016
Posts: 30
Kudos Received: 26
Solutions: 4
My Accepted Solutions
Views | Posted |
---|---|
1778 | 08-24-2016 09:30 PM |
7661 | 08-22-2016 06:13 AM |
1934 | 08-10-2016 09:45 AM |
2023 | 07-29-2016 05:14 PM |
02-28-2017
02:25 PM
1 Kudo
Caveat: This feature has been validated manually by an HWX engineer, but we do not officially support it at the moment.

Environment:
HDP-2.5.3.0-37
Ambari-2.4.2.0-136
JDK 1.8
Kerberos enabled
Ranger enabled

Due to security limitations, we can only launch Flume agent processes from Ambari.

STEP 1: Create/modify the Flume configuration file.
Ambari -> Flume -> Configs -> flume.conf

# Flume agent config
#### Global ####
demo.sources = logtcp logudp
demo.channels = kafka_channel
demo.sinks = sink
#### Sources ####
demo.sources.logtcp.type = multiport_syslogtcp
demo.sources.logtcp.ports = 9515
demo.sources.logtcp.host = 0.0.0.0
demo.sources.logtcp.keepFields = true
demo.sources.logtcp.selector.type=replicating
demo.sources.logtcp.channels= kafka_channel
demo.sources.logudp.type = syslogudp
demo.sources.logudp.port = 9515
demo.sources.logudp.host = 0.0.0.0
demo.sources.logudp.keepFields = true
demo.sources.logudp.selector.type=replicating
demo.sources.logudp.channels = kafka_channel
#### Sinks ####
demo.sinks.sink.type = logger
demo.sinks.sink.channel = kafka_channel
#### Channels ####
demo.channels.kafka_channel.type = org.apache.flume.channel.kafka.KafkaChannel
demo.channels.kafka_channel.kafka.bootstrap.servers = node1.vxu.com:6667,node2.vxu.com:6667,node3.vxu.com:6667
demo.channels.kafka_channel.kafka.topic = flume_topic
demo.channels.kafka_channel.kafka.producer.security.protocol = SASL_PLAINTEXT
demo.channels.kafka_channel.kafka.producer.sasl.mechanism = GSSAPI
demo.channels.kafka_channel.kafka.consumer.security.protocol = SASL_PLAINTEXT
demo.channels.kafka_channel.kafka.consumer.sasl.mechanism = GSSAPI

STEP 2: Add the Kafka JAAS file.
Create flume_kafka_jaas.conf in /etc/flume/conf/:

KafkaClient {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=true
storeKey=true
serviceName="kafka"
keyTab="/etc/security/keytabs/kafka.service.keytab"
principal="kafka/node1.vxu.com@VXU.COM";
};
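Before moving on, it can be worth a quick sanity check that the keytab and principal named in the JAAS file are actually usable (the paths simply mirror the example above; run as the user that will own the Flume process):

# verify the keytab/principal used by the KafkaClient section
kinit -kt /etc/security/keytabs/kafka.service.keytab kafka/node1.vxu.com@VXU.COM
klist
kdestroy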
STEP 3: Modify the flume-env template.
Ambari -> Flume -> Configs -> Advanced flume-env -> flume-env template

...
# Environment variables can be set here.
export JAVA_HOME={{java_home}}
# Give Flume more memory and pre-allocate, enable remote monitoring via JMX
export JAVA_OPTS="-Xms100m -Xmx2000m -Dcom.sun.management.jmxremote -Dflume.monitoring.type=http -Dflume.monitoring.port=34545 -Djava.security.auth.login.config=/etc/flume/conf/flume_kafka_jaas.conf"
# Note that the Flume conf directory is always included in the classpath.
# Add flume sink to classpath
if [ -e "/usr/lib/flume/lib/ambari-metrics-flume-sink.jar" ]; then
export FLUME_CLASSPATH=$FLUME_CLASSPATH:/usr/lib/flume/lib/ambari-metrics-flume-sink.jar
fi
export HIVE_HOME={{flume_hive_home}}
export HCAT_HOME={{flume_hcat_home}}

Note: After changing the Flume configs, clear the /etc/flume/conf/demo directory and kill all previous Flume agent processes; otherwise, the new configs may not take effect.
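After restarting the agent from Ambari, a quick end-to-end smoke test is to push one message into the syslog TCP source and look for it in the agent log, since the sink type is 'logger'. This is only a sketch; the host name, log file path, and test message are assumptions and may differ on your cluster:

# send one syslog-style line to the multiport syslog TCP source on port 9515
echo "<13>Feb 28 14:00:00 testhost test: hello flume kafka channel" | nc <flume-agent-host> 9515

# the logger sink prints consumed events into the agent log (log path varies per setup)
grep "hello flume kafka channel" /var/log/flume/flume-demo.log

# the agent also exposes JSON metrics on the port set via JAVA_OPTS above
curl http://<flume-agent-host>:34545/metrics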
08-24-2016
09:30 PM
3 Kudos
@Brandon Wilson mqureshi's explanation is correct: technically, you can have an unlimited number of snapshots in HBase, but they put a lot of pressure on HDFS. It is not just the disk space they occupy; snapshots keep a huge number of HFiles referenced, which can slow down the NameNode. Suppose you have a table with 10 column families and 50k regions, with 5 HFiles per CF on average; that is 10 x 50,000 x 5 = 2.5 million HFiles for this single table. The first time you create a snapshot, all 2.5 million HFiles are referenced. When you take another snapshot the next day (after some routine compactions, of course), another 2 million or more new HFiles will probably be referenced. Remember: old HFiles are not removed until the snapshot referencing them is removed. At this rate, you end up with more than 15 million referenced HFiles after a week, which is really bad news for the NameNode.
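For illustration, the snapshot lifecycle from the hbase shell looks like this; the table and snapshot names below are made up:

$ hbase shell
hbase(main):001:0> snapshot 'big_table', 'big_table_snap_20160824'
hbase(main):002:0> list_snapshots
hbase(main):003:0> delete_snapshot 'big_table_snap_20160817'

Taking a snapshot only records HFile references rather than copying data, so it is deleting the old snapshots that allows the HFiles they pin to be archived and cleaned up.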
08-23-2016
09:34 AM
2 Kudos
Hi @Raja Ray, to answer your questions:

1. If I put data in the temporary HBase cluster during the main cluster's downtime, how will I merge that data back into the main cluster once it is up and running?
If there are only Put operations during the downtime, you can use the CopyTable tool, or Export & Bulkload, to migrate data from the temporary cluster back to the main cluster after it is up (a rough sketch is appended at the end of this reply). If there are both Put and Delete operations during the downtime, the best way is to set up HBase replication from the temporary cluster to the main cluster; replication reads the WALs (write-ahead logs) and replays both Puts and Deletes on the main cluster after it is up.

2. When I restore data from the HDFS HFile location to a new location, how will I recover the memstore data?
The memstore is the in-memory area in a RegionServer that holds incoming writes; it starts growing as new write operations arrive. If you mean the block cache of the HFiles, that is reloaded into memory as new read operations come in.

3. If I shut down and restart the HBase service, is memstore data flushed to HDFS HFiles at that time?
Yes, the memstore is force-flushed to HFiles before a RegionServer shuts down. Make sure the HDFS path '/apps/hbase/data/WALs/' is empty after HBase has been shut down, which confirms that all memstore data has been flushed into HFiles.

Thanks, Victor
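A rough CopyTable invocation for the Put-only case, run on the temporary cluster; the table name, time window, ZooKeeper quorum, and znode are placeholders you would adjust for your environment:

$ hbase org.apache.hadoop.hbase.mapreduce.CopyTable \
    --starttime=<downtime-start-ms> --endtime=<downtime-end-ms> \
    --peer.adr=main-zk1,main-zk2,main-zk3:2181:/hbase-unsecure \
    my_table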
08-22-2016
03:46 PM
Hi @Raja Ray, I checked, but an HBase rolling upgrade won't help here either, because the HMaster and RegionServers all use 'hbase.rootdir' at runtime, and changing it on only some of them would cause data inconsistencies. So my suggestion is to create a smaller temporary HBase cluster to handle the production requests and do a quick restart of the main HBase cluster. Modifying 'hbase.rootdir' really does require downtime. Hope that helps. Thanks, Victor
08-22-2016
03:31 PM
In other words, there's no 'hot switch' for this 'hbase.rootdir' parameter. If you want to change it, you have to restart hbase to make it work.
08-22-2016
03:27 PM
OK, I understand. But even if you just want to change the HDFS root directory of a running HBase cluster, you will need a restart for it to take effect. Do you mean you had already changed the root path to '/apps/hbase/data2' before starting your current HBase cluster?
08-22-2016
03:10 PM
Hi @Raja Ray,
1. Which version of HBase are you using?
2. When performing my steps, is there any specific error log you can share with me?
3. Could you elaborate on your use case?
Thanks, Victor
08-22-2016
06:13 AM
2 Kudos
Hi @Raja Ray, here are the steps to recover HFiles into another HDFS directory (a command sketch for steps 4-6 follows the list):
1. Shut down HBase while it still points to the old HDFS path.
2. Change 'hbase.rootdir' to the new path and restart HBase.
3. Create table 'CUTOFF2' so that the new table structure is created under the new HDFS path; it is empty at this point.
4. Use distcp to copy the HFiles from the old path to the new path, since the HFiles can be very large.
5. Run 'hbase hbck' on the new cluster; it should report problems with 'CUTOFF2'.
6. Run 'hbase hbck -repair' on the problematic table to finalize the recovery.
7. Done.
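A rough sketch of steps 4-6, assuming the default table layout under an old root of '/apps/hbase/data' and a new root of '/apps/hbase/data2'; the exact source/target paths depend on your HBase version's directory structure and namespace, so adjust for your cluster:

$ hadoop distcp /apps/hbase/data/data/default/CUTOFF2 /apps/hbase/data2/data/default/
$ hbase hbck
$ hbase hbck -repair CUTOFF2

The plain 'hbase hbck' run reports the inconsistencies; the '-repair' run on the single table is what re-registers the copied regions and HFiles so the recovery completes.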
08-10-2016
09:45 AM
2 Kudos
Hi @pan bocun, I guess you want to start the REST server for HBase. Use one of the following commands to start it in the foreground or background. The port is optional and defaults to 8080.
# Foreground
$ bin/hbase rest start -p <port>
# Background, logging to a file in $HBASE_LOGS_DIR
$ bin/hbase-daemon.sh start rest -p <port>
Reference: http://hbase.apache.org/book.html#_rest
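Once the server is up, a quick way to confirm it is responding (assuming the default port 8080 on the same host):

$ curl http://localhost:8080/version
$ curl http://localhost:8080/status/cluster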
07-29-2016
05:14 PM
Hi @Sunile Manjee, all the configurations of AMS HBase are in /etc/ams-hbase/conf/ on the Ambari Metrics Collector node. You can also view and change them in Ambari Metrics -> Configs -> 'Advanced ams-hbase-env' / 'Advanced ams-hbase-site'. The best way to check AMS metadata is to run the following on the Ambari Metrics Collector node:
# hbase --config /etc/ams-hbase/conf/ shell
Then use 'list', 'describe', or other shell commands to view the metadata you need.
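For example, a short session could look like the following; table names such as 'METRIC_RECORD' are typical for AMS but may differ by version:

# hbase --config /etc/ams-hbase/conf/ shell
hbase(main):001:0> list
hbase(main):002:0> describe 'METRIC_RECORD'
hbase(main):003:0> status 'summary'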