Member since: 05-24-2018 · 25 Posts · 7 Kudos Received · 0 Solutions
05-02-2019
08:58 PM
Short description: In this article I am going to create a simple producer that publishes messages (tweets) to a Kafka topic. Additionally, I am creating a simple consumer that subscribes to the Kafka topic and reads the messages.

Create the Kafka topic:

./kafka-topics.sh --create --topic 'kafka-tweets' --partitions 3 --replication-factor 3 --zookeeper <zookeeper node:zk port>

Install the necessary packages in your Python project venv:

pip install kafka-python twython
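If you prefer to create the topic from Python instead of the shell script, kafka-python also ships an admin client. The snippet below is an optional sketch of mine (not part of the original steps); the broker address is a placeholder and it assumes the brokers allow topic creation:

from kafka.admin import KafkaAdminClient, NewTopic

# Create 'kafka-tweets' with 3 partitions and replication factor 3,
# mirroring the kafka-topics.sh command above.
admin = KafkaAdminClient(bootstrap_servers="<kafka-broker>:6667")
admin.create_topics([NewTopic(name="kafka-tweets", num_partitions=3, replication_factor=3)])
admin.close()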
Producer:

import json
from twython import Twython
from kafka import KafkaProducer

def main():
    # Load credentials from json file
    with open("twitter_credentials.json", "r") as file:
        creds = json.load(file)
    # Instantiate the Twython client
    python_tweets = Twython(creds['CONSUMER_KEY'], creds['CONSUMER_SECRET'])
    # Search query
    query = {'q': 'cloudera', 'result_type': 'mixed', 'count': 100}
    # result is a python list of tweet dicts
    result = python_tweets.search(**query)['statuses']
    injest_data(result)

To get access to the Twitter API I need to use my credentials, which are stored in "twitter_credentials.json". I then use Twython to search for 100 tweets that contain the word "cloudera". The result is a Python list of tweet dicts, which is the input of injest_data(), where I connect to Kafka and then send the messages to the topic "kafka-tweets".
def injest_data(tweets):
    # Serialize each dict to a string via json and encode to bytes via utf-8
    p = KafkaProducer(bootstrap_servers='<kafka-broker>:6667', acks='all',
                      value_serializer=lambda m: json.dumps(m).encode('utf-8'),
                      batch_size=1024)
    for item in tweets:
        p.send('kafka-tweets', value=item)
    p.flush(100)
    p.close()
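Note that p.send() returns a future, so if you want per-message delivery confirmation you can block on it. The snippet below is my own optional variation of the send loop inside injest_data(), not part of the original code (it assumes "from kafka.errors import KafkaError" at the top of the module):

for item in tweets:
    future = p.send('kafka-tweets', value=item)
    try:
        # Block until the broker acknowledges the record (acks='all')
        metadata = future.get(timeout=10)
        print("delivered to %s partition %d offset %d"
              % (metadata.topic, metadata.partition, metadata.offset))
    except KafkaError as e:
        print("failed to deliver tweet: %s" % e)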
Consumer:

from kafka import KafkaConsumer

def consume():
    # Consume messages, auto-commit offsets and decode values from raw bytes to utf-8
    consumer = KafkaConsumer('kafka-tweets',
                             bootstrap_servers=['<kafka-broker>:6667'],
                             value_deserializer=lambda m: json.loads(m.decode('utf-8')),
                             consumer_timeout_ms=10000)
    for message in consumer:
        # message.key is raw bytes; message.value has already been deserialized to a dict
        print("%s:%d:%d: key=%s value=%s" % (message.topic, message.partition,
                                             message.offset, message.key,
                                             message.value))
    consumer.close()

We subscribe to the "kafka-tweets" topic and then read the messages.

Output (1 message):

tweets:0:484: key=None value={u'contributors': None, u'truncated': True, u'text': u'Urgent Requirement for an Infrastructure & Platform Engineer to work with one of our top financial clients!!\nApply\u2026 https://t.co/3pGFbOASGj', u'is_quote_status': False, u'in_reply_to_status_id': None, u'id': 1124041974875664390, u'favorite_count': 1, u'source': u'<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', u'retweeted': False, u'coordinates': None, u'entities': {u'symbols': [], u'user_mentions': [], u'hashtags': [], u'urls': [{u'url': u'https://t.co/3pGFbOASGj', u'indices': [120, 143], u'expanded_url': u'https://twitter.com/i/web/status/1124041974875664390', u'display_url': u'twitter.com/i/web/status/1\u2026'}]}, u'in_reply_to_screen_name': None, u'id_str': u'1124041974875664390', u'retweet_count': 5, u'in_reply_to_user_id': None, u'favorited': False, u'user': {u'follow_request_sent': None, u'has_extended_profile': False, u'profile_use_background_image': False, u'time_zone': None, u'id': 89827370, u'default_profile': False, u'verified': False, u'profile_text_color': u'000000', u'profile_image_url_https': u'https://pbs.twimg.com/profile_images/644912966698037249/unhNPWuL_normal.png', u'profile_sidebar_fill_color': u'000000', u'is_translator': False, u'geo_enabled': True, u'entities': {u'url': {u'urls': [{u'url': u'http://t.co/OJFHBaiwWO', u'indices': [0, 22], u'expanded_url': u'http://www.beach-head.com', u'display_url': u'beach-head.com'}]}, u'description': {u'urls': []}}, u'followers_count': 82, u'protected': False, u'id_str': u'89827370', u'default_profile_image': False, u'listed_count': 8, u'lang': u'en', u'utc_offset': None, u'statuses_count': 2508, u'description': u'Beachhead is a Premier IT recruiting firm based in Toronto, Canada. Follow for exciting opportunities in Financial, Retail and Telecommunication sector.\U0001f600', u'friends_count': 59, u'profile_link_color': u'0570B3', u'profile_image_url': u'http://pbs.twimg.com/profile_images/644912966698037249/unhNPWuL_normal.png', u'notifications': None, u'profile_background_image_url_https': u'https://abs.twimg.com/images/themes/theme1/bg.png', u'profile_background_color': u'000000', u'profile_banner_url': u'https://pbs.twimg.com/profile_banners/89827370/1442594156', u'profile_background_image_url': u'http://abs.twimg.com/images/themes/theme1/bg.png', u'name': u'BeachHead', u'is_translation_enabled': False, u'profile_background_tile': False, u'favourites_count': 19, u'screen_name': u'BeachHeadINC', u'url': u'http://t.co/OJFHBaiwWO', u'created_at': u'Sat Nov 14 00:02:15 +0000 2009', u'contributors_enabled': False, u'location': u'Toronto, Canada', u'profile_sidebar_border_color': u'000000', u'translator_type': u'none', u'following': None}, u'geo': None, u'in_reply_to_user_id_str': None, u'possibly_sensitive': False, u'lang': u'en', u'created_at': u'Thu May 02 20:04:25 +0000 2019', u'in_reply_to_status_id_str': None, u'place': None, u'metadata': {u'iso_language_code': u'en', u'result_type': u'recent'}}

Code available in: https://github.com/PedroAndrade89/kafka_twitter
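For completeness, a minimal entry point that ties the two snippets together could look like this (my addition, assuming both functions live in the same module):

if __name__ == "__main__":
    main()      # publish up to 100 tweets to the 'kafka-tweets' topic
    consume()   # then read them back and print topic:partition:offset info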
04-16-2019
03:13 PM
2 Kudos
Steps on how to set up YARN to run Docker containers can be found in Part 1 of this series. In this article I will show how to run the Hive components (HiveServer2, Metastore) as Docker containers in YARN. The Metastore will use a MySQL 8 database, also running as a Docker container on a local host.

Pre-requisites:

1. Pull the mysql-server image from Docker Hub, run the image as a Docker container, and create the hive database and permissions for the hive user:

docker pull mysql/mysql-server
docker run -d -p 3306:3306 -e MYSQL_ROOT_PASSWORD=admin --restart=always --name mysqld mysql/mysql-server
docker exec -it mysqld bash
bash-4.2# mysql -u root --password=admin
mysql> CREATE DATABASE hive;
mysql> CREATE USER 'hive' IDENTIFIED BY 'hive';
mysql> GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%' WITH GRANT OPTION;
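Optionally, you can sanity-check the new hive account from any host that can reach port 3306. The snippet below is my own quick check, not part of the original steps; it assumes the PyMySQL package (pip install pymysql) and uses a placeholder hostname:

import pymysql

# Connect as the 'hive' user created above and confirm the 'hive' database is reachable.
conn = pymysql.connect(host="<mysql-server docker host>", port=3306,
                       user="hive", password="hive", database="hive")
with conn.cursor() as cur:
    cur.execute("SELECT DATABASE(), VERSION()")
    print(cur.fetchone())   # e.g. ('hive', '8.0.x')
conn.close()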
2. Create the user 'hive' and assign it to the 'hadoop' group:

useradd hive
usermod -aG hadoop hive

Dockerize Hive:

1. Create a yum repo file "hdp.repo" that contains the HDP-3.1.0.0 and HDP-UTILS-1.1.0.22 repositories:

[HDP-3.1.0.0]
name=HDP Version - HDP-3.1.0.0
baseurl=http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.0.0
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.0.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

[HDP-UTILS-1.1.0.22]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.22
baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.0.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

2. Create the Dockerfile:

FROM centos:7
ENV JAVA_HOME /usr/lib/jvm/jre-1.8.0-openjdk
COPY hdp.repo /etc/yum.repos.d/
COPY mysql-connector-java-8.0.14-1.el7.noarch.rpm /root/
RUN yum updateinfo \
&& yum install -y sudo java-1.8.0-openjdk-devel hadoop-yarn hadoop-mapreduce hive hive-metastore tez \
&& yum clean all
RUN yum localinstall -y /root/mysql-connector-java-8.0.14-1.el7.noarch.rpm
RUN cp /usr/share/java/mysql-connector-java-8.0.14.jar /usr/hdp/3.1.0.0-78/hive/lib/mysql-connector-java.jar

Note: For the Metastore to connect to our MySQL database we need a JDBC connector. I downloaded the MySQL Connector/J RPM (mysql-connector-java-8.0.14-1.el7.noarch.rpm) and copied it to the same directory as my Dockerfile, so it is installed into the image.

3. Build the image:

docker build -t hive .

4. Tag the image and push it to the local Docker registry:

Tag the image as "<docker registry server>:5000/hive_local". This creates an additional tag for the existing image. When the first part of the tag is a hostname and port, Docker interprets it as the location of a registry.

docker tag hive <docker registry server>:5000/hive_local
docker push <docker registry server>:5000/hive_local

Now that our hive image is created, we will create a YARN Service configuration file (Yarnfile) with all the details of our service.

Deployment:

1. Copy core-site.xml, hdfs-site.xml and yarn-site.xml to the hive user's directory in HDFS:

su - hive
hdfs dfs -copyFromLocal /etc/hadoop/conf/core-site.xml .
hdfs dfs -copyFromLocal /etc/hadoop/conf/hdfs-site.xml .
hdfs dfs -copyFromLocal /etc/hadoop/conf/yarn-site.xml .

2. Create the Yarnfile (hive.json):

{
"name": "hive",
"lifetime": "-1",
"version": "3.1.0.3.1.0.0",
"artifact": {
"id": "<docker registry server>:5000/hive2",
"type": "DOCKER"
},
"configuration": {
"env": {
"HIVE_LOG_DIR": "var/log/hive",
"YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS": "/etc/passwd:/etc/passwd:ro,/etc/group:/etc/group:ro",
"HADOOP_HOME": "/usr/hdp/3.1.0.0-78/hadoop"
},
"properties": {
"docker.network": "host"
},
"files": [
{
"type": "TEMPLATE",
"dest_file": "/etc/hadoop/conf/core-site.xml",
"src_file": "core-site.xml"
},
{
"type": "TEMPLATE",
"dest_file": "/etc/hadoop/conf/yarn-site.xml",
"src_file": "yarn-site.xml"
},
{
"type": "TEMPLATE",
"dest_file": "/etc/hadoop/conf/hdfs-site.xml",
"src_file": "hdfs-site.xml"
},
{
"type": "XML",
"dest_file": "/etc/hive/conf/hive-site.xml",
"properties": {
"hive.zookeeper.quorum": "${CLUSTER_ZK_QUORUM}",
"hive.zookeeper.namespace": "hiveserver2",
"hive.server2.zookeeper.publish.configs": "true",
"hive.server2.support.dynamic.service.discovery": "true",
"hive.support.concurrency": "true",
"hive.metastore.warehouse.dir": "/user/${USER}/warehouse",
"javax.jdo.option.ConnectionUserName": "hive",
"javax.jdo.option.ConnectionPassword": "hive",
"hive.server2.enable.doAs": "false",
"hive.metastore.schema.verification": "true",
"hive.metastore.db.type": "MYSQL",
"javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
"javax.jdo.option.ConnectionURL": "jdbc:mysql://<mysql-server docker host>:3306/hive?createDatabaseIfNotExist=true",
"hive.metastore.event.db.notification.api.auth" : "false",
"hive.metastore.uris": "thrift://hivemetastore-0.${SERVICE_NAME}.${USER}.${DOMAIN}:9083"
}
}
]
},
"components": [
{
"name": "hiveserver2",
"number_of_containers": 1,
"launch_command": "sleep 25; /usr/hdp/current/hive-server2/bin/hiveserver2",
"resource": {
"cpus": 1,
"memory": "1024"
},
"configuration": {
"files": [
{
"type": "XML",
"dest_file": "/etc/hive/conf/hive-site.xml",
"properties": {
"hive.server2.thrift.bind.host": "${COMPONENT_INSTANCE_NAME}.${SERVICE_NAME}.${USER}.${DOMAIN}",
"hive.server2.thrift.port": "10000",
"hive.server2.thrift.http.port": "10001"
}
}
],
"env": {
"HADOOP_OPTS": "-Xmx1024m -Xms512m"
}
}
},
{
"name": "hivemetastore",
"number_of_containers": 1,
"launch_command": "sleep 5;/usr/hdp/current/hive-metastore/bin/schematool -initSchema -dbType mysql;/usr/hdp/current/hive-metastore/bin/hive --service metastore",
"resource": {
"cpus": 1,
"memory": "1024"
},
"configuration": {
"files": [
{
"type": "XML",
"dest_file": "/etc/hive/conf/hive-site.xml",
"properties": {
"hive.metastore.uris": "thrift://${COMPONENT_INSTANCE_NAME}.${SERVICE_NAME}.${USER}.${DOMAIN}:9083"
}
}
],
"env": {
"HADOOP_OPTS": "-Xmx1024m -Xms512m"
}
}
}
]
}

3. Deploy the application using the YARN Services API:

yarn app -launch hive hive.json

Test access to Hive:

The Registry DNS service that runs on the cluster listens for inbound DNS requests. Those requests are standard DNS requests from users or other DNS servers (for example, DNS servers that have the RegistryDNS service configured as a forwarder). With this setup, we can connect via beeline to the "${COMPONENT_INSTANCE_NAME}.${SERVICE_NAME}.${USER}.${DOMAIN}" hostname as long as our client is using the corporate DNS server. More details are in the RegistryDNS documentation listed in the references below.

Because in this test we don't have this configured, we need to manually find where the hiveserver2 Docker container is running with:

curl -X GET 'http://<RM-host>:8088/app/v1/services/hive?user.name=hive' | python -m json.tool

In the containers information for component hiveserver2-0 we will find:

"containers": [
{
"bare_host": "<hiveserver2-0 host hostname>", On the host were the container is running connect to hive via beeline: su - hive
beeline -u "jdbc:hive2://<hostname -I>:10000/default"

References:
https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/yarn-service/Configurations.html
http://hadoop.apache.org/docs/r3.1.0/hadoop-yarn/hadoop-yarn-site/yarn-service/RegistryDNS.html
https://hadoop.apache.org/docs/r3.1.0/hadoop-yarn/hadoop-yarn-site/yarn-service/YarnServiceAPI.html

Files are also available in the following GitHub repo: https://github.com/PedroAndrade89/docker_hdp_services.git
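If you prefer to script that lookup instead of reading the curl output, the sketch below (my addition, using the requests library) walks the same REST response and returns the bare_host of each hiveserver2 container; the RM host and user are placeholders:

import requests

def find_component_hosts(rm_host, service, component, user):
    # Query the YARN Services REST API for the service definition
    url = "http://%s:8088/app/v1/services/%s" % (rm_host, service)
    resp = requests.get(url, params={"user.name": user})
    resp.raise_for_status()
    # Collect the bare_host of every container belonging to the component
    hosts = []
    for comp in resp.json().get("components", []):
        if comp.get("name") == component:
            for container in comp.get("containers", []):
                hosts.append(container.get("bare_host"))
    return hosts

# Example: print(find_component_hosts("<RM-host>", "hive", "hiveserver2", "hive"))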
03-22-2019
03:22 PM
3 Kudos
Pre-requisites:

1. Install Docker on all NodeManager hosts and configure it:

It is recommended to install the version of Docker that is provided by your operating system vendor. The Docker package has been known by several names: docker-engine, docker, and docker-ce.

yum install docker

If you have issues installing Docker, the following document can help: https://docs.docker.com/install/

Edit '/etc/docker/daemon.json' and add the following options:

{
"live-restore" : true,
"debug" : true,
"dns": ["<YARN registry dns ip addr>"]
}

If not using HTTPS, configure each of the cluster hosts to skip HTTPS checks by adding the following line to '/etc/docker/daemon.json':

"insecure-registries": ["<docker registry server>:5000"]

2. Create a private local Docker registry (optional):

2.1 Designate a server in the cluster for use by the Docker registry. Minimal resources are required, but sufficient disk space is needed to store the images and metadata. Docker must be installed and running.

2.2 Start the registry:

docker run -d -p 5000:5000 --restart=always --name registry registry:2

2.3 (Optional) By default, data will only be persisted within the container. If you would like to persist the data on the host, you can customize the bind mounts using the -v option:

docker run -d -p 5000:5000 --restart=always -v /host_registry_path:/var/lib/registry --name registry registry:2

3. Configure YARN to run Docker containers:
https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/data-operating-system/content/configure_yarn_for_running_docker_containers.html

Our cluster is now ready to run Dockerized applications.

Dockerize HBase:

1. Create a yum repo file "hdp.repo" that contains the HDP-3.1.0.0 and HDP-UTILS-1.1.0.22 repositories:

[HDP-3.1.0.0]
name=HDP Version - HDP-3.1.0.0
baseurl=http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.0.0
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.0.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1
[HDP-UTILS-1.1.0.22]
name=HDP-UTILS Version - HDP-UTILS-1.1.0.22
baseurl=http://public-repo-1.hortonworks.com/HDP-UTILS-1.1.0.22/repos/centos7
gpgcheck=1
gpgkey=http://public-repo-1.hortonworks.com/HDP/centos7/3.x/updates/3.1.0.0/RPM-GPG-KEY/RPM-GPG-KEY-Jenkins
enabled=1
priority=1

2. Create the Dockerfile:

FROM centos:7
ENV JAVA_HOME /usr/lib/jvm/jre-1.8.0-openjdk
COPY hdp.repo /etc/yum.repos.d/
RUN yum updateinfo && yum install -y sudo java-1.8.0-openjdk-devel hbase phoenix hadoop-yarn hadoop-mapreduce && yum clean all

3. Build the image:

docker build -t hbase .

4. Tag the image and push it to the local Docker registry:

Tag the image as "<docker registry server>:5000/hbase_local". This creates an additional tag for the existing image. When the first part of the tag is a hostname and port, Docker interprets it as the location of a registry.

docker tag hbase <docker registry server>:5000/hbase_local
docker push <docker registry server>:5000/hbase_local
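As an optional check (not in the original steps), you can confirm the push landed in the private registry by querying the Docker Registry v2 HTTP API, for example with Python's requests library:

import requests

registry = "http://<docker registry server>:5000"
# List the repositories stored in the registry and the tags for hbase_local
print(requests.get(registry + "/v2/_catalog").json())               # {"repositories": ["hbase_local", ...]}
print(requests.get(registry + "/v2/hbase_local/tags/list").json())  # {"name": "hbase_local", "tags": ["latest"]}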
Now that our hbase image is created, we will create a YARN Service configuration file (Yarnfile) with all the details of our service.

Deployment:

1. Copy core-site.xml and hdfs-site.xml to the user's directory in HDFS:

su - ambari-qa
hdfs dfs -copyFromLocal /etc/hadoop/conf/core-site.xml .
hdfs dfs -copyFromLocal /etc/hadoop/conf/hdfs-site.xml .

2. Create the Yarnfile (hbase.json):

{
"name": "hbase",
"lifetime": "10800",
"version": "2.0.2.3.1.0.0",
"artifact": {
"id": "<docker registry server>:5000/hbase_local",
"type": "DOCKER"
},
"configuration": {
"env": {
"HBASE_LOG_DIR": "var/log/hbase",
"HADOOP_HOME": "/usr/hdp/3.1.0.0-78/hadoop"
},
"properties": {
"docker.network": "host"
},
"files": [
{
"type": "TEMPLATE",
"dest_file": "/etc/hadoop/conf/core-site.xml",
"src_file": "core-site.xml"
},
{
"type": "TEMPLATE",
"dest_file": "/etc/hadoop/conf/hdfs-site.xml",
"src_file": "hdfs-site.xml"
},
{
"type": "XML",
"dest_file": "/etc/hbase/conf/hbase-site.xml",
"properties": {
"hbase.cluster.distributed": "true",
"hbase.zookeeper.quorum": "${CLUSTER_ZK_QUORUM}",
"hbase.rootdir": "${SERVICE_HDFS_DIR}/hbase",
"zookeeper.znode.parent": "${SERVICE_ZK_PATH}",
"hbase.master.hostname": "hbasemaster-0.${SERVICE_NAME}.${USER}.${DOMAIN}",
"hbase.master.info.port": "16010"
}
}
]
},
"components": [
{
"name": "hbasemaster",
"number_of_containers": 1,
"launch_command": "sleep 15; /usr/hdp/current/hbase-master/bin/hbase master start",
"resource": {
"cpus": 1,
"memory": "1024"
},
"readiness_check": {
"type": "HTTP",
"properties": {
"url": "http://${THIS_HOST}:16010/master-status"
}
},
"configuration": {
"env": {
"HBASE_MASTER_OPTS": "-Xmx1024m -Xms512m"
}
}
},
{
"name": "regionserver",
"number_of_containers": 3,
"launch_command": "sleep 15; /usr/hdp/current/hbase-regionserver/bin/hbase regionserver start",
"resource": {
"cpus": 1,
"memory": "512"
},
"configuration": {
"files": [
{
"type": "XML",
"dest_file": "/etc/hbase/conf/hbase-site.xml",
"properties": {
"hbase.cluster.distributed": "true",
"hbase.zookeeper.quorum": "${CLUSTER_ZK_QUORUM}",
"hbase.rootdir": "${SERVICE_HDFS_DIR}/hbase",
"zookeeper.znode.parent": "${SERVICE_ZK_PATH}",
"hbase.master.hostname": "hbasemaster-0.${SERVICE_NAME}.${USER}.${DOMAIN}",
"hbase.master.info.port": "16010",
"hbase.regionserver.info.port": "16020",
"hbase.regionserver.port": "16030",
"hbase.regionserver.hostname": "${COMPONENT_INSTANCE_NAME}.${SERVICE_NAME}.${USER}.${DOMAIN}"
}
}
],
"env": {
"HBASE_REGIONSERVER_OPTS": "-XX:CMSInitiatingOccupancyFraction=70 -Xmx512m -Xms256m"
}
}
},
{
"name": "hbaseclient",
"number_of_containers": 1,
"launch_command": "sleep infinity",
"resource": {
"cpus": 1,
"memory": "512"
}
}
],
"quicklinks": {
"HBase Master Status UI": "http://hbasemaster-0.${SERVICE_NAME}.${USER}.${DOMAIN}:16010/master-status"
}
}

3. Deploy the application using the YARN Services API:

yarn app -launch hbase hbase.json

4. Go to "Services" in the YARN RM UI and select hbase: we have 1 hbasemaster, 3 regionserver and 1 hbaseclient components running.

5. We can also use the YARN Service REST API to get the state of our hbase service:

curl -X GET 'http://<RM host>:8088/app/v1/services/hbase?user.name=ambari-qa' | python -m json.tool

{
"artifact": {
"id": "<docker registry server>:5000/hbase_local",
"type": "DOCKER"
},
"components": [
{
"artifact": {
"id": "<docker registry server>:5000/hbase_local",
"type": "DOCKER"
},
"configuration": {
"env": {
"HADOOP_HOME": "/usr/hdp/3.1.0.0-78/hadoop",
"HBASE_LOG_DIR": "var/log/hbase",
"HBASE_MASTER_OPTS": "-Xmx1024m -Xms512m"
},
"files": [
{
"dest_file": "/etc/hbase/conf/hbase-site.xml",
"properties": {
"hbase.cluster.distributed": "true",
"hbase.master.hostname": "hbasemaster-0.${SERVICE_NAME}.${USER}.${DOMAIN}",
"hbase.master.info.port": "16010",
"hbase.rootdir": "${SERVICE_HDFS_DIR}/hbase",
"hbase.zookeeper.quorum": "${CLUSTER_ZK_QUORUM}",
"zookeeper.znode.parent": "${SERVICE_ZK_PATH}"
},
"type": "XML"
},
{
"dest_file": "/etc/hadoop/conf/core-site.xml",
"properties": {},
"src_file": "core-site.xml",
"type": "TEMPLATE"
},
{
"dest_file": "/etc/hadoop/conf/hdfs-site.xml",
"properties": {},
"src_file": "hdfs-site.xml",
"type": "TEMPLATE"
}
],
"properties": {
"docker.network": "host"
}
},
"containers": [
{
"bare_host": "pandrade-2.openstacklocal.com",
"component_instance_name": "hbasemaster-0",
"hostname": "hbasemaster-0.hbase.ambari-qa.OPENSTACKLOCAL.COM",
"id": "container_e21_1553187523351_0006_01_000002",
"ip": "172.26.81.14",
"launch_time": 1553266732354,
"state": "READY"
}
],
"dependencies": [],
"launch_command": "sleep 15; /usr/hdp/current/hbase-master/bin/hbase master start",
"name": "hbasemaster",
"number_of_containers": 1,
"quicklinks": [],
"readiness_check": {
"properties": {
"url": "http://${THIS_HOST}:16010/master-status"
},
"type": "HTTP"
},
"resource": {
"additional": {},
"cpus": 1,
"memory": "1024"
},
"restart_policy": "ALWAYS",
"run_privileged_container": false,
"state": "STABLE"
},
{
"artifact": {
"id": "pandrade-4:5000/hbase_local",
"type": "DOCKER"
},
"configuration": {
"env": {
"HADOOP_HOME": "/usr/hdp/3.1.0.0-78/hadoop",
"HBASE_LOG_DIR": "var/log/hbase",
"HBASE_REGIONSERVER_OPTS": "-XX:CMSInitiatingOccupancyFraction=70 -Xmx512m -Xms256m"
},
"files": [
{
"dest_file": "/etc/hbase/conf/hbase-site.xml",
"properties": {
"hbase.cluster.distributed": "true",
"hbase.master.hostname": "hbasemaster-0.${SERVICE_NAME}.${USER}.${DOMAIN}",
"hbase.master.info.port": "16010",
"hbase.regionserver.hostname": "${COMPONENT_INSTANCE_NAME}.${SERVICE_NAME}.${USER}.${DOMAIN}",
"hbase.regionserver.info.port": "16020",
"hbase.regionserver.port": "16030",
"hbase.rootdir": "${SERVICE_HDFS_DIR}/hbase",
"hbase.zookeeper.quorum": "${CLUSTER_ZK_QUORUM}",
"zookeeper.znode.parent": "${SERVICE_ZK_PATH}"
},
"type": "XML"
},
{
"dest_file": "/etc/hadoop/conf/core-site.xml",
"properties": {},
"src_file": "core-site.xml",
"type": "TEMPLATE"
},
{
"dest_file": "/etc/hadoop/conf/hdfs-site.xml",
"properties": {},
"src_file": "hdfs-site.xml",
"type": "TEMPLATE"
}
],
"properties": {
"docker.network": "host"
}
},
"containers": [
{
"bare_host": "pandrade-2.openstacklocal.com",
"component_instance_name": "regionserver-2",
"hostname": "regionserver-2.hbase.ambari-qa.OPENSTACKLOCAL.COM",
"id": "container_e21_1553187523351_0006_01_000005",
"ip": "172.26.81.14",
"launch_time": 1553266732359,
"state": "READY"
},
{
"bare_host": "pandrade-4.openstacklocal.com",
"component_instance_name": "regionserver-0",
"hostname": "regionserver-0.hbase.ambari-qa.OPENSTACKLOCAL.COM",
"id": "container_e21_1553187523351_0006_01_000003",
"ip": "172.26.81.15",
"launch_time": 1553266732358,
"state": "READY"
},
{
"bare_host": "pandrade-3.openstacklocal.com",
"component_instance_name": "regionserver-1",
"hostname": "regionserver-1.hbase.ambari-qa.OPENSTACKLOCAL.COM",
"id": "container_e21_1553187523351_0006_01_000004",
"ip": "172.26.81.13",
"launch_time": 1553266732358,
"state": "READY"
}
],
"dependencies": [],
"launch_command": "sleep 15; /usr/hdp/current/hbase-regionserver/bin/hbase regionserver start",
"name": "regionserver",
"number_of_containers": 3,
"quicklinks": [],
"resource": {
"additional": {},
"cpus": 1,
"memory": "512"
},
"restart_policy": "ALWAYS",
"run_privileged_container": false,
"state": "STABLE"
},
{
"artifact": {
"id": "pandrade-4:5000/hbase_local",
"type": "DOCKER"
},
"configuration": {
"env": {
"HADOOP_HOME": "/usr/hdp/3.1.0.0-78/hadoop",
"HBASE_LOG_DIR": "var/log/hbase"
},
"files": [
{
"dest_file": "/etc/hbase/conf/hbase-site.xml",
"properties": {
"hbase.cluster.distributed": "true",
"hbase.master.hostname": "hbasemaster-0.${SERVICE_NAME}.${USER}.${DOMAIN}",
"hbase.master.info.port": "16010",
"hbase.rootdir": "${SERVICE_HDFS_DIR}/hbase",
"hbase.zookeeper.quorum": "${CLUSTER_ZK_QUORUM}",
"zookeeper.znode.parent": "${SERVICE_ZK_PATH}"
},
"type": "XML"
},
{
"dest_file": "/etc/hadoop/conf/core-site.xml",
"properties": {},
"src_file": "core-site.xml",
"type": "TEMPLATE"
},
{
"dest_file": "/etc/hadoop/conf/hdfs-site.xml",
"properties": {},
"src_file": "hdfs-site.xml",
"type": "TEMPLATE"
}
],
"properties": {
"docker.network": "host"
}
},
"containers": [
{
"bare_host": "pandrade-4.openstacklocal.com",
"component_instance_name": "hbaseclient-0",
"hostname": "hbaseclient-0.hbase.ambari-qa.OPENSTACKLOCAL.COM",
"id": "container_e21_1553187523351_0006_01_000006",
"ip": "172.26.81.15",
"launch_time": 1553266732370,
"state": "READY"
}
],
"dependencies": [],
"launch_command": "sleep infinity",
"name": "hbaseclient",
"number_of_containers": 1,
"quicklinks": [],
"resource": {
"additional": {},
"cpus": 1,
"memory": "512"
},
"restart_policy": "ALWAYS",
"run_privileged_container": false,
"state": "STABLE"
}
],
"configuration": {
"env": {
"HADOOP_HOME": "/usr/hdp/3.1.0.0-78/hadoop",
"HBASE_LOG_DIR": "var/log/hbase"
},
"files": [
{
"dest_file": "/etc/hadoop/conf/core-site.xml",
"properties": {},
"src_file": "core-site.xml",
"type": "TEMPLATE"
},
{
"dest_file": "/etc/hadoop/conf/hdfs-site.xml",
"properties": {},
"src_file": "hdfs-site.xml",
"type": "TEMPLATE"
},
{
"dest_file": "/etc/hbase/conf/hbase-site.xml",
"properties": {
"hbase.cluster.distributed": "true",
"hbase.master.hostname": "hbasemaster-0.${SERVICE_NAME}.${USER}.${DOMAIN}",
"hbase.master.info.port": "16010",
"hbase.rootdir": "${SERVICE_HDFS_DIR}/hbase",
"hbase.zookeeper.quorum": "${CLUSTER_ZK_QUORUM}",
"zookeeper.znode.parent": "${SERVICE_ZK_PATH}"
},
"type": "XML"
}
],
"properties": {
"docker.network": "host"
}
},
"id": "application_1553187523351_0006",
"kerberos_principal": {},
"lifetime": 10200,
"name": "hbase",
"quicklinks": {
"HBase Master Status UI": "http://hbasemaster-0.hbase.ambari-qa.OPENSTACKLOCAL.COM:16010/master-status"
},
"state": "STABLE",
"version": "2.0.2.3.1.0.0"
}

Our HBase service is stable and all Docker containers are in READY state.

HBase Master UI: find the host on which the hbasemaster container is running and access the UI at <hbase master container host>:16010/master-status

References:
https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/yarn-service/Configurations.html
http://hadoop.apache.org/docs/r3.1.0/hadoop-yarn/hadoop-yarn-site/yarn-service/RegistryDNS.html
https://hadoop.apache.org/docs/r3.1.0/hadoop-yarn/hadoop-yarn-site/yarn-service/YarnServiceAPI.html

Files are also available in the following GitHub repo: https://github.com/PedroAndrade89/docker_hdp_services.git
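The readiness_check defined for the hbasemaster component simply polls that same Master status page over HTTP. As an illustration only (my addition, not part of the original article), the sketch below reproduces the check with the requests library; the hostname is a placeholder you would replace with the RegistryDNS name or the container host:

import requests

def master_is_ready(master_host, port=16010, timeout=5):
    # A 200 response from /master-status means the HBase Master UI is up,
    # which is the same condition the YARN HTTP readiness check relies on.
    url = "http://%s:%d/master-status" % (master_host, port)
    try:
        return requests.get(url, timeout=timeout).status_code == 200
    except requests.RequestException:
        return False

# Example: print(master_is_ready("<hbase master container host>"))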
08-20-2018
07:58 PM
@Sankar T
The best place to start is the ambari-server log. It can usually be found at: /var/log/ambari-server/ambari-server.log

Look at this log (using cat, less, more, etc.) around the time your ambari-server goes down to find more details.
08-20-2018
12:57 PM
@Danninger Each block of data has at least 3 replicas across the other nodes (depending on your configuration). In your particular case, when you brought the cluster back up the Namenode would be expecting x blocks of data to be on the node that was shut down. Regardless of whether the ambari-agent was running or not, when you started the node up it sent a block report to the Namenode. If the Namenode "sees" in that block report that original blocks of data (from before you shut down the cluster) are missing, it will simply replicate those blocks from a healthy datanode to other nodes. So in this example a block needs to have 3 replicas across the cluster: when receiving the block reports from all datanodes, if the Namenode sees that certain blocks do not comply with that rule, it will replicate those blocks to other healthy nodes automatically.

You are safe to start the nodes back up, as HDFS is prepared to deal with this kind of situation. One thing to check is whether you set up local directories on the failed mountpoint for the services that were running on the node. Make sure that is not the case; if it is not, you are OK to start the services and the ambari-agent again.
08-18-2018
10:20 PM
@ranjith ranjith /boot/efi/hadoop/hdfs/namenode is not a valid Namenode directory.
To correct this go to Ambari > HDFS > Config
Then change the value of the namenode directory from:
/boot/efi/hadoop/hdfs/namenode
to: /hadoop/hdfs/namenode

Please let me know how it goes.
07-25-2018
09:38 AM
@chen fan I believe the issue lies in the repo files. Are you using an internal repo server, or do you have a connection to the internet? Our repo points to the external mysql-community libs for the download, so you may need to pre-install the correct version of MySQL.

Can you show me the contents of /etc/yum.repos.d/* ?
07-25-2018
08:30 AM
@chen fan
What is the output of the following commands?

hdp-select | grep timelineserver
ls -lh /usr/hdp/current/hadoop-yarn-timelineserver
07-24-2018
07:03 PM
@john liverpool Can you check port 8080 of the host where the ambari-server is running?

telnet <ambariHost> 8080

Can you connect to it? If not, the port is probably being blocked by a firewall. If you can, disable it:

service iptables stop

In the browser, are you using the host IP address or the FQDN to access the GUI? Try both and see if you can access the GUI. If you can only access it with the IP address, you can add the host entry to /etc/hosts:

<ambari-server host ip> <ambari-server fqdn> <hostname>

Let me know if it helps solve your issue.
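As a side note (my addition, not from the original answer): if telnet is not installed on your client, you can run the same port check with a few lines of Python; the host is a placeholder:

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5)
# connect_ex returns 0 when the TCP connection to port 8080 succeeds
print("open" if sock.connect_ex(("<ambariHost>", 8080)) == 0 else "closed or blocked")
sock.close()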
07-24-2018
05:58 PM
@Rambabu Chamakuri It seems that the permissions on the VERSION file are wrong, see below:

java.io.FileNotFoundException: /mnt/dn/sdl/datanode/current/VERSION (Permission denied)

Check the permissions on this VERSION file:

ls -lh /mnt/dn/sdl/datanode/current/VERSION

The file should be owned by "hdfs:hdfs" and permissions set to 644. If they are not, change them accordingly:

chown hdfs:hdfs /mnt/dn/sdl/datanode/current/VERSION
chmod 644 /mnt/dn/sdl/datanode/current/VERSION

Then restart the Datanode. Let me know if this helps solve your issue.