Member since: 07-19-2018
Posts: 613
Kudos Received: 101
Solutions: 117
12-09-2019
04:34 AM
@shubham268 Please accept the solution to mark this question as answered. If you have any other questions, ask a new one and we will respond accordingly. Thanks for the kudos, too!
12-05-2019
04:49 PM
@emanueol as of this writing, I believe that XML attachments aren't allowed, but you can cut and paste the XML source code into the body of a post using the Insert/edit code sample feature of the editor.
12-04-2019
12:04 PM
Are HDP/HDF going to get forward movement on RHEL 7.7 support?
11-27-2019
12:00 AM
I solved the issue by increasing the request expiration, which was set to 1 minute.
11-25-2019
11:45 PM
Since getting started with Hadoop & ELK, I have taken the original Hortonworks 5.x Elasticsearch and Kibana Management Pack for HDP 2.6.5 and upgraded it to ELK 6.3.2. During this project I also added Logstash, Filebeat, and Metricbeat. Next, I upgraded the Management Pack to work with HDP 3.x and HDF 3.x.
In this article I am going to share all of the steps required to upgrade the ELK 6.3.2 Management Pack to the latest and greatest version, ELK 7.4.2. You can find all of the files in my GitHub repo: DFHZ ELK Mpack.
First, I create a known-working test cluster, making sure the 3.4.0 management pack is operational and my test environment is suitable.
Commands for a single node HDF test cluster:
# history
1 hostnamectl set-hostname elk.cloudera.com
2 yum install nano wget -y
3 nano /etc/sysconfig/selinux
4 nano /etc/cloud/cloud.cfg
5 nano /etc/hosts
6 reboot
9 wget -O /etc/yum.repos.d/ambari.repo http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari.repo && yum --enablerepo=extras install epel-release -y && yum install nano java java-devel ambari-server ambari-agent -y && ambari-server setup -s && ambari-server install-mpack --mpack=http://public-repo-1.hortonworks.com/HDF/amazonlinux2/3.x/updates/3.4.1.1/tars/hdf_ambari_mp/hdf-ambari-mpack-3.4.1.1-4.tar.gz && ambari-server start && ambari-agent start
10 ssh-keygen
11 cat ~/.ssh/id_rsa
12 cat ~/.ssh/id_rsa.pub
13 nano ~/.ssh/authorized_keys
14 ssh root@elk.cloudera.com
15 ambari-server install-mpack --mpack=https://github.com/steven-dfheinz/dfhz_elk_mpack/raw/master/elasticsearch_mpack-3.4.0.0-0.tar.gz --verbose
29 ambari-server restart
35 python /var/lib/ambari-server/resources/scripts/configs.py -u admin -p admin -n ELK -l elk.cloudera.com -t 8080 -a set -c cluster-env -k ignore_groupsusers_create -v true
** Note: before running #35, complete the Cluster Install Wizard with the base components (ZooKeeper & Ambari Metrics). Once the base cluster is done, run the Python command (#35) and then install ELK via the Ambari Add Service wizard. Without the Python command, the ELK stack install will fail on user/group creation issues.
Second, I download and unpack the management pack above. I make a new version and begin to edit the file structure for "7.4.2" (see the sketch after this list). My edits were as follows:
Rename and version the archive folder elasticsearch_mpack-3.4.0.1-0
Change all component folders from 6.3.2 to 7.4.2
Update mpack.json from 6.3.2 to 7.4.2
Update Component File Set (Find/Replace) all 6.3.2 to 7.4.2
Update addon-services/ELASTICSEARCH/7.4.2/repos/repoinfo.xml to 7.x
Create a new archive elasticsearch_mpack-3.4.0.1-0.tar.gz
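A rough sketch of those edits as shell commands, assuming the 3.4.0.0-0 archive has already been downloaded into the working directory (the exact file layout may differ slightly from this sketch):
# unpack and rename the archive folder to the new mpack version
tar -xzf elasticsearch_mpack-3.4.0.0-0.tar.gz
mv elasticsearch_mpack-3.4.0.0-0 elasticsearch_mpack-3.4.0.1-0
cd elasticsearch_mpack-3.4.0.1-0
# rename every versioned component folder from 6.3.2 to 7.4.2
find . -depth -type d -name "6.3.2" -execdir mv {} 7.4.2 \;
# find/replace the version string in mpack.json and the rest of the fileset
grep -rl "6.3.2" . | xargs sed -i "s/6\.3\.2/7.4.2/g"
# edit addon-services/ELASTICSEARCH/7.4.2/repos/repoinfo.xml by hand to point at the 7.x repos
# re-pack the new archive
cd .. && tar -czf elasticsearch_mpack-3.4.0.1-0.tar.gz elasticsearch_mpack-3.4.0.1-0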
At this stage I just want to make sure that the stack versions are coming over correctly. I update the GitHub repo, uninstall the management pack, restart Ambari, install the new management pack, restart Ambari again, and open Add Service in Ambari. I can now see my ELK components reporting version 7.4.2. I complete the install to make sure everything installs, but I do not expect the services to run yet; there is still quite a bit to do to each component's .yml config files.
Third, I start to work with the Elasticsearch and Kibana config files. For these steps I need a multi-node cluster in order to have an Elasticsearch master and a slave. I use two local Vagrant nodes to complete this task. First I spin up a single node, add the ELK 7.x repos (sketched after the command list below), manually install each component, and grab a copy of the 7.x .yml files (elasticsearch.yml, logstash.yml, kibana.yml, filebeat.yml, metricbeat.yml). I save these for later.
Vagrant Commands For Manual Elk Install:
[root@elasticsearch]# history
1 cd /etc/yum.repos.d
2 yum install nano -y
3 nano elastic.repo
4 yum install elasticsearch logstash kibana filebeat metricbeat -y
5 cat /etc/elasticsearch/elasticsearch.yml
6 cat /etc/logstash/logstash.yml
7 cat /etc/kibana/kibana.yml
8 cat /etc/filebeat/filebeat.yml
9 cat /etc/metricbeat/metricbeat.yml
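For reference, the elastic.repo created in step 3 above just points at the standard Elastic 7.x yum repository. A minimal sketch (the repo id is my own naming):
cat > /etc/yum.repos.d/elastic.repo <<'EOF'
[elastic-7.x]
name=Elastic repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF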
Next I spin up two nodes: an Ambari master with an agent, and an agent-only node.
Master Server Commands:
[root@c7302 ~]# history
1 wget -O /etc/yum.repos.d/ambari.repo http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari.repo && yum --enablerepo=extras install epel-release -y && yum install nano java java-devel ambari-server ambari-agent -y && ambari-server setup -s && ambari-server install-mpack --mpack=http://public-repo-1.hortonworks.com/HDF/amazonlinux2/3.x/updates/3.4.1.1/tars/hdf_ambari_mp/hdf-ambari-mpack-3.4.1.1-4.tar.gz && ambari-server start && ambari-agent start
2 ambari-server install-mpack --mpack=https://github.com/steven-dfheinz/dfhz_elk_mpack/raw/master/elasticsearch_mpack-3.4.0.1-0.tar.gz --verbose
3 python /var/lib/ambari-server/resources/scripts/configs.py -u admin -p admin -n ELK -l c7302.ambari.apache.org -t 8080 -a set -c cluster-env -k ignore_groupsusers_create -v true
Agent Server Commands:
[root@c7303 vagrant]# history
1 wget -O /etc/yum.repos.d/ambari.repo http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari.repo && yum --enablerepo=extras install epel-release -y && yum install nano java java-devel ambari-agent -y && ambari-agent start
Fourth, I complete the Cluster Install Wizard for my base cluster. I then begin comparing the original fileset config files with the new .yml files I saved. Inside each component folder I rename the old .xml to .xml.6.3.2, then make a new .xml with the new 7.x file contents. Next, I convert the required values, parameters, and settings using the variables set up in params.py. Some variables changed names completely; most remained as-is. Overall there are not many differences, so the translation from 6.x to 7.x was pretty easy. If you need more detail here, unpack the different versions in the GitHub repo and compare the filesets. Feel free to reach out directly via a message or leave a comment if you have questions.
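As a quick sanity check while doing this translation, two commands cover most of it: diff the saved 7.x defaults against the 6.x ones to see which settings were renamed or removed, and grep the mpack to see where a given params.py variable is wired up. The saved-yml/ paths and the cluster_name variable below are illustrative only:
# compare the 6.x defaults with the 7.x copies saved from the manual install
diff -u saved-yml/6.3.2/elasticsearch.yml saved-yml/7.4.2/elasticsearch.yml
# locate where a given variable is defined and used inside the mpack fileset
grep -rn "cluster_name" elasticsearch_mpack-3.4.0.1-0/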
To complete the config translation from 6.x to 7.x I worked with each component one at a time in a methodical process:
Complete Required .yml Changes in Fileset
Commit to GitHub
Uninstall/Re-Install the management pack using full GitHub Url
Restart Ambari
Add Service & Test
Adjust .yml on Node (until service runs as expected)
Stop & Remove Service From Ambari
Repeat
Install:
ambari-server install-mpack --mpack=https://github.com/steven-dfheinz/dfhz_elk_mpack/raw/master/elasticsearch_mpack-3.4.0.1-0.tar.gz --verbose
ambari-server restart
Uninstall:
ambari-server uninstall-mpack --mpack-name=elasticsearch-ambari.mpack
ambari-server restart
Last, I reset my entire cluster and complete a full base cluster install. Next, I install the management pack, install the full ELK stack, and confirm there are no issues. To be thorough, I also completed this entire test on an HDP cluster. During the work I identified some questionable configs and could easily spend many more days on this M-Pack; however, the steps above are the only things required to make it work and to allow deeper configuration changes within Ambari.
When complete, you should have a fully working cluster with ELK.
Notes and Lessons Learned:
Work in small test-able loops
Always restart Ambari-Server after making management pack changes
Do not allow your editor or operating system to add files inside the management pack fileset. Example: .DS_Store
Do not make accidental character changes in the fileset which could invalidate the XML or Python
Be careful of characters in .yml file content which invalidate XML. Example: <user>:<pass> (see the snippet after this list)
If things are not behaving as expected, blow away your current test loop and start a fresh one from a previous step.
Always keep old versions of important files.
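To expand on the note about XML-unsafe characters: the .yml templates end up inside <value> elements of the mpack's configuration .xml files, so literal angle brackets have to be escaped (or the value wrapped in a CDATA section) or Ambari will reject the file. The property below is purely illustrative, not one of the actual mpack properties:
<property>
  <name>content</name>
  <value>
    # user/pass placeholders written with escaped angle brackets
    hosts: ["http://&lt;user&gt;:&lt;pass&gt;@localhost:9200"]
  </value>
</property>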
11-21-2019
04:52 PM
Thank you!!
11-20-2019
02:08 AM
I moved the content to another directory, restarted the NameNode, and the error message is gone, so I could delete the old directory.
11-14-2019
06:52 AM
With a file that big, you are going to need to adjust settings (memory) and evaluate the performance of the file repositories. I would recommend starting with small files and working your way up. Complete a benchmark at each stage, then begin testing config changes. This should be a lengthy evaluation where, in the end, you arrive at a configuration capable of consuming 3 GB files.
11-14-2019
06:03 AM
Not to take away from the conversation above, which was in fact a very detailed and specific comparison. The major takeaway in your pro/con evaluation needs to be physical disk compared to network or some level of shared disk. Also, in a big HA system there is usually more than one disk (not to be confused with more than one partition). When you go past dev- and POC-level benchmarking, deep into performance tuning, physical disks in high-availability arrays on a physical machine will outperform the cloud or VMs for large-volume and large-data processes. To get more specific you have to compare all of the nuts and bolts as well as evaluate the performance best practices for each platform, service, and component, all the way down to application design. This is a great debate and one that I have at every customer. That said, I have led production cluster installs in the cloud: Amazon, Azure, IBM Cloud, Google Cloud, and private cloud and VM systems.