Member since: 07-19-2018
Posts: 613
Kudos Received: 101
Solutions: 117
12-09-2019
04:34 AM
@shubham268 Please accept the solution to mark this question as answered. If you have any other questions, ask a new one and we will respond accordingly. Thanks for the kudos, too!
12-05-2019
04:49 PM
@emanueol as of this writing, I believe that XML attachments aren't allowed, but you can cut and paste the XML source code into the body of a post using the Insert/edit code sample feature of the editor.
12-04-2019
12:04 PM
Are HDP/HDF going to get forward movement on RHEL 7.7 support?
11-27-2019
12:00 AM
I solved the issue by increasing the request expiration, which was set to 1 minute.
11-25-2019
11:45 PM
Since getting started with Hadoop & ELK, I have taken the original Hortonworks 5.x Elasticsearch and Kibana Management Pack for HDP 2.6.5 and upgraded it to ELK 6.3.2. During this project I also added Logstash, Filebeat, and Metricbeat. Next, I upgraded the Management Pack to work with HDP 3.x and HDF 3.x.
In this article I am going to share all of the steps required to upgrade the ELK 6.3.2 Management Pack to the latest and greatest version, ELK 7.4.2. You can find all of the files in my GitHub repo: DFHZ ELK Mpack.
First, I create a known-working test cluster, making sure the 3.4.0 management pack is operational and my test environment is suitable.
Commands for a single node HDF test cluster:
# history
1 hostnamectl set-hostname elk.cloudera.com
2 yum install nano wget -y
3 nano /etc/sysconfig/selinux
4 nano /etc/cloud/cloud.cfg
5 nano /etc/hosts
6 reboot
9 wget -O /etc/yum.repos.d/ambari.repo http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari.repo && yum --enablerepo=extras install epel-release -y && yum install nano java java-devel ambari-server ambari-agent -y && ambari-server setup -s && ambari-server install-mpack --mpack=http://public-repo-1.hortonworks.com/HDF/amazonlinux2/3.x/updates/3.4.1.1/tars/hdf_ambari_mp/hdf-ambari-mpack-3.4.1.1-4.tar.gz && ambari-server start && ambari-agent start
10 ssh-keygen
11 cat ~/.ssh/id_rsa
12 cat ~/.ssh/id_rsa.pub
13 nano ~/.ssh/authorized_keys
14 ssh root@elk.cloudera.com
15 ambari-server install-mpack --mpack=https://github.com/steven-dfheinz/dfhz_elk_mpack/raw/master/elasticsearch_mpack-3.4.0.0-0.tar.gz --verbose
29 ambari-server restart
35 python /var/lib/ambari-server/resources/scripts/configs.py -u admin -p admin -n ELK -l elk.cloudera.com -t 8080 -a set -c cluster-env -k ignore_groupsusers_create -v true
** Note: before running #35, complete the Cluster Install Wizard with the base components (ZooKeeper & Ambari Metrics). Once the base cluster is done, run the Python command (#35) and then install ELK via the Ambari Add Service wizard. Without the Python command, the ELK stack install will fail on user/group creation issues.
Second, I download and unpack the management pack above. I make a new version and begin to edit the file structure for "7.4.2" (see the sketch after this list). My edits were as follows:
Rename and version the archive folder elasticsearch_mpack-3.4.0.1-0
Change all component folders from 6.3.2 to 7.4.2
Update mpack.json from 6.3.2 to 7.4.2
Update Component File Set (Find/Replace) all 6.3.2 to 7.4.2
Update addon-services/ELASTICSEARCH/7.4.2/repos/repoinfo.xml to 7.x
Create a new archive elasticsearch_mpack-3.4.0.1-0.tar.gz
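A rough sketch of those edits as shell commands, assuming the 3.4.0.0-0 archive has already been downloaded into the working directory (the exact file layout may differ slightly from this sketch):
# unpack and rename the archive folder to the new mpack version
tar -xzf elasticsearch_mpack-3.4.0.0-0.tar.gz
mv elasticsearch_mpack-3.4.0.0-0 elasticsearch_mpack-3.4.0.1-0
cd elasticsearch_mpack-3.4.0.1-0
# rename every versioned component folder from 6.3.2 to 7.4.2
find . -depth -type d -name "6.3.2" -execdir mv {} 7.4.2 \;
# find/replace the version string in mpack.json and the rest of the fileset
grep -rl "6.3.2" . | xargs sed -i "s/6\.3\.2/7.4.2/g"
# edit addon-services/ELASTICSEARCH/7.4.2/repos/repoinfo.xml by hand to point at the 7.x repos
# re-pack the new archive
cd .. && tar -czf elasticsearch_mpack-3.4.0.1-0.tar.gz elasticsearch_mpack-3.4.0.1-0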
At this stage I just want to make sure that the stack versions are coming over correctly. I update the GitHub repo, uninstall the management pack, restart Ambari, install the new management pack, restart Ambari again, and open Add Service in Ambari. I can now see my ELK components reporting version 7.4.2. I complete the install to make sure everything installs, but I do not expect the services to run yet; there is still quite a bit to do to each component's .yml config files.
Third, I start to work with the Elasticsearch and Kibana config files. For these steps I need a multi-node cluster in order to have an Elasticsearch master and a slave. I use two local Vagrant nodes to complete this task. First I spin up a single node, add the ELK 7.x repos (sketched after the command list below), manually install each component, and grab a copy of the 7.x .yml files (elasticsearch.yml, logstash.yml, kibana.yml, filebeat.yml, metricbeat.yml). I save these for later.
Vagrant Commands For Manual Elk Install:
[root@elasticsearch]# history
1 cd /etc/yum.repos.d
2 yum install nano -y
3 nano elastic.repo
4 yum install elasticsearch logstash kibana filebeat metricbeat -y
5 cat /etc/elasticsearch/elasticsearch.yml
6 cat /etc/logstash/logstash.yml
7 cat /etc/kibana/kibana.yml
8 cat /etc/filebeat/filebeat.yml
9 cat /etc/metricbeat/metricbeat.yml
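For reference, the elastic.repo created in step 3 above just points at the standard Elastic 7.x yum repository. A minimal sketch (the repo id is my own naming):
cat > /etc/yum.repos.d/elastic.repo <<'EOF'
[elastic-7.x]
name=Elastic repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF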
Next I spin up two nodes: an Ambari master with an agent, and an agent-only node.
Master Server Commands:
[root@c7302 ~]# history
1 wget -O /etc/yum.repos.d/ambari.repo http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari.repo && yum --enablerepo=extras install epel-release -y && yum install nano java java-devel ambari-server ambari-agent -y && ambari-server setup -s && ambari-server install-mpack --mpack=http://public-repo-1.hortonworks.com/HDF/amazonlinux2/3.x/updates/3.4.1.1/tars/hdf_ambari_mp/hdf-ambari-mpack-3.4.1.1-4.tar.gz && ambari-server start && ambari-agent start
2 ambari-server install-mpack --mpack=https://github.com/steven-dfheinz/dfhz_elk_mpack/raw/master/elasticsearch_mpack-3.4.0.1-0.tar.gz --verbose
3 python /var/lib/ambari-server/resources/scripts/configs.py -u admin -p admin -n ELK -l c7302.ambari.apache.org -t 8080 -a set -c cluster-env -k ignore_groupsusers_create -v true
Agent Server Commands:
[root@c7303 vagrant]# history
1 wget -O /etc/yum.repos.d/ambari.repo http://public-repo-1.hortonworks.com/ambari/centos7/2.x/updates/2.7.3.0/ambari.repo && yum --enablerepo=extras install epel-release -y && yum install nano java java-devel ambari-agent -y && ambari-agent start
Fourth, I complete the Cluster Install Wizard for my base cluster. I then begin comparing the original fileset config files with the new .yml files I saved. Inside each component folder I rename the old .xml to .xml.6.3.2, then make a new .xml with the new 7.x file contents. Next, I convert the required values, parameters, and settings using the variables set up in params.py. Some variables changed names completely; most remained as-is. Overall there are not many differences, so the translation from 6.x to 7.x was pretty easy. If you need more detail here, unpack the different versions in the GitHub repo and compare the filesets. Feel free to reach out directly via a message or leave a comment if you have questions.
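As a quick sanity check while doing this translation, two commands cover most of it: diff the saved 7.x defaults against the 6.x ones to see which settings were renamed or removed, and grep the mpack to see where a given params.py variable is wired up. The saved-yml/ paths and the cluster_name variable below are illustrative only:
# compare the 6.x defaults with the 7.x copies saved from the manual install
diff -u saved-yml/6.3.2/elasticsearch.yml saved-yml/7.4.2/elasticsearch.yml
# locate where a given variable is defined and used inside the mpack fileset
grep -rn "cluster_name" elasticsearch_mpack-3.4.0.1-0/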
To complete the config translation from 6.x to 7.x I worked with each component one at a time in a methodical process:
Complete Required .yml Changes in Fileset
Commit to GitHub
Uninstall/Re-Install the management pack using full GitHub Url
Restart Ambari
Add Service & Test
Adjust .yml on Node (until service runs as expected)
Stop & Remove Service From Ambari
Repeat
Install:
ambari-server install-mpack --mpack=https://github.com/steven-dfheinz/dfhz_elk_mpack/raw/master/elasticsearch_mpack-3.4.0.1-0.tar.gz --verbose
ambari-server restart
Uninstall:
ambari-server uninstall-mpack --mpack-name=elasticsearch-ambari.mpack
ambari-server restart
Last, I reset my entire cluster and complete a full base cluster install. Next, I install the management pack, install the full ELK stack, and confirm there are no issues. To be thorough, I also completed this entire test on an HDP cluster. During the work I identified some questionable configs and could easily spend many more days on this M-Pack; however, the steps above are the only things required to make it work and to allow deeper configuration changes within Ambari.
When complete, you should have a fully working cluster with ELK.
Notes and Lessons Learned:
Work in small test-able loops
Always restart Ambari-Server after making management pack changes
Do not allow your editor or operating system to add files inside the management pack fileset. Example: .DS_Store
Do not make accidental character changes in the fileset which could invalidate the XML or Python
Be careful of characters in .yml file content which invalidate XML. Example: <user>:<pass> (see the snippet after this list)
If things are not behaving as expected, blow away your current test loop and start a fresh one from a previous step.
Always keep old versions of important files.
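To expand on the note about XML-unsafe characters: the .yml templates end up inside <value> elements of the mpack's configuration .xml files, so literal angle brackets have to be escaped (or the value wrapped in a CDATA section) or Ambari will reject the file. The property below is purely illustrative, not one of the actual mpack properties:
<property>
  <name>content</name>
  <value>
    # user/pass placeholders written with escaped angle brackets
    hosts: ["http://&lt;user&gt;:&lt;pass&gt;@localhost:9200"]
  </value>
</property>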
11-21-2019
04:52 PM
Thank you!!
11-20-2019
02:08 AM
I moved the content to another directory, restarted the NameNode, and the error message is gone, so I could delete the old directory.
11-14-2019
06:52 AM
With a file that big, you are going to need to adjust settings (memory) and evaluate the performance of the file repositories. I would recommend starting with small files and working your way up. Complete a benchmark at each stage, then begin testing config changes. This should be a lengthy evaluation where, in the end, you arrive at a configuration capable of consuming 3 GB files.
11-14-2019
06:03 AM
Not to take away from the conversation above, which was in fact a very detailed and specific comparison. The major takeaway in your pro/con evaluation needs to be physical disk compared to network or some level of shared disk. Also, in a big HA system there is usually more than one disk (not to be confused with more than one partition). When you go past dev- and POC-level benchmarking, deep into performance tuning, physical disks in high-availability arrays on a physical machine will outperform the cloud or VMs for large-volume and large-data processes. To get more specific you have to compare all of the nuts and bolts as well as evaluate the performance best practices for each platform, service, and component, all the way down to application design. This is a great debate and one that I have at every customer. That said, I have led production cluster installs in the cloud: Amazon, Azure, IBM Cloud, Google Cloud, and private cloud and VM systems.