Member since
01-19-2017
3679
Posts
632
Kudos Received
372
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 929 | 06-04-2025 11:36 PM |
| | 1538 | 03-23-2025 05:23 AM |
| | 762 | 03-17-2025 10:18 AM |
| | 2754 | 03-05-2025 01:34 PM |
| | 1818 | 03-03-2025 01:09 PM |
10-03-2018
09:01 PM
2 Kudos
@Lenu K Your question is rather broad. For a small cluster it all depends on the manpower at hand. For HDF, remember to back up the flow files. Below is what immediately comes to mind.

Fresh install pros and cons:
- Better planned: you get a clean installation, possibly properly configured, with the mistakes learned from the current cluster setup.
- Straightforward, no upgrade surprises.
- Lose customization.

Upgrade pros and cons:
- Must plan properly and document the steps.
- Expect technical surprises and challenges.
- Plan for support on the D-day if you don't already have it.
- Challenges mold you into a better hadoopist!
- See Mandatory Post-Upgrade Tasks.

Best practice:
- Verify that the file system you selected is supported by HWX.
- Pre-create all the databases.
- Back up your cluster before either of the above.
- Plan for at least NN/RM HA (the NNs are the brain, so allocate good memory).
- MUST have 3 ZooKeeper nodes.
- HDD planning is important; SSD for SCSI.
- Restrict access to the cluster to the edge node ONLY.
- Kerberize the cluster.
- Configure SSL.
- Think of SSD for ZooKeeper, HBase and the OS; you can also use SSD acceleration for temp tables in Hive by exposing the SSD via HDFS.
- Plan the data center network well (backup lines).
- Size your nodes' memory and storage properly. Beware if performance is a must, especially since Kafka and Storm are memory intensive.
- Delegate authorization to Ranger.

- Test upgrade procedures for new versions of existing components.
- Execute performance tests of custom-built applications.
- Allow end users to perform user acceptance testing.
- Execute integration tests where custom-built applications communicate with third-party software.
- Experiment with new software that is beta quality and may not be ready for usage at all.
- Execute security penetration tests (typically done by an external company).
- Let application developers modify configuration parameters and restart services on short notice.
- Maintain a mirror image of the production environment to be activated in case of natural disaster or unforeseen events.
- Execute regression tests that compare the outputs of new application code with existing code running in production.

HTH
10-03-2018
07:05 AM
@Anurag Mishra If the response answered your question, can you take the time to log in, "Accept" the answer, and close the thread so other members can use it as a solution?
09-13-2018
11:33 AM
1 Kudo
Hi @Ray Donovan, Glad that it's resolved for you. I assume you used ambari-2.7 and installed HDF 3.1.1, which is not on the list of versions supported by ambari-2.7, and that caused the issue. You can close the thread by marking the best answer.
09-06-2018
07:28 AM
You can tail the NameNode and DataNode logs; you can also redirect the output to a dummy log file during the restart.
# tailf <namenode log> >/tmp/namenode-`hostname`.log
# tailf <datanode log> >/tmp/datanode-`hostname`.log
10-30-2018
11:08 AM
Properly setting up nifi.security.identity.mapping.pattern.kerb and nifi.security.identity.mapping.pattern.dn fixed the problem. Also, while debugging this kind of problem, it's best to delete the Ranger plugin cache (under /etc/ranger/SERVICE_NAME/policycache/) to ensure that there are no communication problems between NiFi and Ranger.
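To see why these mapping properties matter, here is a rough Python emulation of how an identity-mapping pattern and its replacement value turn a Kerberos principal or a certificate DN into a short user name. This is only a sketch: the function `map_identity`, the sample patterns, the `$1` replacement values, and the sample identities are all hypothetical and are not taken from the poster's configuration. NiFi itself uses Java regular expressions, so edge cases may differ.

```python
import re


def map_identity(identity, pattern, value):
    """Emulate an identity mapping: if `pattern` matches the whole identity,
    build the result from `value`, where $1, $2, ... refer to capture groups.
    A non-matching identity is returned unchanged (assumed behaviour)."""
    match = re.fullmatch(pattern, identity)
    if match is None:
        return identity
    # Replace each $N placeholder with the text of capture group N.
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), value)


# Hypothetical patterns and values, for illustration only.
KERB_PATTERN = r"^(.*?)@(.*?)$"
DN_PATTERN = r"^CN=(.*?), OU=(.*?)$"

print(map_identity("nifi-user@EXAMPLE.COM", KERB_PATTERN, "$1"))
print(map_identity("CN=nifi-node1, OU=NIFI", DN_PATTERN, "$1"))
print(map_identity("plainuser", KERB_PATTERN, "$1"))
```

The idea is that the mapped name is presumably what has to line up with the user names used in the Ranger policies; if the pattern does not match, the raw principal or DN is what gets compared instead.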
08-27-2018
03:35 AM
Even after I ran "yum clean", there was no slider_3_0_1_1_5 package. I still don't understand why this package is needed or where it should come from. I downloaded and searched the rpm packages (HDF 3.0.1 as well as earlier and later versions), and this slider_* package doesn't exist.
08-19-2018
03:43 AM
One is a large environment, 20+ PB in size, and its data is completely different from the other environment's data. The reasons for the separate lakes are that they belong to different internal departments, the data is different, and the customers are different. Also, depending on the data, the clusters' servers are located in different data centers: one is open to the company-wide enterprise network and the others are open only to an internal network within the enterprise network.
08-16-2018
11:02 AM
@Sudharsan Ganeshkumar You are not seeing anything because you are running the command as the root user! You will have to switch to the hive user and use hive or beeline:
# su - hive
$ hive
Then at the prompt run the create statement:
hive> CREATE TABLE IF NOT EXISTS emp ( eid int, name String,
salary String, destination String)
COMMENT 'Employee details'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
And then run:
hive> SHOW TABLES LIKE 'emp';
HTH
08-13-2018
12:59 PM
@rinu shrivastav The split size is calculated by the formula:
max(mapred.min.split.size, min(mapred.max.split.size, dfs.block.size))
Say the HDFS block size is 64 MB, mapred.min.split.size is set to 128 MB and mapred.max.split.size is 256 MB. Then the split size is max(128, min(256, 64)) = 128 MB, so reading 256 MB of data takes two mappers. To increase the number of mappers, decrease mapred.min.split.size down toward the HDFS block size.
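The arithmetic above can be checked with a few lines of Python. This is a simplified sketch, not Hadoop's actual code: the helper names `split_size` and `mapper_count` are made up for illustration, and the mapper count simply divides the input size by the split size, ignoring details such as per-file boundaries and any slack Hadoop allows on the last split.

```python
import math

MB = 1024 * 1024


def split_size(min_split, max_split, block_size):
    """Split size per the formula: max(min_split, min(max_split, block_size))."""
    return max(min_split, min(max_split, block_size))


def mapper_count(input_size, min_split, max_split, block_size):
    """Approximate number of mappers for a single input of `input_size` bytes."""
    return math.ceil(input_size / split_size(min_split, max_split, block_size))


# Values from the post: 64 MB blocks, 128 MB minimum split, 256 MB maximum split.
size = split_size(128 * MB, 256 * MB, 64 * MB)
print(size // MB)                                        # 128
print(mapper_count(256 * MB, 128 * MB, 256 * MB, 64 * MB))  # 2

# Lowering the minimum split size to the block size doubles the mapper count.
print(mapper_count(256 * MB, 64 * MB, 256 * MB, 64 * MB))   # 4
```

With the minimum split size lowered to 64 MB, the split size falls back to the block size, which is why the same 256 MB input is then read by four mappers instead of two.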
07-30-2018
04:19 PM
There are 2 problems in the Ambari installation document for Ubuntu 16 at https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.2.0/bk_ambari-installation/content/download_the_ambari_repo_ubuntu16.html
wget -O /etc/apt/sources.list.d/ambari.list http://public-repo-1.hortonworks.com/ambari/ubuntu16/2.x/updates//ambari.list
apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
apt-get update
As both Geoffrey and Jay pointed out, the first sample command is missing the version number, and the document does not instruct the reader to insert it before execution. The 2nd command, which retrieves the key, times out because of the syntax used:
root@msl-dpe-perf77:/usr/local/Ambari# apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
Executing: /tmp/tmp.1enMMyqYWS/gpg.1.sh --recv-keys
--keyserver
keyserver.ubuntu.com
B9733A7A07513CAD
gpg: requesting key 07513CAD from hkp server keyserver.ubuntu.com
gpg: keyserver timed out
gpg: keyserver receive failed: keyserver error
root@msl-dpe-perf77:/usr/local/Ambari#
The following syntax resolves this error:
root@msl-dpe-perf77:/usr/local/Ambari# apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv B9733A7A07513CAD
Executing: /tmp/tmp.PQTyyilAKr/gpg.1.sh --keyserver
hkp://keyserver.ubuntu.com:80
--recv
B9733A7A07513CAD
gpg: requesting key 07513CAD from hkp server keyserver.ubuntu.com
gpg: key 07513CAD: "Jenkins (HDP Builds) <jenkin@hortonworks.com>" not changed
gpg: Total number processed: 1
gpg: unchanged: 1
root@msl-dpe-perf77:/usr/local/Ambari#