About Shelton

Shelton · ‎10-03-2018

@Lenu K Your question is rather broad. For a small cluster it all depends on the manpower at hand. For HDF, remember to back up the flow files. Below is what immediately comes to mind.

Fresh install pros and cons
- Better planned: you get a clean installation, possibly better configured thanks to the mistakes learned from the current cluster setup.
- Straightforward, with no upgrade surprises.
- You lose existing customization.

Upgrade pros and cons
- You must plan properly and document the steps.
- Expect technical surprises and challenges.
- Plan for support on D-day if you do not have it already.
- Challenges mold you into a better hadoopist!
- See Mandatory Post-Upgrade Tasks.

Best practice
- Verify that the file system you selected is supported by HWX.
- Pre-create all the databases.
- Back up your cluster before either of the above.
- Plan for at least NN/RM HA (the NNs are the brain, so allocate good memory).
- You MUST have 3 ZooKeeper nodes.
- HDD planning is important; SSD for SCSI.
- Restrict access to the cluster to the edge node ONLY.
- Kerberize the cluster.
- Configure SSL.
- Think of SSD for ZK, HBase and the OS; you can also use SSD acceleration for temp tables in Hive by exposing the SSD via HDFS.
- Plan the data center network well (backup lines).
- Size your nodes' memory and storage properly.
- Beware if performance is a must, especially since Kafka and Storm are memory intensive.
- Delegate authorization to Ranger.

- Test upgrade procedures for new versions of existing components.
- Execute performance tests of custom-built applications.
- Allow end users to perform user acceptance testing.
- Execute integration tests where custom-built applications communicate with third-party software.
- Experiment with new software that is beta quality and may not be ready for use at all.
- Execute security penetration tests (typically done by an external company).
- Let application developers modify configuration parameters and restart services on short notice.
- Maintain a mirror image of the production environment to be activated in case of natural disaster or unforeseen events.
- Execute regression tests that compare the outputs of new application code with existing code running in production.

HTH

Shelton · ‎10-03-2018

@Anurag Mishra If the response answered your question, can you take the time to log in, "Accept" the answer and close the thread, so other members can use it as a solution?

akhilsnaik · ‎09-13-2018

Hi @Ray Donovan, Glad that it's resolved for you. I assume you used ambari-2.7 and installed HDF 3.1.1, which is not in the supported list for ambari-2.7, and that caused the issue. You can close the thread by marking the best answer.

kpalanisamy · ‎09-06-2018

You can tail the NameNode and DataNode logs. You can also redirect the output to a dummy log file during the restart.

# tailf <namenode log> > /tmp/namenode-`hostname`.log
# tailf <datanode log> > /tmp/datanode-`hostname`.log

rsg · ‎10-30-2018

Properly setting up nifi.security.identity.mapping.pattern.kerb and nifi.security.identity.mapping.pattern.dn fixed the problem. Also, while debugging this kind of problem, it's best to delete the Ranger plugin cache (under /etc/ranger/SERVICE_NAME/policycache/) to ensure that there are no communication problems between NiFi and Ranger.
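To make the role of the mapping patterns concrete, here is a minimal Python sketch of how a regex pattern plus a replacement value can turn a certificate DN or a Kerberos principal into a short user name. It only imitates the idea and is not NiFi's implementation. The function name map_identity, the sample identities, the patterns and the "$1" values are all hypothetical; in NiFi the replacement is configured in the companion nifi.security.identity.mapping.value.* properties.

```python
import re


def map_identity(identity, pattern, value):
    """Imitate an identity mapping: if `pattern` matches the whole identity,
    return `value` with $1, $2, ... replaced by the captured groups.
    If the pattern does not match, the identity is returned unchanged
    (assumed here to mirror how an unmatched identity is treated)."""
    match = re.fullmatch(pattern, identity)
    if match is None:
        return identity
    # Convert Java-style group references ($1) to Python-style (\1).
    template = re.sub(r"\$(\d+)", r"\\\1", value)
    return match.expand(template)


# Hypothetical DN and Kerberos examples, not taken from the original post.
dn_user = map_identity("CN=nifi-node1, OU=NIFI", r"^CN=(.*?), OU=(.*?)$", "$1")
kerb_user = map_identity("nifi@EXAMPLE.COM", r"^(.*?)@(.*?)$", "$1")
print(dn_user)    # nifi-node1
print(kerb_user)  # nifi
```

If the mapped name differs from the user name that the Ranger policies refer to, authorization can fail even though authentication succeeded, which is why the patterns matter here.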

indrek_maestu · ‎08-27-2018

Even after running "yum clean", there was no slider_3_0_1_1_5 package. I still don't understand why this package is required or where it should come from. I downloaded and searched the rpm packages (in HDF 3.0.1 as well as earlier and later versions), and this slider_* package doesn't exist in any of them.

ssanupindi · ‎08-19-2018

One is a large environment, 20+ PB in size, and its data is completely different from the other environment's data. The reasons for having different lakes are that they fall under different internal departments, the data is different, and the customers are different. Again depending on the data, the servers of these clusters are located in different data centers; one is open to the company-wide enterprise network and the others are open to an internal network within the enterprise network.

Shelton · ‎08-16-2018

@Sudharsan Ganeshkumar You are not seeing anything because you are running the command as the root user! You will have to switch to the hive user and use hive or beeline:

# su - hive
$ hive

Then at the prompt run the create statement:

hive> CREATE TABLE IF NOT EXISTS emp (eid int, name String, salary String, destination String) COMMENT 'Employee details' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' STORED AS TEXTFILE;

And then run:

hive> show tables;

HTH

ssubhas · ‎08-13-2018

@rinu shrivastav The split size is calculated by the formula:

max(mapred.min.split.size, min(mapred.max.split.size, dfs.block.size))

Say the HDFS block size is 64 MB, mapred.max.split.size is 256 MB and mapred.min.split.size is set to 128 MB. Then the split size would be 128 MB:

split size = max(128, min(256, 64)) = 128

To read 256 MB of data, there will be two mappers. To increase the number of mappers, you could decrease mapred.min.split.size down to the HDFS block size.
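The arithmetic above can be checked with a short Python sketch. The function names split_size and mapper_count are made up for illustration, and the mapper count is a simplified estimate for a single input file (total size divided by split size, rounded up); the real job may differ slightly, for example because splits are computed per file.

```python
def split_size(min_split, max_split, block_size):
    """Split size per the formula: max(min_split, min(max_split, block_size))."""
    return max(min_split, min(max_split, block_size))


def mapper_count(input_size, min_split, max_split, block_size):
    """Rough number of mappers for one input file: ceil(input_size / split size)."""
    size = split_size(min_split, max_split, block_size)
    return -(-input_size // size)  # ceiling division on integers


# Values from the post, all in MB: min split 128, max split 256, block size 64.
print(split_size(128, 256, 64))         # 128
print(mapper_count(256, 128, 256, 64))  # 2

# Lowering the minimum split size to the block size yields more mappers.
print(mapper_count(256, 64, 256, 64))   # 4
```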

harry_li · ‎07-30-2018

There are 2 problems in the Ambari installation document for Ubuntu 16 at https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.2.0/bk_ambari-installation/content/download_the_ambari_repo_ubuntu16.html

wget -O /etc/apt/sources.list.d/ambari.list http://public-repo-1.hortonworks.com/ambari/ubuntu16/2.x/updates//ambari.list
apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
apt-get update

As both Geoffrey and Jay pointed out, the first sample command is missing the version number, and the document does not instruct the reader to insert it before running the command.

The second command, which retrieves the key, times out because of the syntax used:

root@msl-dpe-perf77:/usr/local/Ambari# apt-key adv --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
Executing: /tmp/tmp.1enMMyqYWS/gpg.1.sh --recv-keys --keyserver keyserver.ubuntu.com B9733A7A07513CAD
gpg: requesting key 07513CAD from hkp server keyserver.ubuntu.com
gpg: keyserver timed out
gpg: keyserver receive failed: keyserver error
root@msl-dpe-perf77:/usr/local/Ambari#

The following syntax resolves this error:

root@msl-dpe-perf77:/usr/local/Ambari# apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv B9733A7A07513CAD
Executing: /tmp/tmp.PQTyyilAKr/gpg.1.sh --keyserver hkp://keyserver.ubuntu.com:80 --recv B9733A7A07513CAD
gpg: requesting key 07513CAD from hkp server keyserver.ubuntu.com
gpg: key 07513CAD: "Jenkins (HDP Builds) <jenkin@hortonworks.com>" not changed
gpg: Total number processed: 1
gpg: unchanged: 1
root@msl-dpe-perf77:/usr/local/Ambari#

Last Visited	‎12-11-2025 11:50 PM

Member Since	‎01-19-2017 04:35 AM
Posts	3,679
Kudos received	627

Cloudera Community

Re: Apache nifi memory consumption in kubernetes

Re: Nifi toolkit command for GitLabFlowRegistry

Re: Not able to delete the NiFi existing flow usin...

Re: Securing Nifi with SSL and using OIDC provider...

Re: External zookeeper and nifi cluster connection...

Re: Upgrade or Fresh install

Re: how does ambari set password while creating ke...

Re: Need help installing HDF onto HDP 3.0. HDF Bas...

Re: HDFS is almost full 90% but data node disks ar...

Re: NiFi Authorization with Ranger in Kerberized e...

Re: how to get areound storm_3_0_1_1_5-slider-clie...

Re: Hortonworks Hadoop platform vs datalake (looki...

Re: I have created a table in hive. What is the co...

Re: Can we change no of Mappers for a MapReduce jo...

Re: Is Ambari public server down?