Member since: 01-19-2017
Posts: 3679
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 908 | 06-04-2025 11:36 PM |
|  | 1509 | 03-23-2025 05:23 AM |
|  | 744 | 03-17-2025 10:18 AM |
|  | 2683 | 03-05-2025 01:34 PM |
|  | 1786 | 03-03-2025 01:09 PM |
06-04-2018
11:54 AM
1 Kudo
@Michael Bronson That's an annoying message I once got too, but you can repost the same content with a changed header and delete the old posting... it worked for me. Try the hack 🙂
06-04-2018
10:53 AM
1 Kudo
@Michael Bronson Both XFS and ext4 are recommended for running Kafka. XFS typically performs well with little tuning compared to ext4, and it has become the default filesystem for many Linux distributions. XFS is a very high-performance, scalable filesystem that is routinely deployed in the most demanding applications; it is the default filesystem in RHEL 7 and is supported on all architectures. XFS has its advantages, but in a JBOD setup it doesn't really provide a lot of extra benefit. Ext4 does not scale to the same sizes as XFS, but it is also fully supported on all architectures and continues to see active development and support. See the HCC Kafka KB Article. Hope that helps!!!
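If you want to confirm what a broker host is actually using, here is a minimal sketch (assuming the Kafka log directory is /kafka-logs; adjust the path to whatever log.dirs points to in your server.properties):

```bash
# Show the filesystem type backing the Kafka log directory
df -T /kafka-logs

# Show the mount options in effect (noatime is a common recommendation for Kafka)
mount | grep kafka-logs
```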
06-04-2018
09:18 AM
Great to know your LLAP started !!!
06-04-2018
08:36 AM
@Erkan ŞİRİN A deeper dive into the setup could be worthwhile. Please have a look at these two resources; they could be of help: https://community.hortonworks.com/articles/149486/llap-sizing-and-setup.html https://community.hortonworks.com/articles/149899/investigating-when-llap-doesnt-start.html
06-04-2018
07:46 AM
@Erkan ŞİRİN Your real problem is "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed". On an unsecured Hadoop cluster with Python 2.7.9 installed, the Slider agent fails with SSL validation errors. Check whether your Python version is 2.7.9 and your Slider version is less than 0.92: https://issues.apache.org/jira/browse/SLIDER-942 If the above matches your setup, then download the patch and try again.
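To confirm the versions on the affected host, a quick sketch (the Slider client path below is the usual HDP location and is an assumption; it may differ in your install):

```bash
# Check the Python version the agent is running with
python -V

# Check the installed Slider version (path assumes a standard HDP layout)
/usr/hdp/current/slider-client/bin/slider version
```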
06-03-2018
11:06 PM
@Samant Thakur Have you configured your cluster for rack awareness?
Rack awareness prevents data loss and improves network performance. HDFS block placement uses rack awareness for fault tolerance by placing one block replica on a different rack, which provides data availability in the event of a network switch failure or partition within the cluster. You will need the help of your network/data center team to share the network topology and how the nodes are spread out across the racks. Once you know the subnets and DC setup, you can use Ambari UI --> Hosts to set the rack topology (see the sketch below). To understand better see HDP rack awareness, and also HCC rack-awareness-series-1 and HCC rack-awareness-series-2. Hope that helps
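For reference, what Ambari configures behind the scenes is a topology script referenced by net.topology.script.file.name in core-site.xml. A minimal sketch (the subnets and rack names are hypothetical examples):

```bash
#!/bin/bash
# Minimal HDFS topology script: maps each host/IP argument to a rack path,
# printing one line per argument and defaulting to /default-rack when unknown.
# Subnets and rack names below are hypothetical examples.
for host in "$@"; do
  case "$host" in
    10.0.1.*) echo "/dc1/rack1" ;;
    10.0.2.*) echo "/dc1/rack2" ;;
    *)        echo "/default-rack" ;;
  esac
done
```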
06-03-2018
06:41 PM
@Adi Jabkowsky Below is the procedure to remove corrupt blocks or files.

1. Locate the files that have corrupt blocks:
$ hdfs fsck / | egrep -v '^\.+$'
or
$ hdfs fsck hdfs://ip.or.host:50070/ | egrep -v '^\.+$'

This lists the affected files: instead of a bunch of dots, the output should include something like the following for each affected file.
Sample output:
/path/to/filename.file_extension: CORRUPT blockpool BP-1016133662-10.29.100.41-1415825958975 block blk_1073904305
/path/to/filename.file_extension: MISSING 1 blocks of total size 15620361 B

2. Determine the importance of each file: can it just be removed and copied back into place, or does it contain sensitive data that needs to be regenerated? You have a replication factor of 1, so analyze carefully.

3. Remove the corrupted file(s). This command moves the corrupted file to the trash, so in case you realize the file is important you still have the option of recovering it:
$ hdfs dfs -rm /path/to/filename.file_extension

Use -skipTrash to permanently delete it if you are sure you really don't need the file:
$ hdfs dfs -rm -skipTrash /path/to/filename.file_extension

4. How to repair a corrupted file if it is not easy to replace:
$ hdfs fsck /path/to/filename.file_extension -locations -blocks -files
or
$ hdfs fsck hdfs://ip.or.hostname.of.namenode:50070/path/to/filename.file_extension -locations -blocks -files

This lets you track down the datanode where the corruption is, look through its logs, and determine what the issue is. Please revert.
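As an alternative to grepping the full fsck output, fsck has options that report (and optionally delete) corrupt files directly; a minimal sketch:

```bash
# List only the corrupt blocks and the files they belong to
hdfs fsck / -list-corruptfileblocks

# Once you are sure the affected files are expendable, fsck can delete the
# corrupted files directly (irreversible, bypasses the trash, use with care)
hdfs fsck / -delete
```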
06-01-2018
09:54 PM
@Pankaj Singh Any updates? If you found an answer that addressed your question, please take a moment to log in and click the "accept" link on that answer.
06-01-2018
02:57 PM
@Pankaj Singh Not really. I usually set up the MySQL databases and test connectivity before the cluster setup.
06-01-2018
02:03 PM
@Pankaj Singh Setting up the cluster through the Ambari admin does NOT also create the MySQL server & Hive server for you. You will need an RDBMS for the Hive metastore service, which stores the metadata for Hive tables and partitions in a relational database. Hive is data warehouse software built on top of Hadoop for providing data summarization, query, and analysis; it gives a SQL-like interface to query data stored in HDFS. All queries go through the Hive metastore, which translates SQL access to this information using the metastore service API. When planning a robust (production) cluster you shouldn't use the Derby database but one of the following: Oracle, MySQL, MS SQL, MariaDB, etc. These databases should be set up before running Ambari or during the Ambari server setup. The following components need a relational database: Ambari, Hive, Oozie, Ranger. You can also enable Hive metastore high availability (HA), with each metastore instance being independent, so that your cluster is resilient to failures due to a metastore becoming unavailable. See the attached HiveMetaHA steps for setting up the metadata databases.
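For reference, a minimal sketch of pre-creating the Hive metastore database and user in MySQL and registering the JDBC driver with Ambari (the database name, user, password, and driver path are hypothetical examples; adjust to your environment):

```bash
# Pre-create the Hive metastore database and user (example values only)
mysql -u root -p <<'SQL'
CREATE DATABASE hive;
CREATE USER 'hive'@'%' IDENTIFIED BY 'StrongPassword1!';
GRANT ALL PRIVILEGES ON hive.* TO 'hive'@'%';
FLUSH PRIVILEGES;
SQL

# Make the MySQL JDBC driver known to Ambari so the Hive service can use it
ambari-server setup --jdbc-db=mysql --jdbc-driver=/usr/share/java/mysql-connector-java.jar
```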