Member since: 06-07-2016
Posts: 923
Kudos Received: 322
Solutions: 115
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 4099 | 10-18-2017 10:19 PM |
|  | 4345 | 10-18-2017 09:51 PM |
|  | 14851 | 09-21-2017 01:35 PM |
|  | 1840 | 08-04-2017 02:00 PM |
|  | 2424 | 07-31-2017 03:02 PM |
02-03-2017
08:09 PM
1 Kudo
HDP is Apache Hadoop and its suite of products (HDFS, MapReduce, YARN, ZooKeeper, HBase, Hive, etc.). In a manual install you don't need Ambari; that's why it's all manual. Ambari manages the HDP stack that it installs itself, so if you install manually, you need other tools to monitor and manage the cluster. When you do a manual install and run "yum install hadoop hadoop-hdfs hadoop-libhdfs hadoop-yarn hadoop-mapreduce hadoop-client openssl", where do you think these packages are being installed from? It is the repo that you set up in the following step when you configure the remote HDP repositories, so all of this is actually part of HDP. http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_installing_manually_book/content/config-remote-repositories.html If you have a follow-up question, please add a comment instead of posting a new answer.
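To make that concrete, here is a rough sketch of the manual flow (the repo URL is illustrative; copy the exact one for your OS version from the docs page above):

```bash
# 1. Register the remote HDP repository that yum will install from
#    (URL shown is an example pattern; use the one from the HDP docs)
wget -nv http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.3.4.0/hdp.repo \
  -O /etc/yum.repos.d/hdp.repo

# 2. Confirm yum can now see the HDP repo
yum repolist

# 3. The packages below are then pulled from that HDP repo
yum install hadoop hadoop-hdfs hadoop-libhdfs hadoop-yarn \
  hadoop-mapreduce hadoop-client openssl
```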
02-03-2017
07:04 PM
Starting from version 2.1, you should see better read performance for colocated clients; for writes, not so much. Reads are faster because the client is on the same machine as the data block. If my answer helped, please accept it.
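If the colocated reads here are HDFS reads, one related knob worth knowing is short-circuit local reads, which let a client on the same machine read block files directly from disk instead of going through the DataNode. A hedged hdfs-site.xml sketch (the socket path is a commonly used default; adjust for your install):

```xml
<!-- Sketch: enable short-circuit local reads for colocated clients -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.domain.socket.path</name>
  <value>/var/lib/hadoop-hdfs/dn_socket</value>
</property>
```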
02-03-2017
05:54 PM
Whether the 'edge node' is a datanode? No. If you want, you can put edge processes (client configs, client programs) on the same node as a datanode, but that doesn't make the datanode an edge node. Ideally this is not recommended, but if you have a very small cluster, then sure, there's no problem with that.
02-03-2017
05:51 PM
Is it simply a machine with Hadoop software to facilitate interaction with HDFS? Yes.
02-03-2017
05:29 PM
1 Kudo
@Avijeet Dash
I would recommend reading the following link: http://www.solrtutorial.com/basic-solr-concepts.html First, to answer your question: you cannot keep your data in HBase/HDFS and have SOLR create an index to search that data. SOLR only searches its own index. Here is the concept: data stored in SOLR is called documents (an analogy from the database world: each document is a row in a table). Before you can store data in SOLR, you have to define a schema in a file called schema.xml (similar to a table schema in a database). This is where you specify whether each field (think of a column in a database) is indexed as well as stored. You already understand "indexed", which is what SOLR uses to search. But what exactly is "stored"? Well, are you only going to get back the indexed fields? Assume a document with 50 fields. Maybe you want to search on only 5 of them, and when you get your search results back, you probably want more than the indexed fields. So you get back your stored fields. The more fields you store and index, the higher the storage requirements. Read that link and you'll have a good idea. And to reiterate my earlier point: no, you cannot have data in HDFS/HBase and index it from SOLR. SOLR is a complete solution. SOLR can use HDFS to store its own data and index, but it's not going to create an index on your HBase files or your ORC/text files on HDFS.
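To make the indexed/stored distinction concrete, here is a minimal, hypothetical schema.xml fragment (field names and types are illustrative, not from your use case):

```xml
<!-- Hypothetical schema.xml fragment: each field declares whether SOLR
     indexes it (searchable) and/or stores it (returned in results) -->
<fields>
  <!-- searchable AND returned in results -->
  <field name="title" type="text_general" indexed="true" stored="true"/>
  <!-- searchable, but its raw text is NOT returned (saves storage) -->
  <field name="body" type="text_general" indexed="true" stored="false"/>
  <!-- not searchable, but returned with results (display-only metadata) -->
  <field name="created_on" type="date" indexed="false" stored="true"/>
</fields>
```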
02-03-2017
04:59 PM
1 Kudo
@Ganesan Vetri Like Michael mentions, files are not deleted immediately; they are moved to a trash folder if you did not use the "-skipTrash" option when deleting. You can run "hadoop fs -expunge" explicitly to empty the trash. Even better, your trash is a folder called ".Trash" under your HDFS home directory (/user/<username>/.Trash); just clear that up with the "rm" command and you'll reclaim the space: hdfs dfs -rm -r /user/<username>/.Trash, just like any other path.
See how Trash works for a better understanding:
http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Space_Reclamation
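A quick sketch of the commands above (the trash path assumes the default per-user location; substitute your own username and paths):

```bash
# Force an immediate checkpoint/purge of the current user's trash
hadoop fs -expunge

# Or remove the trash directory directly, bypassing trash itself
hdfs dfs -rm -r -skipTrash /user/<username>/.Trash

# For future deletes where you want the space back immediately
hdfs dfs -rm -r -skipTrash /path/to/folder
```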
02-03-2017
04:42 PM
1 Kudo
Yes, your understanding is correct. The automated and recommended way is to first install Ambari and then set up the cluster through Ambari.
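At a high level, the Ambari-first flow looks like this (a sketch assuming a RHEL/CentOS host with the Ambari repo already registered; see the Ambari install guide for the exact repo URL for your OS):

```bash
yum install ambari-server   # install the Ambari server package
ambari-server setup         # interactive setup (JDK, database, etc.)
ambari-server start         # then open the Ambari UI on port 8080
                            # and run the cluster-install wizard
```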
02-03-2017
04:29 PM
@Avijeet Dash I agree with you. It is much more reliable if, after your streaming job, your data lands in Kafka and is then written to HBase/HDFS. This decouples your streaming job from the writes. I wouldn't recommend using Flume; go with the combination of NiFi and Kafka.
02-02-2017
11:44 PM
@Divakar Annapureddy First things first: yes, it is possible and supported. Here is the link: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_hadoop-ha/content/ch_HA-HiveServer2.html The rest is my personal preference and opinion, so take it based on how you do things in your organization. First, if it's not broken, why fix it? Second, and this may be important depending on your utilization and number of requests per second: Zookeeper is very sensitive to timeouts, especially if the same Zookeeper is being used for things like HBase or even Kafka (Kafka should have its own Zookeeper regardless). That is why one best practice is to give Zookeeper its own dedicated disk. If your namenode is the only thing being managed by Zookeeper, then it's fine, but if you already have HBase or Kafka pointing to the same Zookeeper, why add one more component, especially if what you have is working just fine? As for what others are doing, I am not sure about the Zookeeper approach because I have only seen customers use a load balancer like F5. I can say confidently that the Zookeeper approach is less deployed in industry, probably because it's a newer feature.
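For reference, this is roughly what a client connection looks like with ZooKeeper-based HiveServer2 discovery (hostnames are illustrative; the namespace must match what your HiveServer2 instances register under, "hiveserver2" by default):

```bash
# ZooKeeper picks an available HiveServer2 instance for the client
beeline -u "jdbc:hive2://zk1:2181,zk2:2181,zk3:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
```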
02-02-2017
08:09 PM
I see the following in your hbase-site.xml when I open it: