Member since: 03-21-2017
Posts: 197
Kudos Received: 6
Solutions: 3
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 3959 | 07-02-2018 11:25 AM |
| | 1876 | 05-29-2018 07:20 AM |
| | 5426 | 05-09-2018 10:18 AM |
03-31-2018
07:25 AM
@Aditya Sirna In how many days will the exam result be declared?
04-01-2018
04:24 PM
@heta desai Did the answer help resolve your query? If so, please close the thread by marking the answer as Accepted!
02-01-2018
06:42 PM
@heta desai Please refer to the Getting Started section in this document: https://docs.hortonworks.com/HDPDocuments/HDCloudAWS/HDCloudAWS-1.16.5/index.html
01-29-2018
08:00 AM
1 Kudo
While I am not aware of any formula, there is at least a guide available: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_cluster-planning/index.html

In principle it recommends about 24-48 GB of RAM per data node. For the NameNode, 64 GB is supposed to handle about 100 million files. Otherwise, my recommendation for real use (not test or demonstration) would be to start with at least 3 master nodes and 12 slave nodes, and to add slave nodes as your workload grows. A typical MR task uses about 2 GB of RAM, so that provides a rule of thumb for how many slave nodes you should add.

To size more precisely, the expected workload should be specified, i.e. will you run only MR, or also HBase, or do you need stream processing, etc.? The more applications running on the slaves, the more RAM you will probably need beyond the MR tasks, which would result in additional nodes. It is also possible to have separate clusters for stream processing and Hadoop storage.
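To make that rule of thumb concrete, here is a minimal sizing sketch in Java; the 2 GB per task figure comes from the guidance above, while the target task count and per-node RAM are hypothetical example values:

```java
// Rough cluster-sizing sketch based on the rule of thumb above.
// Assumptions (illustrative, not exact): ~2 GB RAM per MR task,
// and a chosen amount of per-node RAM reserved for MR tasks.
public class ClusterSizing {
    public static void main(String[] args) {
        int concurrentTasks = 200;      // desired concurrent MR tasks (example value)
        double gbPerTask = 2.0;         // rule of thumb: ~2 GB RAM per MR task
        double gbPerNodeForMr = 32.0;   // per-node RAM for MR, within the 24-48 GB guide

        double tasksPerNode = gbPerNodeForMr / gbPerTask;
        int nodesNeeded = (int) Math.ceil(concurrentTasks / tasksPerNode);

        System.out.printf("~%.0f tasks per node -> at least %d slave nodes%n",
                tasksPerNode, nodesNeeded);
    }
}
```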
06-22-2017
05:55 AM
1 Kudo
@heta desai Take a look at the demo below, which uses NiFi to fetch the tweets and Spark and Zeppelin for the analysis: https://community.hortonworks.com/articles/30213/us-presidential-election-tweet-analysis-using-hdfn.html
05-29-2017
03:05 PM
1 Kudo
Hi @heta desai You could use HDF (NiFi) as your primary ingestion tool and not necessarily have to worry about the other options. That said, Sqoop is primarily used to move data between an existing RDBMS and Hadoop. Flume was previously the main tool for ingesting flat files, CSVs, etc., but it has fallen out of favour and is often being replaced by HDF/NiFi now. Kafka is a distributed messaging system that can be used as a pub/sub model for data ingest, including streaming. So all three are a bit different. The right tool for the job depends on your use case, but as I said, HDF/NiFi can cover pretty much the whole gamut, so if you are starting out now, you may want to look at that first. Here is another good write-up on the same subject: https://community.hortonworks.com/questions/23337/best-tools-to-ingest-data-to-hadoop.html As always, if you find this post useful, please accept the answer.
05-02-2017
08:43 AM
Your script location is wrong. Start grunt in the directory that contains the script, or provide the full path to the script in the exec command.
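For example, assuming a script named script.pig in a hypothetical directory:

```
-- Start grunt from the directory that contains the script:
--   cd /home/user/scripts && pig
grunt> exec script.pig

-- Or keep your current directory and pass the full path:
grunt> exec /home/user/scripts/script.pig
```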
04-20-2017
08:36 AM
There is no need to store the result in an RDBMS. These libraries are JavaScript libraries; you can pull the result using a Java JDBC program (https://github.com/rajkrrsingh/HiveServer2JDBCSample), set it in some variable, and plot it using these libraries.
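As a minimal sketch of that JDBC step, assuming HiveServer2 is reachable at localhost:10000 and a hypothetical results table with two columns:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        // Hive JDBC driver; the hive-jdbc jar must be on the classpath.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Hypothetical host/port/database; adjust to your cluster.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "user", "");
             Statement stmt = conn.createStatement();
             // Hypothetical table and columns, for illustration only.
             ResultSet rs = stmt.executeQuery("SELECT label, cnt FROM results")) {
            while (rs.next()) {
                // Collect the values here and hand them to your charting layer.
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            }
        }
    }
}
```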
04-05-2017
05:53 AM
@heta desai Copying a snippet from http://stackoverflow.com/questions/16546040/store-images-videos-into-hadoop-hdfs: It is absolutely possible without doing anything extra. Hadoop provides the facility to read/write binary files, so practically anything that can be converted into bytes can be stored in HDFS (images, videos, etc.). For this Hadoop provides SequenceFiles. A SequenceFile is a flat file consisting of binary key/value pairs, and it provides Writer, Reader and Sorter classes for writing, reading and sorting respectively. So you could convert your image/video file into a SequenceFile and store it in HDFS. Some examples: http://www.tothenew.com/blog/how-to-manage-and-analyze-video-data-using-hadoop/ https://content.pivotal.io/blog/using-hadoop-mapreduce-for-distributed-video-transcoding
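A minimal sketch of that approach, writing one local binary file into a SequenceFile on HDFS (all paths are hypothetical, and the hadoop-client dependency is assumed):

```java
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class ImageToSequenceFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Hypothetical local and HDFS paths, for illustration only.
        byte[] bytes = Files.readAllBytes(Paths.get("/tmp/photo.jpg"));
        Path out = new Path("hdfs:///user/heta/images.seq");

        // Key = file name, value = raw bytes of the binary file.
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(out),
                SequenceFile.Writer.keyClass(Text.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {
            writer.append(new Text("photo.jpg"), new BytesWritable(bytes));
        }
    }
}
```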
03-31-2017
09:41 AM
Thank you!