Member since 06-29-2016 | 81 Posts | 43 Kudos Received | 1 Solution
09-02-2022 04:52 AM
May I ask one question, please? @gkeys If I buy HDInsight as PaaS, what will the role and responsibilities of a Hadoop admin be, or will the admin job role be removed? Since we can't upgrade Hadoop versions and the service is one-click ready, what else remains? Performance tuning can be done by the developer directly. I hope you understand my concern.
08-07-2019 05:42 PM
With the advent of heterogeneous storage for HDFS, can we now look at NAS in a new light? Potentially, we could label NAS mounts on the DataNodes as archive storage and have HDFS move data there when it becomes cold. I would like to hear opinions on this.
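To make the idea concrete, here is a rough sketch of the steps it would involve, not a tested setup. It assumes the NAS mount is listed in `dfs.datanode.data.dir` with the `[ARCHIVE]` storage type tag, and that cold directories are then given the `COLD` storage policy and migrated with the HDFS mover. The helper name `archive_commands` and the example path are hypothetical; the function only builds the command lines and does not run them.

```python
def archive_commands(cold_paths, policy="COLD"):
    """Build the HDFS CLI calls that tag each path with a storage policy
    and then ask the mover to relocate existing blocks accordingly."""
    commands = []
    for path in cold_paths:
        # Tag the directory so its block replicas should live on ARCHIVE storage.
        commands.append(
            ["hdfs", "storagepolicies", "-setStoragePolicy",
             "-path", path, "-policy", policy]
        )
    if cold_paths:
        # The mover migrates blocks that violate the policy; -p takes a list of paths.
        commands.append(["hdfs", "mover", "-p", *cold_paths])
    return commands


if __name__ == "__main__":
    # Hypothetical cold directory, for illustration only.
    for cmd in archive_commands(["/data/logs/2015"]):
        print(" ".join(cmd))
```

Whether NAS throughput and the loss of data locality make this worthwhile is a separate question that the sketch does not address.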
01-11-2017 08:46 PM
YARN is designed for Hadoop and is very mature and stable. Mesos is much newer, is written in C++, and supports CPU scheduling. This presentation is pretty good: http://www.slideshare.net/mKrishnaKumar1/mesos-vs-yarn-an-overview
01-03-2017 05:49 PM
@learninghuman If this answer helps, please accept it. Otherwise, I'd be happy to answer any remaining questions you have.
Thanks! _Tom
12-30-2016 02:25 PM
@learninghuman You can read more in "Hadoop Azure Support: Azure Blob Storage" in the Apache documentation for Hadoop 2.7.2. You'd need to check with the vendors behind the other distros to see whether they support this.
08-25-2016 08:50 AM
@Tom McCuch Thanks a lot for the views and inputs. It definitely helps.
06-29-2016 01:00 PM
@Benjamin Leonhardi Thanks, makes sense
05-13-2016 07:21 PM
1 Kudo
Yes. Once you specify STORED AS ORC, OrcSerde is what gets used, and it ignores those. The SerDe decides which of the clauses in the CREATE TABLE script it actually uses.
07-11-2018 02:23 PM
Thank you @Krish E. Have you sorted it out yet? I am having the same issue. What is your table's size?
03-25-2016 08:01 PM
@Joseph Niemiec You mentioned "Left outerjoin and test for null in the WHERE is probably better for scaling then UNION DISTINCT if you are worried about a reducer problem. Same join syntax as the example below..." How does a left outer join avoid the reducer (unless it's a map join)? Do you recommend a left outer join over UNION DISTINCT? And on the point "We have found a fun case where if you try to use this to dedupe or clean.....", my understanding is that if a partition has 5 duplicate records (already present in the initial master load), there is no way to remove them unless a 6th record that duplicates those 5 arrives in the staging load. Am I right? If so, what is your recommendation for removing duplicates in the initial load itself?
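For readers comparing the two approaches, here is a minimal sketch of what each query returns. It runs on SQLite as a stand-in for Hive, so it says nothing about reducers or execution plans. The table names `master` and `staging`, the columns, and the sample rows are all assumptions made for illustration, and SQLite's plain `UNION` is used because it has distinct semantics.

```python
import sqlite3


def merge_results():
    """Compare a LEFT OUTER JOIN / IS NULL filter with UNION (distinct)."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE master (id INTEGER, val TEXT)")
    cur.execute("CREATE TABLE staging (id INTEGER, val TEXT)")
    # The master table already holds a duplicate of row 1 from the initial load.
    cur.executemany("INSERT INTO master VALUES (?, ?)",
                    [(1, "a"), (1, "a"), (2, "b")])
    cur.executemany("INSERT INTO staging VALUES (?, ?)",
                    [(2, "b"), (3, "c")])

    # Approach 1: keep only staging rows that have no match in master.
    anti_join = cur.execute(
        """
        SELECT s.id, s.val
        FROM staging s
        LEFT OUTER JOIN master m ON s.id = m.id AND s.val = m.val
        WHERE m.id IS NULL
        ORDER BY s.id
        """
    ).fetchall()

    # Approach 2: de-duplicate across both tables in one pass.
    union_distinct = cur.execute(
        """
        SELECT id, val FROM master
        UNION
        SELECT id, val FROM staging
        ORDER BY id
        """
    ).fetchall()
    conn.close()
    return anti_join, union_distinct


if __name__ == "__main__":
    new_rows, merged = merge_results()
    print(new_rows)  # [(3, 'c')]
    print(merged)    # [(1, 'a'), (2, 'b'), (3, 'c')]
```

In this toy data the join returns only the new staging row, so appending it to `master` would leave the existing duplicate of row 1 in place, while the union collapses duplicates across the whole result.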