Member since: 02-12-2016
Posts: 102
Kudos Received: 117
Solutions: 8
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| | 14041 | 03-15-2016 06:36 AM |
| | 16011 | 03-12-2016 10:04 AM |
| | 3442 | 03-12-2016 08:14 AM |
| | 1047 | 03-04-2016 02:36 PM |
| | 2008 | 02-19-2016 10:59 AM |
03-12-2016
10:47 AM
3 Kudos
I got the answer below: Sometimes
data sits inside a tuple or bag, and the Flatten modifier in Pig can
be used to remove that level of nesting.
Flatten un-nests bags and tuples. For a tuple, Flatten substitutes
the fields of the tuple in place of the tuple itself. Un-nesting a
bag is a little more complex because it requires creating new
tuples: one output record for each tuple in the bag.
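
To make the difference concrete, here is a minimal Python sketch that only imitates what Flatten does to a record. It is not Pig code; the helper names `flatten_tuple` and `flatten_bag` and the sample rows are made up for illustration, with a Python tuple standing in for a Pig tuple and a list of tuples standing in for a bag.

```python
def flatten_tuple(record, index):
    """Replace the nested tuple at `index` with its own fields."""
    nested = record[index]
    return record[:index] + tuple(nested) + record[index + 1:]


def flatten_bag(record, index):
    """Emit one new record per tuple in the bag at `index`.

    An empty bag therefore yields no records at all.
    """
    bag = record[index]
    return [record[:index] + tuple(t) + record[index + 1:] for t in bag]


# Hypothetical sample data.
row_with_tuple = ("a", (1, 2), "z")
row_with_bag = ("a", [(1, 2), (3, 4)])

flat_tuple_row = flatten_tuple(row_with_tuple, 1)
flat_bag_rows = flatten_bag(row_with_bag, 1)

print(flat_tuple_row)  # ('a', 1, 2, 'z')
print(flat_bag_rows)   # [('a', 1, 2), ('a', 3, 4)]
```

The tuple case keeps a single record and just widens it, while the bag case multiplies the record, which is why un-nesting bags needs new tuples.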
03-12-2016
10:38 AM
1 Kudo
@Neeraj Sabharwal, thanks for the quick reply.
03-12-2016
10:23 AM
4 Kudos
Hi, can anyone explain what Flatten is used for in Pig?
Labels: Apache Pig
03-12-2016
10:04 AM
2 Kudos
I got the answer below: In
an SMB join in Hive, each mapper reads a bucket from the first table
and the corresponding bucket from the second table, and then a merge
sort join is performed. Sort Merge Bucket (SMB) join is mainly used
because it places no limit on file, partition, or table size. It
works best when the tables are large. For an SMB join, the tables
must be bucketed and sorted on the join columns, and all tables
should have the same number of buckets.
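
The idea can be sketched in plain Python. This is only an illustration of the technique, not how Hive implements it: the function names, the `key % num_buckets` bucketing rule, and the sample tables are all assumptions made for the example.

```python
def bucketize(rows, num_buckets):
    """Split (key, value) rows into buckets and sort each bucket by key."""
    buckets = [[] for _ in range(num_buckets)]
    for key, value in rows:
        buckets[key % num_buckets].append((key, value))
    return [sorted(b, key=lambda r: r[0]) for b in buckets]


def merge_join(left, right):
    """Join two key-sorted lists by walking both in a single pass."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on each side, then pair them up.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    out.append((lk, l_row[1], r_row[1]))
            i, j = i_end, j_end
    return out


def smb_join(left_rows, right_rows, num_buckets):
    """Join bucket n of one table only with bucket n of the other."""
    left_buckets = bucketize(left_rows, num_buckets)
    right_buckets = bucketize(right_rows, num_buckets)
    result = []
    for l_bucket, r_bucket in zip(left_buckets, right_buckets):
        result.extend(merge_join(l_bucket, r_bucket))
    return result


# Hypothetical sample tables keyed by an integer id.
users = [(2, "bob"), (1, "ann"), (4, "dan")]
orders = [(1, "o1"), (2, "o2"), (2, "o3"), (5, "o4")]
joined = smb_join(users, orders, 2)
print(sorted(joined))
```

Because both sides use the same bucketing rule and the same bucket count, matching keys always land in buckets with the same index, so each pair of buckets can be merged without looking at the rest of the table.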
03-12-2016
10:01 AM
@Artem Ervits, thanks for the reply and the link.
03-12-2016
09:19 AM
3 Kudos
Hi, can anyone explain what a Sort Merge Bucket (SMB) join in Hive is, and when it is used?
Labels: Apache Hive
03-12-2016
09:05 AM
1 Kudo
@Artem Ervits, thanks for the reply and for sharing the link.
03-12-2016
09:04 AM
1 Kudo
@Artem Ervits, thanks for sharing this link.
03-12-2016
08:14 AM
3 Kudos
I got the answer below: Apache
Flume can be used with HBase through one of two HBase sinks:
- HBaseSink (org.apache.flume.sink.hbase.HBaseSink) supports secure
HBase clusters and also the new HBase IPC that was introduced in
HBase 0.96.
- AsyncHBaseSink (org.apache.flume.sink.hbase.AsyncHBaseSink) has
better performance than HBaseSink because it makes non-blocking
calls to HBase.

Working of the HBaseSink: In
HBaseSink, a Flume Event is converted into HBase Increments or Puts.
The sink is configured with a serializer class that implements
HBaseEventSerializer, and the serializer is instantiated when the
sink starts. For every event, the sink calls the initialize method
on the serializer, which then translates the Flume Event into the
HBase increments and puts to be sent to the HBase cluster.

Working of the AsyncHBaseSink: AsyncHBaseSink
uses a serializer that implements AsyncHBaseEventSerializer. The
initialize method is called only once by the sink when it starts.
For each event, the sink invokes the setEvent method and then calls
the getIncrements and getActions methods, much like HBaseSink. When
the sink stops, it calls the serializer's cleanUp method.
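
The AsyncHBaseSink call order described above can be mimicked with a small Python sketch. This is not Flume code: the class `RecordingAsyncSerializer`, the function `run_async_sink`, the table and column family names, and the snake_case method names are hypothetical stand-ins, and the relative order of the two getter calls is an assumption.

```python
class RecordingAsyncSerializer:
    """Hypothetical stand-in for an AsyncHBaseEventSerializer implementation.

    It only records which lifecycle methods the sink calls, and when.
    """

    def __init__(self):
        self.calls = []
        self.event = None

    def initialize(self, table, column_family):
        self.calls.append("initialize")

    def set_event(self, event):
        self.event = event
        self.calls.append("setEvent")

    def get_actions(self):
        self.calls.append("getActions")
        return [("put", self.event)]

    def get_increments(self):
        self.calls.append("getIncrements")
        return []

    def clean_up(self):
        self.calls.append("cleanUp")


def run_async_sink(serializer, events):
    """Drive the serializer the way the post describes the sink doing it."""
    serializer.initialize("demo_table", "cf")  # once, when the sink starts
    writes = []
    for event in events:                       # once per Flume event
        serializer.set_event(event)
        writes.extend(serializer.get_actions())
        writes.extend(serializer.get_increments())
    serializer.clean_up()                      # once, when the sink stops
    return writes


serializer = RecordingAsyncSerializer()
writes = run_async_sink(serializer, ["event-1", "event-2"])
print(serializer.calls)
```

The contrast with HBaseSink is that its serializer has initialize called for every event, whereas here the event is handed over through a separate setEvent call and initialize runs only once.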
03-12-2016
08:08 AM
1 Kudo
@Rohan Pednekar, thanks for sharing this link.