Member since: 08-23-2016
Posts: 62
Kudos Received: 44
Solutions: 8
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 20763 | 11-13-2016 06:24 PM |
| | 897 | 11-13-2016 04:21 PM |
| | 1682 | 11-13-2016 03:58 PM |
| | 8968 | 10-05-2016 04:45 AM |
| | 1906 | 09-29-2016 06:42 AM |
08-20-2019
04:40 AM
The link https://community.mapr.com/docs/DOC-1215 is no longer working.
01-05-2017
03:07 PM
Hi, I was able to resolve the issue. The disk utilization of the local directory (where the log and .out files are created) on one of the nodes was higher than the yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage setting.
I freed up some space and also set the max-disk-utilization-per-disk-percentage to a much higher value. Thanks, Aparna
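For reference, the property lives in yarn-site.xml; a minimal sketch (the 95.0 value below is only illustrative, not necessarily the value I used):
<!-- Sketch only: raise the disk health checker threshold; 95.0 is an illustrative value -->
<property>
  <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
  <value>95.0</value>
</property>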
01-28-2017
06:20 AM
For the backtick approach, you might want to try `header`.`timestamp`
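For example, assuming header is a struct column (or table alias) with a timestamp field, a sketch of the query would be:
-- my_events is a placeholder table name; each identifier is backticked separately because timestamp is a reserved word
select `header`.`timestamp`
from my_events;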
07-01-2017
03:17 PM
The right way to think about LATERAL VIEW is that it allows a table-generating function (UDTF) to be treated as a table source, so that it can be used like any other table in selects, joins and more.
LATERAL VIEW is often used with explode, but explode is just one UDTF of many; a full list is available in the documentation.
To take an example:
select tf1.*, tf2.*
from (select 0) t
lateral view explode(map('A',10,'B',20,'C',30)) tf1
lateral view explode(map('A',10,'B',20,'C',30)) tf2;
This results in:
| tf1.key | tf1.value | tf2.key | tf2.value |
|---|---|---|---|
| A | 10 | A | 10 |
| A | 10 | B | 20 |
| A | 10 | C | 30 |
| B | 20 | A | 10 |
(5 rows were truncated)
The thing to see here is that this query is a cross product join between the tables tf1 and tf2. The LATERAL VIEW syntax allowed me to treat them as tables. The original question used "AS" syntax, which automatically maps the generated table's columns to column aliases. In my view it is much more powerful to leave them as tables and use their fully qualified table correlation identifiers.
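For contrast, a sketch of the AS form on the first view (the alias names k and v are my own choice):
-- AS maps the UDTF output to column aliases instead of leaving it as a qualified table
select tf1.k, tf1.v
from (select 0) t
lateral view explode(map('A',10,'B',20,'C',30)) tf1 as k, v;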
These tables can be used in joins as well:
select tf1.*, tf2.*
from (select 0) t
lateral view explode(map('A',10,'B',20,'C',30)) tf1
lateral view explode(map('A',10,'B',20,'C',30)) tf2 where tf1.key = tf2.key;
Now we get:
| tf1.key | tf1.value | tf2.key | tf2.value |
|---|---|---|---|
| A | 10 | A | 10 |
| B | 20 | B | 20 |
| C | 30 | C | 30 |
07-03-2018
05:25 AM
In HBase, create a namespace first and then create the table inside that namespace to avoid this error.
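A minimal HBase shell sketch (the namespace, table, and column family names are placeholders):
# From the HBase shell; 'my_ns', 'my_table' and 'cf' are placeholder names
create_namespace 'my_ns'
create 'my_ns:my_table', 'cf'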
09-26-2016
06:05 PM
6 Kudos
@Arkaprova Saha It depends on how you see yourself and your future. If you consider yourself a software engineer with a solid Java background who wants to deliver highly optimized and scalable software products based on Spark, then you may want to focus more on Scala. If you are more focused on data wrangling, discovery and analysis, short-term studies, or resolving business problems as quickly as possible, then Python is awesome. Python has such a large community, with plenty of code snippets, applications, etc. Don't get me wrong, Python can also be used to deliver enterprise-level applications, but Java and Scala are more often used when heavy optimization is needed. Python has some drawbacks, which we will not debate here. Anyhow, I would say that Python is kind of a MUST HAVE and Scala is NICE TO HAVE. Obviously, this is my 2c, and I would be amazed if any of the responses in this thread were THE answer.
09-22-2016
02:39 PM
1 Kudo
Hi @Mahesh Mallikarjunappa, A Flume agent typically listens to a log file (or another source) until it is stopped by an operator, which is where the "long-lived process" comes in. For more info on Flume, please visit https://flume.apache.org/FlumeUserGuide.html. Best regards, Mats
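As a rough sketch only (the agent name a1, the log path, and the logger sink are placeholder choices; see the user guide for real configurations), a tail-a-logfile agent configuration looks like this:
# flume.conf sketch: one exec source tailing a log file, a memory channel, and a logger sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app.log
a1.sources.r1.channels = c1
a1.channels.c1.type = memory
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1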