Member since: 12-06-2017
Posts: 15
Kudos Received: 3
Solutions: 0
09-26-2018
07:05 AM
Hi @Srikanth, you can try the approach below. I created a sample table in Hive and ran the following query to get the expected result:
create table test1(store INT, items STRING);
insert into table test1 values(22, '1001 abc, 1002 pqr, 1003 tuv');
insert into table test1 values(33, '1004 def, 1005 xyz');
select store, split(item, ' ')[0] as item_id, split(item, ' ')[1] as item_name from test1 lateral view explode(split(items, ', ')) vExplodeTbl as item;
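For reference, on the sample rows above that query should produce one output row per item, along these lines (store, item_id, item_name):
22  1001  abc
22  1002  pqr
22  1003  tuv
33  1004  def
33  1005  xyz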
04-05-2018
10:04 AM
I have a large 5 GB file that has detailed information about employees, and I also have a small 2 MB file that has only employee names. I want to extract the employee names from the smaller file and use those names to do the analysis on the larger file. How can I do this in MapReduce?
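A common pattern for this kind of big-file/small-file combination is a map-side join: ship the 2 MB names file to every mapper through the distributed cache and filter or enrich the 5 GB file in the map phase, with no reduce phase at all. The sketch below is only illustrative; the paths, file name "names.txt", and the tab-delimited layout with the name in the first column are assumptions, not details from this thread.

// Map-side join sketch (hypothetical paths and record layout).
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class EmployeeJoinMapper extends Mapper<LongWritable, Text, Text, Text> {

    private final Set<String> names = new HashSet<>();

    @Override
    protected void setup(Context context) throws IOException {
        // "names.txt" is the symlink created by the "#names.txt" fragment in the driver below
        try (BufferedReader reader = new BufferedReader(new FileReader("names.txt"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                names.add(line.trim());
            }
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Assumption: the 5 GB file is tab-delimited with the employee name in the first column
        String[] fields = value.toString().split("\t", -1);
        if (names.contains(fields[0])) {
            context.write(new Text(fields[0]), value);   // keep only matching employees
        }
    }
}

// Driver side (only the relevant lines):
//   Job job = Job.getInstance(conf, "employee map-side join");
//   job.addCacheFile(new URI("/user/hdp/employee_names.txt#names.txt"));
//   job.setNumReduceTasks(0);   // filtering happens entirely in the map phase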
Labels:
- Apache Hadoop
03-06-2018
05:50 PM
@Shu How is the number of mappers/reducers for a given query decided at runtime? Does it depend on how many join, group by, or order by clauses are used in the query? If yes, then please let me know how many mappers and reducers are launched for the below query. select name, count(*) as cnt from test group by name order by name;
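As a rough illustration (a sketch assuming Hive running on MapReduce with default settings, not an answer taken from this thread): the mapper count roughly follows the number of input splits, the reducer count is estimated from the input size, and a global ORDER BY runs in a single reducer, so this query typically compiles into two MR stages.

-- Settings that influence the counts (the values shown are only examples):
set hive.exec.reducers.bytes.per.reducer=268435456;  -- roughly 256 MB of input per reducer
set hive.exec.reducers.max=1009;                     -- upper bound on the estimated reducer count
-- set mapreduce.job.reduces=4;                      -- or force an explicit reducer count

-- EXPLAIN shows the actual plan: stage 1 does the GROUP BY (mappers over the splits, N reducers),
-- stage 2 does the global ORDER BY (a single reducer).
explain select name, count(*) as cnt from test group by name order by name;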
02-22-2018
04:34 AM
@Jordan Moore Thanks for the update! We are also working in the same fashion you described, but I thought that other companies might be following agile/scrum methodologies for Hadoop development. Also, I have one more question: how are stand-up meetings and client interactions handled in big data projects?
02-21-2018
04:38 AM
Hi, I'm trying to implement a Hadoop project and I'm researching how the SDLC workflow applies to a Hadoop project. Thanks,
Labels:
- Apache Hadoop
12-14-2017
12:46 PM
@Jordan Moore Thanks for the suggestion. Can you please let me know how logs from different servers are collected in real-time projects? If you know of any link, please share it.
12-13-2017
12:50 PM
1 Kudo
Thanks for the clarification, @Shu.
12-12-2017
10:50 AM
1 Kudo
Hi, I want to know how Flume is useful for streaming log files in real time. I have practiced importing files through the 'exec' source, but I want to know which sources are used for Flume streaming in real-time projects. Please help me clear up this doubt. Thanks,
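As a rough illustration only (the agent name, file paths, and HDFS locations below are made up, not taken from this thread): real-time setups often replace the exec source with a TAILDIR or spooldir source feeding an HDFS sink through a channel, and logs from many servers are usually fanned in through avro sinks/sources into a collector agent. A minimal single-agent properties file might look like this:

# flume-agent.conf -- illustrative sketch, all names and paths are hypothetical
agent1.sources  = logsrc
agent1.channels = memch
agent1.sinks    = hdfssink

# Tail an application log as new lines arrive (TAILDIR source, Flume 1.7+)
agent1.sources.logsrc.type = TAILDIR
agent1.sources.logsrc.filegroups = f1
agent1.sources.logsrc.filegroups.f1 = /var/log/app/app.log

# Buffer events in memory between source and sink
agent1.channels.memch.type = memory
agent1.channels.memch.capacity = 10000

# Land the events in HDFS, partitioned by day
agent1.sinks.hdfssink.type = hdfs
agent1.sinks.hdfssink.hdfs.path = /flume/applogs/%Y-%m-%d
agent1.sinks.hdfssink.hdfs.fileType = DataStream
agent1.sinks.hdfssink.hdfs.useLocalTimeStamp = true

# Wire the pieces together
agent1.sources.logsrc.channels = memch
agent1.sinks.hdfssink.channel = memch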
Labels:
- Apache Flume
- Apache Hadoop
12-12-2017
10:43 AM
1 Kudo
@Shu I have one doubt: if we change the contents of a file, will this affect the metadata stored on the NameNode? What happens if we keep appending data to the same file on a daily basis? Also, if we append large files, will this reduce performance? Do you recommend appending data to the existing file or creating a new file? Thanks,
12-11-2017
10:31 AM
Hi, I have a text file stored in HDFS and I want to append some rows to it. How can I complete this task? Thanks,
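For illustration (a minimal sketch, assuming a hypothetical file at /data/input/sample.txt and a Hadoop 2.x cluster where append is enabled, which is the default): the FileSystem.append() API adds bytes to the end of an existing file; from the shell, hdfs dfs -appendToFile localrows.txt /data/input/sample.txt does the same thing.

// Minimal HDFS append sketch (the path is hypothetical).
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsAppend {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml from the classpath
        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.append(new Path("/data/input/sample.txt"))) {
            // Write the new rows onto the end of the existing file
            out.write("new row 1\nnew row 2\n".getBytes(StandardCharsets.UTF_8));
        }
    }
}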
Labels:
- Apache Hadoop