About rakesh_an1992

mlamairesse · ‎09-26-2018

@Srikanth t The easiest approach is to use lateral views (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView), which let you split an array into multiple rows.

1. Let's create an array from the items in your column "items":

    select key, split(items, ',') as valArray from test;

Result:

    +------+---------------------------------------+--+
    | key  | _c1                                   |
    +------+---------------------------------------+--+
    | 22   | ["1001 abc"," 1002 pqr"," 1003 tuv"]  |
    | 33   | ["1004 def"," 1005 xyz"]              |
    +------+---------------------------------------+--+

2. Now let's use a lateral view to split these items into rows (using "trim" to clean up the leading space):

    select key, trim(uniqueVal)
    from (
      select key, split(items, ',') as valArray from test
    ) a
    lateral view explode(a.valArray) exploded as uniqueVal;

    +------+-----------+--+
    | key  | _c1       |
    +------+-----------+--+
    | 22   | 1001 abc  |
    | 22   | 1002 pqr  |
    | 22   | 1003 tuv  |
    | 33   | 1004 def  |
    | 33   | 1005 xyz  |
    +------+-----------+--+

3. Finally, let's use split again to get separate values:

    select key,
           split(trim(uniqueVal), ' ')[0],
           split(trim(uniqueVal), ' ')[1]
    from (
      select key, split(items, ',') as valArray from test
    ) a
    lateral view explode(a.valArray) exploded as uniqueVal;

    +------+-------+------+--+
    | key  | _c1   | _c2  |
    +------+-------+------+--+
    | 22   | 1001  | abc  |
    | 22   | 1002  | pqr  |
    | 22   | 1003  | tuv  |
    | 33   | 1004  | def  |
    | 33   | 1005  | xyz  |
    +------+-------+------+--+

Note: I used the following to create the table:

    create table test (key string, items string) STORED AS ORC;
    INSERT INTO test (key, items) VALUES
      (22, '1001 abc, 1002 pqr, 1003 tuv'),
      (33, '1004 def, 1005 xyz');
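For readers who want to check the logic outside Hive, the split, explode, split sequence from the answer above can be mirrored in plain Python. This is only an illustrative sketch, not Hive code: the function name `explode_items` and the in-memory `rows` list are assumptions made for the example, and the sample data is copied from the INSERT statement in the answer.

```python
# Plain-Python sketch of the Hive pipeline: split -> explode -> trim -> split.
# `rows` stands in for the Hive table; `explode_items` is a hypothetical helper.
rows = [
    ("22", "1001 abc, 1002 pqr, 1003 tuv"),
    ("33", "1004 def, 1005 xyz"),
]


def explode_items(rows):
    """Yield one (key, id, name) tuple per comma-separated item."""
    for key, items in rows:
        # Equivalent of split(items, ',') followed by lateral view explode(...)
        for item in items.split(","):
            # Equivalent of split(trim(uniqueVal), ' ')[0] and [1]
            parts = item.strip().split(" ")
            yield key, parts[0], parts[1]


result = list(explode_items(rows))
for row in result:
    print(row)
```

Each input row produces as many output rows as it has comma-separated items, which is the same fan-out that `lateral view explode` gives in Hive.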

schhabra1 · ‎04-19-2018

@Rakesh AN If the above information helped you, could you please accept the answer?

JordanMoore · ‎02-22-2018

@Rakesh AN I have worked for at least three companies trying to follow Agile/Scrum, and their code development cycles do follow it. It's hard, though, to upgrade hundreds of Hadoop nodes and software versions and make sure they all work with the other components of the cluster, without breaking other pieces, in two-week sprints. Stand-up meetings are all about perception management between team members and management. Again, it makes no difference whether it is Hadoop development, web or mobile development, etc.

JordanMoore · ‎12-14-2017

@Rakesh AN I have not used Flume in a distributed fashion, but whatever agent you choose, it tails the logs on the server where it runs, then ships them to the configured sink destinations. Running one agent per server lets you collect from different servers. Flume is near real-time, since it is configured with a batch size. It's not clear what doubt you have... Can you please explain how you've configured your Flume agents, and the issues you are experiencing? The Flume documentation is fairly straightforward.

rakesh_an1992 · ‎12-13-2017

Thanks for the clarification @Shu

rakesh_an1992 · ‎12-11-2017

@Jay Kumar SenSharma What should the cluster size be in the Development, Test and Production environments if I want to process 10 TB of data on a daily basis? Also, how can this be managed using Oozie on a daily basis?

rakesh_an1992 · ‎03-06-2018

@Shu How is the number of mappers/reducers for a given query decided at runtime? Is it dependent on the number of joins, group by or order by clauses used in the query? If yes, then please let me know how many mappers and reducers are launched for the query below. select name, count(*) as cnt from test group by name order by name;

Last Visited	‎10-28-2019 02:51 AM

Member Since	‎12-06-2017 06:06 AM
Posts	15
Kudos received	3

Cloudera Community

Re: Hive Split for columns

Re: Writing a Map reduce code with larger and smal...

Re: How Hadoop works with agile and scrum methodol...

Re: what are the different sources used in real-ti...

Re: Can I change the contents of a file present in...

Re: what is the production environment in Hadoop? ...

Re: Hive queries use only mappers or only reducers