Member since
04-17-2016
75
Posts
9
Kudos Received
0
Solutions
09-30-2021
12:25 PM
I had the same issue. Check the zookeeper.chroot value, since Kafka and ZooKeeper are tightly integrated. In my case I had set it to /kafka.
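As a rough illustration of where a chroot such as /kafka ends up, here is a minimal sketch that builds a ZooKeeper connection string of the kind Kafka's zookeeper.connect setting takes. The host names and the helper name build_zk_connect are hypothetical; only the /kafka value comes from my setup.

```python
def build_zk_connect(hosts, chroot=""):
    """Join host:port pairs and append the chroot path once, at the end."""
    connect = ",".join(hosts)
    # Normalize the chroot so it contributes exactly one leading slash.
    path = chroot.strip("/")
    if path:
        connect += "/" + path
    return connect


# Hypothetical hosts; the chroot applies to the whole ensemble, not per host.
print(build_zk_connect(["zk1:2181", "zk2:2181"], "/kafka"))
```

If the client and the broker disagree on this suffix, they look under different znode paths, so it is worth comparing the two values first.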
05-16-2019
04:13 AM
@Jeeva Jeeva Try one of the queries below:
select count(*) from <db>.<tab_name>
where date in (select max(date) from <db>.<tab_name>) -- get max date from table
(or)
select count(*) from <db>.<tab_name>
where date = (select max(date) from <db>.<tab_name>)
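To see the subquery pattern in action without a cluster, here is a small sketch that runs the second query against an in-memory SQLite database. SQLite is only a stand-in for the original engine, and the events table and its rows are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical table and sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2019-05-14"), (2, "2019-05-15"), (3, "2019-05-15")],
)


def count_latest(conn):
    # Count only the rows whose date equals the maximum date in the table.
    query = "SELECT count(*) FROM events WHERE date = (SELECT max(date) FROM events)"
    return conn.execute(query).fetchone()[0]


print(count_latest(conn))  # prints 2: two rows share the latest date
```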
03-23-2017
07:31 PM
Hi Mahesh, There is no way to list all members of supergroup. However, we can check whether an individual user belongs to the group by running the command below. hdfs groups <username> Once again, thanks for your time.
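Since the check has to be done one user at a time, a small wrapper can loop over a list of candidate users. This is only a sketch: it assumes the command prints one line of the form "user : group1 group2", and the helper names parse_groups and members_of are hypothetical.

```python
import subprocess


def parse_groups(line):
    """Parse one line of the assumed form 'user : group1 group2'."""
    user, _, groups = line.partition(":")
    return user.strip(), groups.split()


def members_of(group, usernames):
    """Run `hdfs groups <username>` per user and keep those in `group`."""
    members = []
    for name in usernames:
        out = subprocess.run(
            ["hdfs", "groups", name], capture_output=True, text=True, check=True
        ).stdout
        _, groups = parse_groups(out.strip())
        if group in groups:
            members.append(name)
    return members


# The parser can be tried without a cluster on a sample line.
print(parse_groups("alice : supergroup hadoop"))
```

On a node with the HDFS client installed, members_of("supergroup", ["alice", "bob"]) would return the subset of those users that are in the group.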
03-28-2017
02:38 AM
EventTime timezone fix is available in Ranger 0.7.0. https://issues.apache.org/jira/browse/RANGER-1249
03-01-2017
01:31 AM
The list-databases command is not supported for Netezza; please see the special note in the Sqoop user guide: https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_example_invocations_8 The note reads: "This only works with HSQLDB, MySQL and Oracle. When using with Oracle, it is necessary that the user connecting to the database has DBA privileges." I am curious whether --direct mode implements the list-databases option; please try it and report back if you can.
01-25-2017
05:25 PM
Hi Ravi, Thank you very much for your prompt reply; it helped me a lot. I understand the operation now. Once again, thanks.
12-29-2016
12:16 AM
3 Kudos
These tools are used the same way as in any software development lifecycle; the only difference is that the software you develop runs on a Hadoop/Spark cluster. You can still build your jars the same way and use Git as your source code repository. You will submit the job for execution on a distributed cluster, but there are pseudo-clusters for development. For example, you can use the Hadoop mini cluster: https://github.com/sakserv/hadoop-mini-clusters
A good reference on how to use this mini cluster for testing: http://www.lopakalogic.com/articles/hadoop-articles/hadoop-testing-with-minicluster/ For Spark development, you could use Spark standalone.
11-16-2016
10:55 PM
6 Kudos
@Jeeva Jeeva The multithreading programming model and the MapReduce programming model are based on fundamentally different principles, and they are meant to solve different kinds of data storage and processing problems. Multithreading is based on parallelization of processing, whereas Hadoop draws its power from parallelization of data. If you assume that the Hadoop ecosystem is only MapReduce and Spark batch, then your understanding is correct. However, the ecosystem also includes real-time streaming tools such as Apache Storm, which uses multithreading. Even so, modern tools handle these multithreading needs through their architecture and design; their focus is scalability by design rather than laborious programming effort.
References:
http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/
https://www.safaribooksonline.com/blog/2014/01/06/multi-threading-storm
+++ If it helped, pls vote/accept best answer
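To make the "parallelization of data" idea concrete, here is a toy word count written in the map/reduce style. It is a single-machine sketch, not Hadoop code: the partitions and function names are made up, and a thread pool merely simulates the independent workers a cluster would provide.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_partition(lines):
    """Map step: count words within one partition of the data."""
    return Counter(word for line in lines for word in line.split())


def merge_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(partitions):
    # Each partition is processed independently, with no shared mutable state.
    # That independence is what lets a cluster spread partitions across nodes;
    # the threads here only stand in for those workers.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_partition, partitions))
    return reduce(merge_counts, partials, Counter())


# Hypothetical data, already split into three partitions.
partitions = [["a b a"], ["b c"], ["a"]]
print(word_count(partitions))
```

The programmer writes only the map and reduce functions; how the partitions are scheduled is left to the framework, which is the point about scalability by design.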
10-21-2016
10:03 PM
@Jeeva Jeeva You're probably best off posting that as another question (both to get it answered and so it's more searchable). I don't have anything in hand at the moment. Best.
04-29-2016
09:08 PM
Thanks for your reply. I will try this and let you know if it works. I greatly appreciate your effort and time.