Member since
09-18-2015
216
Posts
208
Kudos Received
49
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 945 | 09-13-2017 06:04 AM |
| | 1896 | 06-27-2017 06:31 PM |
| | 1876 | 06-27-2017 06:27 PM |
| | 8334 | 11-04-2016 08:02 PM |
| | 8980 | 05-25-2016 03:42 PM |
12-16-2015
06:39 PM
1 Kudo
@Vitor Batista
Cool. You don't have enough resources to run so many services efficiently on such a small VM. I had another customer who faced a similar issue in the past. Please accept this answer so that we can close this thread. As a best practice, always accept an answer once it resolves the issue.
12-16-2015
06:29 PM
@Vijay Srinivasaraghavan First of all, Oozie doesn't have any slave components. It has one master component, the Oozie Server, which should be placed on one of the master nodes, and clients, which should be placed on edge/client nodes. The same is true of Hive, which doesn't have any slave components either. As a general recommendation, you can start with an HBase RegionServer, HDFS DataNode and YARN NodeManager on every slave node. This layout usually evolves over time, once you understand your workload, use cases and compute requirements.
12-16-2015
06:26 PM
@rmian Not sure whether this is resolved. If it is, can you please reply so that we can close this?
12-16-2015
06:18 PM
3 Kudos
Looks like there are a couple of good YouTube tutorials for setting up HDP on AWS, along with the blog which @Neeraj Sabharwal mentioned:
http://hortonworks.com/blog/deploying-hadoop-clust...
YouTube videos:
https://www.youtube.com/watch?v=hGwjwBOCa2I&index=...
https://www.youtube.com/watch?v=6-RY4Ll6ABU&list=P...
12-16-2015
05:46 PM
1 Kudo
@Cui Lin I am not an R guy, but these should give you a good starting point, depending on whether you want to use RevR, R or Python.
RHbase tutorials:
https://github.com/RevolutionAnalytics/RHadoop/wik...
http://www.odbms.org/2015/06/intro-to-hbase-via-r-...
http://radar.oreilly.com/2014/08/scaling-up-data-f...
PandaHbase:
https://github.com/livingstonese/pandas-hbase
12-16-2015
04:54 PM
2 Kudos
@Vitor Batista In terms of CPU consumption, how many CPUs are there on this machine? As the attached snapshot shows, there are 12 service components on this one machine, which would explain the 100% CPU consumption. I would be really surprised if AMS by itself consumed all the CPUs, as you mentioned. You can test this theory by moving the AMS Collector to another node that hosts fewer service components. On the HBase issue: although you have not installed HBase on the cluster, Ambari Metrics Service internally installs a single-node HBase cluster on the machine where the AMS Collector is installed. If you kill this HBase process, the AMS service fails as well, as you witnessed. Hope this helps!
12-16-2015
04:35 PM
2 Kudos
@Virendra Agarwal A custom SerDe will work. You can use one like the SerDe explained in this article from @Neeraj Sabharwal: https://community.hortonworks.com/articles/972/hive-and-xml-pasring.html
Look at the Stack Overflow discussion below as well: http://stackoverflow.com/questions/27583736/lines-...
But if at all possible, it would be a great idea to migrate away from character-separated files to a modern format like ORC or Avro. You will gain performance, benefit from complex structures and have a much more future-proof data format. Load the raw data into an external table using the SerDe, but make the final resting place a managed table with an advanced file format. You'll be much happier in the long run.
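To make the suggested pattern concrete (raw files exposed through an external table with a SerDe, then copied into a managed ORC table), here is a minimal sketch that only generates the two HiveQL statements as strings. The function name, table names, columns, SerDe class and HDFS location are all hypothetical placeholders, not taken from the original question; adapt them to your own schema and run the statements through whatever Hive client you use.

```python
def staging_to_orc_statements(raw_table, orc_table, columns, serde_class, location):
    """Return HiveQL for an external staging table plus a managed ORC copy.

    Every argument is an illustrative placeholder supplied by the caller.
    `columns` is a list of (name, hive_type) pairs.
    """
    cols = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)

    # Step 1: external table that reads the raw files in place via the SerDe.
    create_raw = (
        f"CREATE EXTERNAL TABLE {raw_table} (\n  {cols}\n)\n"
        f"ROW FORMAT SERDE '{serde_class}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )

    # Step 2: managed table in ORC format, populated from the staging table.
    create_orc = (
        f"CREATE TABLE {orc_table} STORED AS ORC AS\n"
        f"SELECT * FROM {raw_table}"
    )
    return [create_raw, create_orc]


if __name__ == "__main__":
    statements = staging_to_orc_statements(
        raw_table="raw_events",               # hypothetical table name
        orc_table="events_orc",               # hypothetical table name
        columns=[("id", "INT"), ("payload", "STRING")],
        serde_class="com.example.MyCustomSerDe",  # hypothetical SerDe class
        location="/data/raw/events",          # hypothetical HDFS path
    )
    for stmt in statements:
        print(stmt + ";\n")
```

Keeping the raw table external means dropping it later does not delete the source files, while the ORC table becomes the copy that queries should use.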
12-16-2015
04:10 PM
4 Kudos
@Aidan Condron You can do it in several ways, depending on your requirements:
1. If your data is already in TSV or CSV format, use the included ImportTsv utility and bulk load. See http://hbase.apache.org/book.html#arch.bulk.load for details.
2. If you are using Phoenix with HBase, you can use Phoenix for this: https://phoenix.apache.org/bulk_dataload.html
3. Another option is to use the Hive HBase storage handler. Refer to the link below: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-...
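For option 1, the ImportTsv route is typically two commands: one that writes HFiles, and one that loads them into the table. The sketch below only builds those command lines so you can review them before running anything. The table name, column mapping and HDFS paths are made-up placeholders, and the `LoadIncrementalHFiles` class name assumes an HBase 1.x era install (newer releases relocate it), so check it against your version.

```python
import shlex


def import_tsv_commands(table, columns, input_dir, hfile_dir, separator=None):
    """Build the two command lines for an ImportTsv bulk load.

    All argument values are placeholders supplied by the caller.
    `columns` must contain HBASE_ROW_KEY plus family:qualifier entries.
    """
    prepare = ["hbase", "org.apache.hadoop.hbase.mapreduce.ImportTsv"]
    if separator is not None:
        # ImportTsv splits on tabs by default; override this for CSV input.
        prepare.append(f"-Dimporttsv.separator={separator}")
    prepare += [
        "-Dimporttsv.columns=" + ",".join(columns),
        # With bulk.output set, ImportTsv writes HFiles instead of issuing Puts.
        f"-Dimporttsv.bulk.output={hfile_dir}",
        table,
        input_dir,
    ]

    # Second step: hand the generated HFiles over to the region servers.
    load = [
        "hbase",
        "org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles",
        hfile_dir,
        table,
    ]
    return prepare, load


if __name__ == "__main__":
    prepare, load = import_tsv_commands(
        table="my_table",                          # hypothetical table
        columns=["HBASE_ROW_KEY", "cf:value"],     # hypothetical mapping
        input_dir="/user/me/input",                # hypothetical HDFS path
        hfile_dir="/user/me/hfiles",               # hypothetical HDFS path
        separator=",",
    )
    for cmd in (prepare, load):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Building the commands as argument lists keeps quoting explicit; the same lists can be passed to `subprocess.run` once you are happy with them.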
12-16-2015
03:53 PM
You responded while I was typing 🙂