About sunile_manjee

Manus · 12-28-2016

Hi Sunile, could you please attach the JSON structure you used to create an entity in Apache Atlas? I want to create a Hive table entity along with two column entities in it. How can I do that using the REST API? Please post an example with the full command and JSON body structure if you have one.

qiwang · 08-24-2016

After starting the sandbox, run the following to check the status of Ambari. If either of them is not running, start it again and you should be fine:
ambari-agent status
ambari-server status
ambari-server start
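
The check-then-start routine above could be wrapped in a small helper. This is only a sketch: the function name `ensure_running` is made up for illustration, and it assumes that `<daemon> status` exits non-zero when the daemon is stopped, which is worth confirming on your sandbox version.

```shell
#!/bin/sh
# Hypothetical helper: start an Ambari daemon only if its status check fails.
# Assumption: "<daemon> status" exits non-zero when the daemon is not running.
ensure_running() {
    daemon="$1"
    if "$daemon" status >/dev/null 2>&1; then
        echo "$daemon is running"
    else
        echo "$daemon is not running, starting it"
        "$daemon" start
    fi
}
```

On the sandbox you would typically call `ensure_running ambari-server` and then `ensure_running ambari-agent` as root.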

sunile_manjee · 08-23-2016

@Ayub Pathan I will try soon. I tried hundreds of ways last night and none of the combos worked. Will update today.

myoung · 08-22-2016

@Sunile Manjee Have you seen this article on tuning?
https://community.hortonworks.com/articles/38591/hadoop-and-ldap-usage-load-patterns-and-tuning.html
This article provides good background on the performance scaling of LDAP:
http://researchweb.watson.ibm.com/people/d/dverma/papers/sigmetrics2001.pdf

jeff1 · 08-18-2016

Hi, it currently is not. You can manually scale up or down, but you cannot set up auto-scaling via Hortonworks Data Cloud. We are considering this a roadmap item. Thanks.

sunile_manjee · 08-17-2016

@jbarnett You install the client when you need to interface with a service (HBase, Hive, YARN, etc.). Typically in cluster setups you dedicate one node, called an "edge node", where you install all your client libraries. This then becomes your single entry point for running your services. You can add more edge nodes to scale out accordingly. As @Constantin Stanca explained, it simply installs the client libraries for your specific version of Hadoop and its services, which makes things very easy for the end user. Hope that helps.

sunile_manjee · 08-16-2016

I am a junkie for faster and cheaper data processing, which is exactly why I love IaaS. My personal real-world experience with the typical IaaS providers has been generally slow performance. That is not to say Hadoop/HBase/Spark/etc. jobs will not perform; however, you need to be familiar with what you're getting into and set realistic expectations. Recently I met the IaaS vendor Bigstep. Their liquid metal offering provides all the greatness of bare-metal on-prem installations, but in the cloud. Options for bonded NICs and DAS had me at hello. I decided to run the same performance test I ran on AWS (article here) on Bigstep. All the details of the scripts I ran are in that article.

Just a quick note: these performance articles do not advocate for or against any specific IaaS provider, nor do they reflect on the HDP software. I simply want to run a repeatable processing test with similar IaaS hardware profiles and gather performance statistics. Interpret the numbers as you wish.

1x Master Node Hardware Profile
CPU: 2 x Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz (8 x 2.40 GHz)
RAM: 128 GB DDR3 ECC
Local storage disks: 1 NVMe
Disk size: 745 GB
Network bandwidth: 40 Gbps

3x Data Nodes Hardware Profile
CPU: 2 x Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz (8 x 2.40 GHz)
RAM: 256 GB DDR3 ECC
Local storage disks: 12 HDD
Disk size: 1863 GB
Network bandwidth: 40 Gbps

Teragen results: 11 mins 49 secs
I want to remain as objective as possible, but wow. That is simply one of the fastest Teragen results I have ever seen.

TeraSort results: 51 mins 12 secs
Fastest I have seen on the cloud so far. On-prem with 1 additional node I was able to get it down to 40 mins, so 51 mins on 1 fewer node is pretty good.

TeraValidate results: 4 mins 42 secs
This again was the fastest performance I have seen on 1 TB using TeraValidate.

I hope this helps with some basic insights into the similar tests I have performed so far on various IaaS providers. In the coming weeks/months I plan on publishing performance test results using Azure and GCP. It is extremely important to understand that zero performance tweaking has been done. Nor does this reflect how HDP runs on IaaS providers, or anything about the IaaS provider itself. I simply want to run the Teragen/TeraSort/TeraValidate tests with minimum tweaking, the same parameters, and similar hardware profiles, and document the results. That's it. Keep it simple.
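
For readers who want to compare these timings with other runs, a rough way to turn them into throughput is sketched below. It assumes each phase processed 1 TB, taken as 1,000,000 MB in decimal units (the post mentions 1 TB only for TeraValidate), and the helper name `throughput_mb_s` is made up for illustration. Shell arithmetic is integer-only, so the results are truncated to whole MB/s.

```shell
#!/bin/sh
# Assumption: each phase processed 1 TB, taken here as 1,000,000 MB (decimal).
DATASET_MB=1000000

# Print whole-number MB/s for a run that took <minutes> <seconds>.
throughput_mb_s() {
    elapsed=$(( $1 * 60 + $2 ))
    echo $(( DATASET_MB / elapsed ))
}

# Timings as reported in the post above.
echo "teragen:      $(throughput_mb_s 11 49) MB/s"
echo "terasort:     $(throughput_mb_s 51 12) MB/s"
echo "teravalidate: $(throughput_mb_s 4 42) MB/s"
```

These are aggregate figures across the three data nodes and say nothing about where the time was spent in each phase.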

bob_heckel · 08-23-2017

I should have mentioned I was using VirtualBox v5.1.14

bleonhardi · 08-12-2016

Just use doAs=true, make sure only hive can read the warehouse folder, and you are done. The Hive CLI can start but cannot access anything.

emaxwell · 08-11-2016

@Sunile Manjee As @SBandaru states, you will need to make sure that proper group membership is maintained for the non-standard users. If you specify the users at cluster creation time, Ambari will take care of this for you. If you create them after the fact, then you will need to verify group membership. You may also need to modify the auth_to_local filters if the non-standard users are in AD/LDAP and you need to map them to local users. Another thing to consider is if you run the Ambari agent as non-root. There are a number of sudo rules that need to be put in place for the ambari user that allow execution of commands as the various service accounts for purposes of starting/stopping the services, installing packages, etc. You'll need to modify the customizable users sudo entry to suit your environment.

Last Visited	05-25-2022 10:07 AM

Member Since	05-30-2018 10:40 PM
Posts	1,322
Kudos received	713

Cloudera Community

Re: Iterate over ADLS files using spark?

Re: Install NiFi CA service post nifi cluster inst...

Re: Which storage format is optimum for training m...

Re: Ambari custom alert failing

Re: df.cache() is not working on jdbc table

Re: create atlas entity via rest api

Re: Ambari Not Working after sandbox Poweroff and ...

Re: Atlas create type via rest api

Re: how many ldap servers for hadoop authenticatio...

Re: Autoscaling on Hortonworks Connected Data Clou...

Re: When to install Hadoop clients

Teragen, Terasort, and Teravalidate Performance te...

Re: What is the best way to shut down Hortonworks ...

Re: How to block Hive CLI access?

Re: Using non default hdp service accounts, what s...