Member since
04-06-2016
47
Posts
7
Kudos Received
4
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2458 | 12-02-2016 10:20 PM |
| | 1324 | 11-23-2016 08:59 PM |
| | 258 | 07-26-2016 03:11 AM |
10-04-2019
04:19 PM
In case you get the error below, make sure you use the NiFi host FQDN in the API call and NOT the IP address. Also, make sure DNS is configured correctly. <body><h2>HTTP ERROR 401</h2>
<p>Problem accessing /nifi-api/access/kerberos. Reason:
<pre> Unauthorized</pre>
02-17-2017
12:11 PM
Try increasing the heap size for the Metastore. Also make sure the DB connection is working fine.
02-16-2017
05:59 AM
@rahul gulati Which version of Ambari? Is your cluster Kerberized? Is Ambari SSL enabled? Is this on a local Ambari cluster? Can you share all the settings from your File View?
01-26-2017
04:05 PM
I guess the easiest option would be to have a lookup table with all processor name records; then you can use something like the below.

SELECT b.processor_name, count(*) cnt
FROM process_types_with_count_table a, processor_lookup_table b
WHERE instr(b.processor_name, a.processor_name) > 0
GROUP BY b.processor_name;

This is not very efficient since you are joining another table here, but it will do the job. The other option, if your processor names follow a pattern, is to use a regex in the Hive query to get only the processor_name you want.
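As a sketch of the regex option (the table and column names here are the hypothetical ones from the query above), regexp_extract can pull out just the leading alphabetic token as the processor name:

```sql
-- Sketch: keep only the leading alphabetic token of processor_name
SELECT regexp_extract(processor_name, '^([A-Za-z]+)', 1) AS processor_name,
       count(*) AS cnt
FROM process_types_with_count_table
GROUP BY regexp_extract(processor_name, '^([A-Za-z]+)', 1);
```

Adjust the pattern to whatever naming convention your processor names actually follow.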
01-26-2017
02:18 PM
Is the processor name always a 4-letter word? If yes, you can use substr to get only the first letters from the processor name and group by that. If the processor name length varies, then look for the set of characters that comes after the processor name and eliminate anything after those characters. Here you can use a mix of the instr and substr Hive functions.
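A minimal sketch of the instr/substr combination, assuming a hypothetical table and assuming an underscore marks the end of each processor name:

```sql
-- Hypothetical: names like 'ABCD_123'; trim everything from the '_' onward
SELECT substr(processor_name, 1, instr(processor_name, '_') - 1) AS name_only,
       count(*) AS cnt
FROM process_types_with_count_table
GROUP BY substr(processor_name, 1, instr(processor_name, '_') - 1);
```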
12-02-2016
10:20 PM
1 Kudo
@Manish Gupta try adding hive-metastore.jar as well to the SQuirreL jar list.
11-30-2016
02:18 AM
You can try a Hive windowing function. Something like the below.

select tdate, var,
       max(var) over (order by tdate rows between current row and 30 following) maxvar
from testwindow;

You can also include a "PARTITION BY" clause if you need to group it by some other column(s). HTH
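For the PARTITION BY variant, here is a sketch with a hypothetical grouping column added:

```sql
-- 30-row forward-looking max, computed separately per category (hypothetical column)
SELECT tdate, var,
       max(var) OVER (PARTITION BY category
                      ORDER BY tdate
                      ROWS BETWEEN CURRENT ROW AND 30 FOLLOWING) AS maxvar
FROM testwindow;
```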
11-23-2016
08:59 PM
It depends on what arguments you are providing to the hash function. If your argument values are unique, you will most likely get a unique value from hash. Keep in mind the Hive hash function returns an int (which is 32-bit), so you may see negative numbers as well. You can use something like reflect('java.util.UUID','randomUUID') to generate a unique ID, or come up with some unique code of your own. I would not suggest using the hash function if you want to generate unique IDs.
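A minimal sketch of the reflect approach (the table name is hypothetical):

```sql
-- reflect invokes the static Java method and returns its result as a string,
-- so each row gets a random UUID
SELECT reflect('java.util.UUID', 'randomUUID') AS row_uuid, t.*
FROM some_table t;
```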
11-23-2016
06:06 PM
I assume you already have the jar file that you want to use. You can use "add jar <jarfilename>" in a Hive session, or you can:
1. Create /usr/hdp/<hdp-version>/hive/auxlib/ on all HiveServer2 nodes.
2. Copy your .jar files to the new folder on all HiveServer2 nodes.
3. Restart HiveServer2.
09-19-2016
11:20 PM
Thanks @Constantin Stanca. You won't be able to RENAME a Hive table if there are queries using that table, as it would be locked. So this won't work. I am looking at some other approaches as well.
09-15-2016
04:03 PM
2 Kudos
I am working on a very common problem: refreshing reference tables in Hive using a wipe-and-load methodology. I want to know if there is a way to refresh the data without killing all read locks (queries reading data) or waiting for all reads to finish in order to acquire an exclusive lock. The use case is that some queries may run for a few hours and the refresh job needs to run in between. Is there a way that all the running queries keep using the current snapshot of the data while the refresh job completes without waiting? New queries should get the new data set. These tables are not partitioned. I was thinking about using ACID, but it doesn't support MERGE yet, or values from another table in an UPDATE statement.
08-04-2016
03:29 AM
Good point. So we can ask every user to submit jobs to a specific queue. There is no other way, right?
08-04-2016
03:16 AM
Assuming most of the users will be running Hive queries to access the data, is it possible to control resource utilization by different queues in a multi-tenant environment when Hive doAs is set to false? Since all the Hive queries will be running as the hive user and in one queue, does setting up multiple queues to control resource utilization make any sense? I read http://hortonworks.com/blog/best-practices-for-hive-authorization-using-apache-ranger-in-hdp-2-2/, but this makes me believe we should set doAs=true and handle authorization through both Ranger Hive + HDFS policies, NOT only Hive policies. And we can't even use "SQL Standard Authorization" if we need to set up multiple queues to control resource utilization. Am I missing anything here?
08-03-2016
03:49 AM
Try running the set; command. It should display the values of all the variables in the current session.
07-29-2016
04:27 PM
Hi Upendra, the recommendation is to use VARCHAR and integer types (TINYINT, SMALLINT, INT, BIGINT) wherever possible instead of STRING. In Hive, STRING is treated as VARCHAR(32762). So if your data is never more than, say, 50 characters long, using STRING carries some overhead. The same reasoning applies to the integer types. Hope this helps.
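A minimal DDL sketch following this recommendation (the table and columns are hypothetical):

```sql
-- Bounded and integer types instead of STRING where the domain is known
CREATE TABLE customers (
  customer_id BIGINT,
  status_code TINYINT,
  name        VARCHAR(50)
);
```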
07-29-2016
04:11 PM
You can go to the users list and filter on the group; it will show only members of that group.
07-28-2016
07:00 PM
The error says "XAAUDIT.DB.USER_NAME" is not defined in the file /usr/hdp/2.2.4.10-3/ranger-hdfs-plugin/install.properties. Did you check whether you have defined this property for the Ranger audit DB?
07-28-2016
04:47 PM
I just installed everything using the HDP 2.3.2 repo. Nothing custom, so I would hope everything has the correct version. Thoughts?
07-28-2016
04:14 PM
I am getting the error below when trying to run Pig using Tez. Any pointers would be appreciated. The same script runs successfully if I use MR as the execution engine. I checked tez.lib.uris and it is correctly set to the HDFS tez.tar.gz file location. Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.VerifyError: class org.apache.tez.dag.api.records.DAGProtos$DAGPlan overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
at java.lang.Class.getMethod0(Class.java:3018)
at java.lang.Class.getMethod(Class.java:1784)
at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
07-28-2016
01:45 AM
I am just starting to understand Spark memory management on YARN and have a few questions that I thought would be better to ask the experts here.
1. Is there a way to restrict the max size that users can use for the Spark executor and driver when submitting jobs on a YARN cluster?
2. What is the best practice for determining the number of executors required for a job? Is there a max limit that users can be restricted to?
3. How does the RM handle resource allocation if most of the resources in a queue are consumed by Spark jobs? How is preemption handled?
07-26-2016
03:11 AM
2 Kudos
You can deploy a master/slave KDC; that will provide HA. I have done this before. You can set up replication between the master and the slave: http://www.tldp.org/HOWTO/Kerberos-Infrastructure-HOWTO/server-replication.html HTH
07-26-2016
02:57 AM
This looks like an Isilon HDFS log error. Are you getting any error when you try to start HS2 through Ambari? Did you check the HS2 logs on the node where HS2 is installed?
07-15-2016
03:29 AM
The table is using TextInputFormat.
07-14-2016
05:00 PM
The group by is used to dedup and count distinct serial_number values.
07-14-2016
04:47 PM
1 Kudo
I am trying to run the query below in Hive using Tez, and it is failing with a NullPointerException, whereas the same query runs fine using the MR execution engine. We are using HDP 2.3.2.

select count(*) from (select serial_number from hive_demo.gdwi_test group by serial_number) q;
select serial_number from hive_demo.gdwi_test group by serial_number;

Both of these queries throw a NullPointerException. Vertex failed, vertexName=Map 1, vertexId=vertex_1468116141308_1525_2_00, diagnostics=[Vertex vertex_1468116141308_1525_2_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: gdwi_test initializer failed, vertex=vertex_1468116141308_1525_2_00 [Map 1], java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:450)
at org.apache.hadoop.io.Text.encode(Text.java:431)
at org.apache.hadoop.io.Text.writeString(Text.java:480)
at org.apache.hadoop.mapred.split.TezGroupedSplit.write(TezGroupedSplit.java:101)
at org.apache.tez.mapreduce.hadoop.MRInputHelpers.createSplitProto(MRInputHelpers.java:249)
at org.apache.tez.mapreduce.hadoop.InputSplitInfoMem.createSplitsProto(InputSplitInfoMem.java:168)
at org.apache.tez.mapreduce.hadoop.InputSplitInfoMem.getSplitsProto(InputSplitInfoMem.java:117)
at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.createEventList(HiveSplitGenerator.java:200)
at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:180)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:246)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:240)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:240)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:227)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
07-14-2016
02:40 PM
If this fixed your issue, could you accept this as the answer? It would help others in the community.
07-13-2016
04:45 PM
Here is the link that has the syntax to compute statistics on a Hive partitioned table: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_dataintegration/content/cost-based-opt.html ANALYZE TABLE employees PARTITION (dt) COMPUTE STATISTICS
07-13-2016
04:28 PM
Run analyze table <table name> compute statistics;
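A sketch that also gathers column-level statistics, which the cost-based optimizer uses (the table name is a placeholder):

```sql
-- Table-level statistics (row count, size, etc.)
ANALYZE TABLE my_table COMPUTE STATISTICS;
-- Column-level statistics for the cost-based optimizer
ANALYZE TABLE my_table COMPUTE STATISTICS FOR COLUMNS;
```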
07-11-2016
03:34 PM
Are you able to ping each host from the Ambari server node and vice versa? The error says "org.apache.ambari.server.HostNotFoundException: Host not found, hostname=ctsc00675971901.cts.com"