Member since: 03-16-2016
Posts: 707
Kudos Received: 1753
Solutions: 203
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 5130 | 09-21-2018 09:54 PM |
| | 6495 | 03-31-2018 03:59 AM |
| | 1969 | 03-31-2018 03:55 AM |
| | 2180 | 03-31-2018 03:31 AM |
| | 4833 | 03-27-2018 03:46 PM |
04-18-2017
04:25 PM
1 Kudo
Labels:
- Apache Ambari
- Apache Hadoop
04-18-2017
03:45 PM
@rbiswas What about adding a new host and assigning it to a specific rack at the same time? I'd like to avoid adding the host (data node) first and then having to set the rack and restart again. Is there a way via the Ambari UI?
04-16-2017
02:01 AM
@Bala Vignesh N V Actually you can still use substr, but first you need to find your "[" character with the instr function. You would then substr from the first character to the instr position - 1. For special characters you have to use an escape character. Look here for instr and substr examples: http://hadooptutorial.info/string-functions-in-hive/#INSTRING This is how it is done in all SQL-like databases, e.g. Oracle, SQL Server, MySQL, etc.
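A minimal sketch of the idea, assuming a hypothetical table events with a string column val containing values like 'abc[123]':

```bash
# instr does a plain substring search for '[' (no escaping needed for instr);
# substr then keeps everything before it, returning 'abc' for 'abc[123]'.
hive -e "SELECT substr(val, 1, instr(val, '[') - 1) AS prefix FROM events;"
```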
04-07-2017
02:40 PM
@Yong Boon Lim What were you trying to accomplish by creating that table stored in ORC format but having the rows stored in that format? Not related, but why would you cast INT to INT?
04-07-2017
02:06 PM
@Pradhuman Gupta Apache Spark cannot do that out of the box, but what you might be looking for is some middleware that can interface with Apache Spark and run, submit, and manage jobs for you:
- Livy - a REST server with extensive language support (Python, R, Scala), the ability to maintain interactive sessions, and object sharing.
- spark-jobserver - a simple Spark-as-a-Service which supports object sharing using so-called named objects. JVM only.
- Mist - a service for exposing Spark analytical jobs and machine learning models as realtime, batch, or reactive web services.
- Apache Toree - IPython-protocol-based middleware for interactive applications.

Hortonworks recommends Livy. Also, read the last comment at https://issues.apache.org/jira/browse/SPARK-2243 : allowMultipleContexts has a very limited, test-only use and can lead to the error you see.
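If you go with Livy, here is a minimal sketch of a batch submission via its REST API (the host, jar path, and class name below are placeholders):

```bash
# Submit a Spark application as a Livy batch job; all names are illustrative.
curl -s -X POST http://livy-host:8998/batches \
  -H 'Content-Type: application/json' \
  -d '{"file": "hdfs:///jars/spark-app.jar", "className": "com.example.SparkApp"}'

# Poll the batches endpoint to check the state of the submitted job.
curl -s http://livy-host:8998/batches
```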
04-07-2017
01:54 PM
@Jay Goebel This is a known issue addressed in a later version of Ambari. Due to a timeout, the thrift connection dies. If you restart Ambari, the issue goes away: ambari-server restart
04-05-2017
07:28 PM
Also, another option is to execute kdestroy and then generate a new ticket and krb5 cache. The documentation you need for HDP 2.4: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_dataintegration/content/hive-jdbc-odbc-drivers.html
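For example (the principal below is a placeholder):

```bash
kdestroy                 # destroy the current ticket and credential cache
kinit user@EXAMPLE.COM   # obtain a fresh TGT; substitute your own principal
klist                    # verify the new ticket and cache
```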
04-05-2017
04:35 PM
1 Kudo
@Anil Bagga Problem: when adding a key to the keytab file with kadmin, the following error can be encountered:

Unsupported key table format version number while adding key to keytab

Cause: the local file to which you want to export the key (/etc/krb5.keytab) is in an incorrect format. This usually happens because you tried to create an empty file beforehand (using touch or a similar command) and then export the key into it.

Diagnosing the problem: to verify that this is indeed the case, run klist on the existing file to which you are attempting to export the key (run it in both the source and target environments where the file originated/is used):

sudo klist -k /etc/keytab

It should return:

klist: Unsupported key table format version number while starting keytab scan

Resolving the problem: do not pre-create the keytab file; remove the malformed file and export the key again so that kadmin creates the file itself. If this is not the cause, use another option: execute kdestroy and generate a new ticket and krb5 cache.
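A sketch of the recovery, assuming hypothetical admin and host principals:

```bash
# The keytab was pre-created (e.g., with touch), so klist cannot parse it:
sudo klist -k /etc/krb5.keytab
# => klist: Unsupported key table format version number while starting keytab scan

# Remove the malformed file and let ktadd create a properly formatted one.
sudo rm /etc/krb5.keytab
sudo kadmin -p admin/admin@EXAMPLE.COM \
  -q "ktadd -k /etc/krb5.keytab host/node1.example.com@EXAMPLE.COM"

# Verify that the key entries are now listed.
sudo klist -k /etc/krb5.keytab
```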
04-03-2017
12:47 AM
@Rohan Pednekar This is true also for any scan that requires evaluation before retrieving anything. I am not sure why this would be an HCC article; it is merely one paragraph of what could have been a well-written article about tips and tricks when dealing with HBase. I recommend looking at some of the featured articles in HCC and writing at that level of quality. The section you published could be very useful as part of a larger article. Thanks for your efforts.
03-30-2017
05:27 PM
2 Kudos
@Sree Kupp I need some clarification. You want to set num_executors to 46, which is a maximum, but you want to use all the capacity of the cluster, which is 96 cores; however, you did not mention anything about the RAM allocated to each container. For example, if you have 96 cores and 4 x 128 = 512 GB (simplistically, because some cores and memory need to be allocated to other processes running on your cluster), and your memory allocated per container (one core per container) is 8 GB, then 96 containers would require 96 * 8 = 768 GB. You are short 256 GB, which is about round(256/8) = 32 containers. That means even if you set num_executors to 96, you could possibly spin up only 64. What happened when you set num_executors to 96? On the other hand, even if you set it to 96, that does not guarantee the maximum of 96 is used; Spark can decide based on several factors, e.g. data locality. Regarding your 47 vs. 46 setting, we need to investigate a bit more what the extra container does.
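A back-of-envelope check of that math, under the same illustrative assumptions (4 nodes x 128 GB, one core and 8 GB per container):

```bash
# Container math under the assumptions above; all figures are illustrative.
total_mem_gb=$((4 * 128))                  # 512 GB of cluster memory
per_container_gb=8                         # memory per one-core executor container
echo $((total_mem_gb / per_container_gb))  # => 64: memory caps you well below 96
```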