About azeltov

mrizvi · ‎10-31-2016

Hi @azeltov, I am trying to install RStudio on the Hortonworks Sandbox 2.5 and am hitting this exception in the verify-installation step:

initctl: Unable to connect to Upstart: Failed to connect to socket /com/ubuntu/upstart: Connection refused

I have tried starting and stopping RStudio Server, but it shows the same message. PS: Since the sandbox is a Docker container and port 8787 is not open, I have configured /etc/rstudio/rserver.conf to use port 9000.

wojtekk · ‎09-06-2016

Hi, this looks like a simple error: I see s3a in your exception, but I think s3 or s3n should be there.

stevel · ‎11-07-2016

You shouldn't be seeing this on HDP 2.5; everything needed to talk to S3A is already on the classpath for Spark (we have done a lot of work on S3A performance for this release). Is the job actually failing, or is it just warning you that it couldn't create the s3a filesystem, but carrying on?

rich1 · ‎03-31-2016

On the exam you should always use Ambari when possible, especially for tasks like enabling NameNode HA.

shikhar_agarwal · ‎03-22-2017

Hi Artem, I'm currently stuck on a use case where I'm trying to access Hive table data using spark.read.jdbc, as shown below:

    export SPARK_MAJOR_VERSION=2
    spark-shell

    import org.apache.spark.sql.{DataFrame, Row, SparkSession}
    val connectionProperties = new java.util.Properties()
    val hiveQuery = "(SELECT * from hive_table limit 10) tmp"
    val hiveResult = spark.read.jdbc("jdbc:hive2://hiveServerHostname:10000/hiveDBName;user=hive;password=hive", hiveQuery, connectionProperties).collect()

But when I check the results in hiveResult, it's just empty. Could you please suggest what's going on here? I know we can access Hive tables using HiveSession, and I've successfully tried that, but is it possible to run Hive queries and access Hive data using the above method?

azeltov · ‎04-01-2016

@eorgadn You should wrap the geoDistance functions as Hive UDFs; that will be a lot friendlier for most people who want to use them in Hive.

azeltov · ‎03-08-2016

@Artem Ervits your suggestion worked. This is what I ran to get it working on my sandbox: yum install -y numpy

richard_xu · ‎04-25-2016

Ancil, I have a question: why must hive.tez.container.size be a multiple of yarn.scheduler.minimum-allocation-mb? If yarn.scheduler.maximum-allocation-mb = 24GB, yarn.scheduler.minimum-allocation-mb = 4GB, and hive.tez.container.size = 5GB, wouldn't YARN be smart enough to assign 5GB to a container to satisfy Tez's needs? Thanks, Richard
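
The arithmetic behind this question can be sketched as follows. This is only an illustration, assuming the scheduler rounds each memory request up to the next multiple of yarn.scheduler.minimum-allocation-mb and caps it at yarn.scheduler.maximum-allocation-mb (the Capacity Scheduler's default behavior, as far as I know; other schedulers may use a separate increment setting). `ContainerRounding` and `normalizeMb` are hypothetical names, not YARN APIs.

```scala
// Hypothetical helper, not part of YARN: models how a memory request
// might be normalized under the rounding assumption stated above.
object ContainerRounding {
  def normalizeMb(requestedMb: Long, minAllocMb: Long, maxAllocMb: Long): Long = {
    require(minAllocMb > 0 && maxAllocMb >= minAllocMb, "invalid allocation bounds")
    // A request is never smaller than the minimum allocation.
    val atLeastMin = math.max(requestedMb, minAllocMb)
    // Round up to the next multiple of the minimum allocation.
    val roundedUp = ((atLeastMin + minAllocMb - 1) / minAllocMb) * minAllocMb
    // Never exceed the maximum allocation.
    math.min(roundedUp, maxAllocMb)
  }

  def main(args: Array[String]): Unit = {
    // Values from the question: min = 4GB, max = 24GB, request = 5GB.
    val granted = normalizeMb(5 * 1024, 4 * 1024, 24 * 1024)
    println(s"requested 5120 MB, granted $granted MB") // granted 8192 MB
  }
}
```

Under that assumption, the 5GB request would be granted an 8GB container, so about 3GB per container would sit allocated but unused by Tez. That is one plausible reason for the advice to size hive.tez.container.size as a multiple of the minimum allocation.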

azeltov · ‎08-24-2016

@Alexander Is there a full list of these HDI scripts available? If not, how did you discover the ones above?

christian_proko · ‎05-20-2016

Hi @Neeraj Sabharwal, When will it become GA? Best, Christian

Last Visited	‎08-14-2019 06:45 PM

Member Since	‎09-29-2015 01:18 AM
Posts	155
Kudos received	171

Cloudera Community

Re: LivyServer exception

Re: Ranger Dynamic query rewrite available for hiv...

Re: HDP 2.5 + Zeppelin 0.6 + LDAP : Interpreters a...

Re: How to import External Libraries for Livy Inte...

Re: Is Zeppelin in HDP 2.5 support multi-tenancy o...

Re: Running SparkR in RStudio using HDP 2.4

Re: HDP 2.4.0 and Spark 1.6.0 connecting to AWS S3...

Re: spark and s3 dependencies

Re: HDPCA Exam - Configure NameNode HA - Can you d...

Re: query hive tables with spark sql

Re: Geo Distance calculations in Hive and Java

Re: best way to install/integrate numpy scikit to ...

Re: Demystify Apache Tez Memory Tuning - Step by S...

Re: How to install Apache Zeppelin, R, Solr, and G...

Re: Apache Zeppelin and SparkR