Member since: 09-24-2015
Posts: 527
Kudos Received: 136
Solutions: 19

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2847 | 06-30-2017 03:15 PM |
| | 4243 | 10-14-2016 10:08 AM |
| | 9492 | 09-07-2016 06:04 AM |
| | 11532 | 08-26-2016 11:27 AM |
| | 1882 | 08-23-2016 02:09 PM |
05-26-2016
05:41 PM
Hi: I changed my decimal(9,2) in Hive to Double, and now it's working.
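For illustration, a minimal sketch of that kind of type change done over Hive JDBC; the host, the my_table table and the price column are placeholders, and the ALTER TABLE statement is the part that matters:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ChangeColumnType {
    public static void main(String[] args) throws Exception {
        // Hive JDBC driver; host, database, table and column names are placeholders.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hive-host:10000/default", "hive", "");
             Statement stmt = conn.createStatement()) {
            // Swap the DECIMAL(9,2) column for a DOUBLE column with the same name.
            stmt.execute("ALTER TABLE my_table CHANGE price price DOUBLE");
        }
    }
}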
05-27-2016
08:18 AM
Fixed it. I removed the atlas.pid file and restarted the Atlas server:
rm /var/run/atlas/atlas.pid
python /usr/hdp/current/atlas-server/bin/atlas_start.py
10-16-2016
02:34 PM
Hi: I have edited /usr/hdp/current/zeppelin-server/lib/conf/shiro.ini:
[urls]
/api/version = anon
#/** = anon
/** = authcBasic
[users]
admin = admin
hdfs = hdfs
and restarted Zeppelin, but the login page doesn't appear, just the anonymous user. Do I need anything else?
05-20-2016
09:30 AM
1 Kudo
While Hive is perfect for analytical queries and amazing for highly parallel workloads with lots of concurrent queries, it is not yet as fast for small queries as traditional databases. You will not get queries faster than 2-3 seconds in total even under perfect circumstances; this is due to the architecture. Rule of thumb:
- If Tez has to create a new session (application master), i.e. a query on a cold system, you can expect 10-15s of pre-execution time. You can avoid this by pre-creating sessions, but that takes a bit of the cluster even when you don't need it.
- If Tez has to create task containers, you can expect 2-3s extra. Tez can reuse containers, and there is also prewarm to pre-create containers, but its tuning depends a lot on your use case. If you don't know what you are doing, you can make it worse.
- In general, HiveServer2 has a bit of overhead (around 1s) for plan compilation, communication with the metastore, etc.
So yes, at the moment you will not get faster than 2-3s, realistically 4-5s. If you need sub-second responses, look at Phoenix for example. However, things will soon get better for these short queries:
- LLAP is already available as a tech preview (long-running processes that have an ORC data cache and remove the startup overhead).
- Hive will get an HBase-backed metastore, which should speed up HiveServer2 and more.
In short, look out for this space.
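To make the Phoenix suggestion concrete, a minimal sketch of a point lookup through the Phoenix JDBC driver; the ZooKeeper quorum, the /hbase-unsecure znode (the HDP default for unsecured clusters), and the items table with its item_id and description columns are all assumptions for illustration:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PhoenixPointLookup {
    public static void main(String[] args) throws Exception {
        // Phoenix connects through the HBase ZooKeeper quorum; adjust host, port and znode.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:phoenix:zk-host:2181:/hbase-unsecure");
             PreparedStatement ps = conn.prepareStatement(
                "SELECT description FROM items WHERE item_id = ?")) {
            ps.setLong(1, 42L);                      // primary-key lookup, served by HBase directly
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}

Because the WHERE clause hits the primary key, Phoenix can turn this into an HBase point get, which is what gives it the sub-second behaviour mentioned above.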
05-13-2016
01:51 PM
Hi: finally, the problem was the directory permissions on /var/run/ambari-server on the namenode. I ran:
chown -R ambari:ambari /var/run/ambari-server
05-05-2016
05:38 PM
Spark Thrift Server is similar to HiveServer2: you install it on a node and use that node in the JDBC URL. You could have more than one in HA and load-balancing scenarios, but other than that you will only have one of them.
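A minimal sketch of what that JDBC connection can look like from Java; spark-thrift-host and port 10015 (a common HDP default for the Spark Thrift Server) are assumptions, and since the Thrift Server speaks the HiveServer2 protocol the ordinary Hive JDBC driver is enough:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SparkThriftQuery {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");              // same driver as for HiveServer2
        String url = "jdbc:hive2://spark-thrift-host:10015/default";   // placeholder host and port
        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}

In an HA setup you would point this URL (or a load balancer in front of it) at whichever Thrift Server instance you want.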
05-05-2016
07:08 PM
Hi: finally it's working with this code:
Sys.setenv(SPARK_HOME="/usr/hdp/current/spark-client/")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"),"R","lib"),.libPaths()))
library(SparkR)
#sparkR.stop()
sparkR.stop()
sc <- SparkR::sparkR.init(master = "yarn-client", sparkEnvir = list(spark.driver.memory="4g"))
hiveContext <- sparkRHive.init(sc)
05-05-2016
07:14 AM
HBase is great for random key lookups; I've worked on a project where a word cloud powered by HBase worked just fine. If you have a dashboard, HBase or perhaps Phoenix works pretty well behind it.
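For illustration, a sketch of that kind of random key lookup with the HBase Java client; the word_counts table, the cf:count column and the long-encoded counter are hypothetical:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class WordCloudLookup {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();    // picks up hbase-site.xml from the classpath
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("word_counts"))) {
            Get get = new Get(Bytes.toBytes("hadoop"));       // row key = the word to look up
            Result result = table.get(get);
            byte[] count = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("count"));
            if (count != null) {
                System.out.println("count = " + Bytes.toLong(count));
            }
        }
    }
}

A single Get like this is answered directly by the region holding the row, which is why it stays fast even behind an interactive dashboard.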