Member since: 03-16-2016
Posts: 707
Kudos Received: 1753
Solutions: 203
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 6967 | 09-21-2018 09:54 PM |
| | 8724 | 03-31-2018 03:59 AM |
| | 2615 | 03-31-2018 03:55 AM |
| | 2754 | 03-31-2018 03:31 AM |
| | 6176 | 03-27-2018 03:46 PM |
12-28-2016
04:28 PM
@Ashnee Sharma There was an issue, and you submitted a separate question for it. It would be good to document it here as well, for the sake of others who may encounter a similar problem. Please post it. Update: I found it. Based on the original response, you encountered an issue and then asked this follow-up question: https://community.hortonworks.com/questions/74245/how-to-disable-pagination-for-ambari-ldap.html
12-28-2016
03:53 PM
2 Kudos
@Brad Bukacek Jr By design, the HBase REST server returns its responses with the content Base64-encoded, so the column family, the qualifier, and the raw cell value all come back encoded. You just need to create a custom JSON deserializer. Here is an excellent blog post on the subject: https://blog.layer4.fr/2016/11/16/hbase-rest-api-knox-java/ It has a section that addresses your exact problem.
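If you want to decode on the client side instead, here is a minimal Python sketch (the blog shows a Java deserializer; the host `rest-host`, port 8080, table `mytable`, and row key `row1` are assumptions for illustration):

```python
import base64
import requests

# Hypothetical endpoint: adjust host, port, table, and row key to your cluster.
resp = requests.get(
    "http://rest-host:8080/mytable/row1",
    headers={"Accept": "application/json"},
)

def b64(value):
    """Decode one Base64-encoded field from the REST response."""
    return base64.b64decode(value).decode("utf-8")

for row in resp.json()["Row"]:
    print("row key:", b64(row["key"]))
    for cell in row["Cell"]:
        # "column" holds family:qualifier, "$" holds the raw cell value.
        print("  column:", b64(cell["column"]), "value:", b64(cell["$"]))
```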
12-27-2016
12:50 AM
See the link below to learn why s3a is a better option than s3n, although that may not be the cause of your issue: https://wiki.apache.org/hadoop/AmazonS3
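For what it is worth, the client-side change is usually just the URI scheme; a minimal pyspark sketch with a hypothetical bucket, assuming the s3a jars and credentials are configured on the cluster:

```python
from pyspark import SparkContext

sc = SparkContext(appName="s3a-demo")

# s3n is the legacy connector; s3a is the actively maintained replacement.
rdd = sc.textFile("s3a://my-bucket/logs/")
print(rdd.count())
```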
12-26-2016
10:50 PM
1 Kudo
@Dmitry Otblesk Log in to the Ambari UI, click the YARN link in the left nav bar, then open Quick Links and choose the Resource Manager UI link. You can also go directly to the Resource Manager UI if you know the host and port where the Resource Manager service runs. You should also take advantage of the Hive Tez View to see all the tasks executed and the time needed for each. While the query executes, watch it in the Resource Manager UI to understand the number of containers per task, resource utilization, and so on. If you see a low degree of parallelism while resources remain to instantiate more containers, you have an opportunity to adjust the query (or settings such as `hive.exec.parallel`) to allow more parallelism. http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_performance_tuning/content/ch_query_optimization_hive.html Change the path in the link to match your version of HDP.
12-26-2016
10:38 PM
2 Kudos
@Simran Kaur You need to use a join or a sub-query within the sub-query. You can reference the outer query from the inner query, but not the other way around; that is by design in any SQL-like language.
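To illustrate the direction of that rule, here is a minimal sketch with hypothetical `orders` and `payments` tables, assuming an engine that supports EXISTS subqueries (Hive 0.13+ or Spark SQL 2.0+); the inner query may reference the outer alias `o`, but the reverse is not possible:

```python
from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="subquery-demo")
sqlContext = SQLContext(sc)

# Correlated subquery: the inner query references the outer alias "o".
with_exists = sqlContext.sql("""
    SELECT o.id, o.total
    FROM orders o
    WHERE EXISTS (SELECT 1 FROM payments p WHERE p.order_id = o.id)
""")

# Equivalent join: use this form when you also need columns from "payments".
with_join = sqlContext.sql("""
    SELECT o.id, o.total, p.paid_at
    FROM orders o
    JOIN payments p ON p.order_id = o.id
""")
```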
12-26-2016
10:34 PM
2 Kudos
@Fish Berh This could be due to a problem with the spark-csv jar. I have encountered this myself and found a solution, though I cannot locate the original source now. Here are my notes from the time:

1. Create a folder in your local OS or HDFS and place the proper versions of these jars in it (replace ? with the version you need):

```
spark-csv_?.jar
commons-csv-?.jar
univocity-parsers-?.jar
```

2. Go to the /conf directory of your Spark installation and add this line to spark-defaults.conf (the asterisk pulls in all the jars in the folder):

```
spark.driver.extraClassPath D:/Spark/spark_jars/*
```

Now run Python and create the SparkContext and SQLContext as you normally would. You should then be able to use spark-csv:

```python
df = sqlContext.read.format('com.databricks.spark.csv').\
    options(header='true', inferschema='true').\
    load('foobar.csv')
```
12-26-2016
10:25 PM
2 Kudos
@kishore sanchina Your spark user must be able to create folders under /tmp/spark-tmp. Based on your comments, you did not grant ownership successfully. Grant ownership of /tmp recursively, so that it covers all subfolders, whether existing or created at runtime (I am assuming your user is spark):

```
chown -R spark /tmp
```

However, I really don't like the idea of using /tmp for this (a sysadmin's taste). Consider instead a folder created under SPARK_HOME, e.g. by repointing `spark.local.dir`.
12-26-2016
10:09 PM
2 Kudos
@Timothy Spann Added @Chris Nauroth to the thread. He is a mentor in this Apache project.
12-26-2016
10:01 PM
2 Kudos
@Raghvendra Singh Tutorial: http://hortonworks.com/hadoop-tutorial/getting-started-with-pivotal-hawq-on-hortonworks-sandbox/ Look for the section USING OTHER TOOLS TO WORK WITH HAWQ and follow the instructions on how to download the ODBC/JDBC driver and how to use it. If your data is stored as JSON you are set; otherwise you will have to convert it before feeding your d3.js-based dashboard. HAWQ is a SQL-like database with advanced ANSI compliance.
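If the data is not already JSON, a thin middle layer can do the conversion; here is a minimal Python sketch that relies on HAWQ speaking the PostgreSQL wire protocol (the connection details and the `metrics` table are hypothetical):

```python
import json
import psycopg2  # works because HAWQ speaks the PostgreSQL wire protocol

# Hypothetical connection details: point these at your HAWQ master.
conn = psycopg2.connect(host="hawq-master", port=5432,
                        dbname="analytics", user="gpadmin")
cur = conn.cursor()
cur.execute("SELECT metric, value FROM metrics ORDER BY metric")

# Serialize rows into the JSON shape a d3.js chart typically consumes.
payload = json.dumps([{"metric": m, "value": v} for m, v in cur.fetchall()])
print(payload)

cur.close()
conn.close()
```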
12-26-2016
09:55 PM
1 Kudo
@Manoj Ramakrishnan This looks like a GC tuning issue. You should try reducing the heap to the minimum required and also switch to G1GC (`-XX:+UseG1GC`).
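For example, the JVM options would look something like this (the 4g heap is a hypothetical value; size it to the minimum your workload actually needs):

```
-Xms4g -Xmx4g -XX:+UseG1GC
```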