Member since: 09-18-2015
3274 Posts · 1159 Kudos Received · 426 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2144 | 11-01-2016 05:43 PM |
| | 6546 | 11-01-2016 05:36 PM |
| | 4173 | 07-01-2016 03:20 PM |
| | 7124 | 05-25-2016 11:36 AM |
| | 3458 | 05-24-2016 05:27 PM |
03-07-2016
01:51 AM
1 Kudo
@Colton Rodgers See http://docs.hortonworks.com/HDPDocuments/Ambari-2.2.1.0/bk_Installing_HDP_AMB/content/_download_the_ambari_repo.html Please try again with the correct repo file for your OS.
03-06-2016
06:26 AM
1. Go to the Security Group settings in the left-hand navigation.
2. Find the Security Group that your instance is part of.
3. Click on Inbound Rules.
4. Use the drop-down and add a rule for port 8080 (the Ambari web UI).
5. Click Apply.
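The same change can be made from the AWS CLI; a minimal sketch, where `sg-xxxxxxxx` and `203.0.113.0/32` are placeholders for your security group ID and source CIDR:

```shell
# Open TCP port 8080 (Ambari web UI) inbound on the security group.
# sg-xxxxxxxx and 203.0.113.0/32 are placeholders - substitute your
# own security group ID and source IP range.
aws ec2 authorize-security-group-ingress \
  --group-id sg-xxxxxxxx \
  --protocol tcp \
  --port 8080 \
  --cidr 203.0.113.0/32
```

Restricting the CIDR to your own IP is safer than opening 8080 to 0.0.0.0/0.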
03-06-2016
06:19 AM
@Robin Dong You have to open port 8080 in your network security settings. See http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/authorizing-access-to-an-instance.html
03-06-2016
06:06 AM
@Robin Dong
You have to hit the public IP on port 8080 (http://publicIP:8080). Please follow http://hortonworks.com/blog/deploying-hadoop-cluster-amazon-ec2-hortonworks/
03-06-2016
03:13 AM
@Robin Dong
See the local repository guide: https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Installing_HDP_AMB/content/_using_a_local_repository.html
You can download the repos from: https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Installing_HDP_AMB/content/_obtaining_the_repositories.html
Ambari repositories: https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Installing_HDP_AMB/content/_ambari_repositories.html
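On a RHEL/CentOS 6 host, fetching the Ambari 2.1.2.1 repo file typically looks like the sketch below (based on the repository docs linked above; the path segment `centos6` is an assumption, so adjust it for your OS):

```shell
# Download the Ambari repo definition into yum's repo directory.
# CentOS/RHEL 6 path shown; see the linked docs for other platforms.
wget -nv http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.1.2.1/ambari.repo \
  -O /etc/yum.repos.d/ambari.repo

# Confirm yum can see the new repository.
yum repolist
```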
03-06-2016
03:03 AM
1 Kudo
@Jan J Please accept the answer to close the thread if it was useful.
03-06-2016
01:30 AM
1 Kudo
@Jan J You have several options to access the data. Since Hive/HQL is the industry standard for interacting with Hadoop, users are leveraging Spark SQL + Hive. Please read the overview: http://spark.apache.org/docs/latest/sql-programming-guide.html#overview
03-06-2016
01:28 AM
1 Kudo
@Jan J I would store the data in HDFS and leverage Hive. See http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables When you use Spark to read Hive tables, you are using Spark features to read data from Hive.

"In terms of performance, are we not really testing Spark SQL performance, but HiveQL instead?" You are testing the performance of the Spark SQL feature with Hive. All the answers are in the overview: http://spark.apache.org/docs/latest/sql-programming-guide.html#overview

SQL: One use of Spark SQL is to execute SQL queries written using either a basic SQL syntax or HiveQL. Spark SQL can also be used to read data from an existing Hive installation. For more on how to configure this feature, please refer to the Hive Tables section. When running SQL from within another programming language the results will be returned as a DataFrame. You can also interact with the SQL interface using the command-line or over JDBC/ODBC.

DataFrames: A DataFrame is a distributed collection of data organized into named columns. It is conceptually equivalent to a table in a relational database or a data frame in R/Python, but with richer optimizations under the hood. DataFrames can be constructed from a wide array of sources such as structured data files, tables in Hive, external databases, or existing RDDs. The DataFrame API is available in Scala, Java, Python, and R.

Datasets: A Dataset is a new experimental interface added in Spark 1.6 that tries to provide the benefits of RDDs (strong typing, ability to use powerful lambda functions) with the benefits of Spark SQL's optimized execution engine. A Dataset can be constructed from JVM objects and then manipulated using functional transformations (map, flatMap, filter, etc.). The unified Dataset API can be used both in Scala and Java. Python does not yet have support for the Dataset API, but due to its dynamic nature many of the benefits are already available (i.e. you can access the field of a row by name naturally: row.columnName). Full Python support will be added in a future release.
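As a sketch of the pattern described above (Spark 1.6-era API; the table name `my_hive_table` is a placeholder for an existing Hive table), reading a Hive table through Spark SQL looks roughly like:

```python
from pyspark import SparkContext
from pyspark.sql import HiveContext

# HiveContext reads table metadata from the Hive metastore;
# the query itself is planned and executed by Spark SQL's engine.
sc = SparkContext(appName="hive-read-sketch")
sqlContext = HiveContext(sc)

# my_hive_table is a placeholder - substitute a real Hive table name.
df = sqlContext.sql("SELECT * FROM my_hive_table LIMIT 10")
df.show()
```

So the query language is HiveQL, but the execution engine being exercised is Spark SQL, which is the distinction the thread is asking about.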
03-05-2016
04:44 PM
1 Kudo
@S Srinivasa You are mixing a lot of issues in one thread; that will very likely slow down getting an effective response.