Member since: 10-01-2015
Posts: 3933
Kudos Received: 1150
Solutions: 374
My Accepted Solutions

Title | Views | Posted |
---|---|---|
 | 3362 | 05-03-2017 05:13 PM |
 | 2791 | 05-02-2017 08:38 AM |
 | 3067 | 05-02-2017 08:13 AM |
 | 3002 | 04-10-2017 10:51 PM |
 | 1510 | 03-28-2017 02:27 AM |
02-28-2017
04:36 PM
2 Kudos
@Adnan Alvee Use the ORC format with HCatalog integration in Pig; take a look at my article: https://community.hortonworks.com/articles/83051/apache-ambari-workflow-designer-view-for-apache-oo-2.html
02-28-2017
12:32 PM
Have you tried the following?

// Build a Hadoop Configuration from the running SparkContext's SparkConf,
// then obtain the HDFS FileSystem handle from it
import org.apache.hadoop.fs._
import org.apache.spark.deploy.SparkHadoopUtil
import java.net.URI

val hdfs_conf = SparkHadoopUtil.get.newConfiguration(sc.getConf)
val hdfs = FileSystem.get(hdfs_conf)
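As a quick sanity check, here's a minimal usage sketch on top of that handle (the path below is just a placeholder, not from the original question):

// Hypothetical usage: probe a placeholder path and list its contents
val path = new Path("/tmp")                              // placeholder; point at your own directory
println(hdfs.exists(path))                               // true if the path exists in HDFS
hdfs.listStatus(path).foreach(s => println(s.getPath))   // print each entry under the path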
02-28-2017
03:05 AM
Here's the latest Ambari doc covering the same setting: http://docs.hortonworks.com/HDPDocuments/Ambari-2.4.2.0/bk_ambari-security/content/optional_ambari_web_inactivity_timeout.html
02-28-2017
02:58 AM
1 Kudo
Please see the Ambari Security Guide section on the web inactivity timeout: https://docs.hortonworks.com/HDPDocuments/Ambari-2.2.2.0/bk_Ambari_Security_Guide/content/_optional_ambari_web_inactivity_timeout.html
02-27-2017
09:36 PM
1 Kudo
@Mehrdad Niasari For an example of running Python 2 and Python 3, please see my articles: https://community.hortonworks.com/articles/82967/apache-ambari-workflow-designer-view-for-apache-oo.html and https://community.hortonworks.com/articles/82988/apache-ambari-workflow-designer-view-for-apache-oo-1.html

These cover a new workflow editing tool called Workflow Manager, but the same steps apply to writing pure XML workflows. The requirement here is that all Python libraries must be available on every NodeManager. If you're on a Kerberized cluster, Oozie will proxy the user's permissions to the user executing the Oozie process, so paying attention to permissions across the whole workflow lifecycle is also important.

For good measure, here's my article on the shell action alone: https://community.hortonworks.com/articles/82964/getting-started-with-apache-ambari-workflow-design.html Let me know if you run into any problems.
02-27-2017
03:07 PM
@Avijeet Dash
There are a few options. Here's a great article by one of our engineers on diagnosing Zeppelin: https://community.hortonworks.com/articles/70658/how-to-diagnose-zeppelin.html To see it in action, here's a short article that demonstrates launching Zeppelin in remote debug mode: http://lresende.blogspot.com/2016/08/launching-apache-zeppelin-in-debug-mode.html
02-27-2017
03:01 PM
@Param NC You need to build your application with the hadoop-client dependency in your pom.xml or sbt build, with scope set to provided (<scope>provided</scope>). See http://spark.apache.org/docs/1.6.2/submitting-applications.html, "Bundling Your Application's Dependencies":

If your code depends on other projects, you will need to package them alongside your application in order to distribute the code to a Spark cluster. To do this, create an assembly jar (or "uber" jar) containing your code and its dependencies. Both sbt and Maven have assembly plugins. When creating assembly jars, list Spark and Hadoop as provided dependencies; these need not be bundled since they are provided by the cluster manager at runtime. Once you have an assembled jar, you can call the bin/spark-submit script, passing your jar. For Python, you can use the --py-files argument of spark-submit to add .py, .zip or .egg files to be distributed with your application. If you depend on multiple Python files, we recommend packaging them into a .zip or .egg.

More info here: http://spark.apache.org/docs/1.6.2/running-on-yarn.html

Here's a sample pom.xml definition for hadoop-client:

<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
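      <!-- HDP-specific Hadoop build number; match the Hadoop version deployed on your cluster -->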
<version>2.7.1.2.3.0.0-2557</version>
<scope>provided</scope>
<type>jar</type>
</dependency>
</dependencies>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.7</maven.compiler.source>
<maven.compiler.target>1.7</maven.compiler.target>
</properties>
<repositories>
<repository>
<id>HDPReleases</id>
<name>HDP Releases</name>
<url>http://repo.hortonworks.com/content/repositories/public</url>
<layout>default</layout>
<releases>
<enabled>true</enabled>
<updatePolicy>always</updatePolicy>
<checksumPolicy>warn</checksumPolicy>
</releases>
<snapshots>
<enabled>false</enabled>
<updatePolicy>never</updatePolicy>
<checksumPolicy>fail</checksumPolicy>
</snapshots>
</repository>
<repository>
<id>HDPJetty</id>
<name>Hadoop Jetty</name>
<url>http://repo.hortonworks.com/content/repositories/jetty-hadoop/</url>
<layout>default</layout>
<releases>
<enabled>true</enabled>
<updatePolicy>always</updatePolicy>
<checksumPolicy>warn</checksumPolicy>
</releases>
<snapshots>
<enabled>false</enabled>
<updatePolicy>never</updatePolicy>
<checksumPolicy>fail</checksumPolicy>
</snapshots>
</repository>
<repository>
<snapshots>
<enabled>false</enabled>
</snapshots>
<id>central</id>
<name>bintray</name>
<url>http://jcenter.bintray.com</url>
</repository>
</repositories>
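Since the question mentions sbt as well, here's a minimal build.sbt sketch along the same lines (project name and versions are placeholders; adjust to your cluster):

// Minimal build.sbt: mark Spark and Hadoop as "provided" so they are not bundled into your jar
name := "my-spark-app"                  // placeholder project name
version := "0.1"                        // placeholder version
scalaVersion := "2.10.6"                // Scala line used by Spark 1.6.x builds

resolvers += "HDP Releases" at "http://repo.hortonworks.com/content/repositories/public"

libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"    % "1.6.2"              % "provided",
  "org.apache.hadoop" %  "hadoop-client" % "2.7.1.2.3.0.0-2557" % "provided"
)

Build with sbt package (or the sbt-assembly plugin for an uber jar) and pass the resulting jar to bin/spark-submit.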
02-27-2017
02:08 PM
1 Kudo
Use a CREATE EXTERNAL TABLE statement on top of your data. The main things to remember are the EXTERNAL keyword and the LOCATION clause in the syntax below; the rest depends on your file type and delimiter. https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_dataintegration/content/moving_data_from_hdfs_to_hive_external_table_method.html

CREATE EXTERNAL TABLE IF NOT EXISTS Cars(
Name STRING,
Miles_per_Gallon INT,
Cylinders INT,
Displacement INT,
Horsepower INT,
Weight_in_lbs INT,
Acceleration DECIMAL,
Year DATE,
Origin CHAR(1))
COMMENT 'Data about cars from a public database'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/<username>/visdata';
02-27-2017
01:15 PM
It supports both Spark (Scala) and PySpark, so you're not missing out on anything.
02-27-2017
01:14 PM
3 Kudos
For Zeppelin in HDP 2.5 we introduced a new interpreter called Livy, and it has its own way of managing dependencies. Please look here: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_zeppelin-component-guide/content/zepp-with-spark.html