Member since: 04-25-2016
Posts: 579
Kudos Received: 609
Solutions: 111

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2933 | 02-12-2020 03:17 PM |
|  | 2138 | 08-10-2017 09:42 AM |
|  | 12483 | 07-28-2017 03:57 AM |
|  | 3427 | 07-19-2017 02:43 AM |
|  | 2528 | 07-13-2017 11:42 AM |
12-21-2016
05:41 AM
@Hoang Le the maximum is calculated from yarn.scheduler.capacity.maximum-am-resource-percent. It seems the ApplicationMaster you are launching for one of the jobs is configured to use 4 GB of memory, which is why the used AM resources show as 4 GB, the same as the AM resource limit.
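In case it helps, the cap is a percentage of the queue's resources, so the numbers work out roughly like this (a sketch with made-up queue sizes; the 0.2 below is only an example value, not a recommendation):

yarn.scheduler.capacity.maximum-am-resource-percent=0.2

With the default of 0.1, a queue with about 40 GB of memory gives an AM limit of roughly 40 GB x 0.1 = 4 GB, which matches the 4 GB limit you are seeing; raising the percentage (or the queue capacity) raises the limit.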
12-21-2016
05:23 AM
2 Kudos
@Dmitry Otblesk could you please try running the query again in Tez mode after setting this parameter in the Hive shell: set hive.tez.exec.print.summary=true; then share the summary of the execution.
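For reference, the session would look roughly like this (your_table below is just a placeholder for the table in your query):

hive> set hive.tez.exec.print.summary=true;
hive> select count(*) from your_table;

With that flag set, Tez prints a summary of the DAG (per-vertex timings and task counts) at the end of the query, which is the part that would be useful to paste here.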
12-21-2016
05:19 AM
@Satish Bomma glad to see it working. Could you please accept the best answer in this thread so that other users can benefit from it.
12-21-2016
05:15 AM
4 Kudos
@Hoang Le if you look at Max Application Master Resources and Used Application Master Resources, both are the same (4 GB), so you are hitting the resource cap for ApplicationMasters and the job is waiting for an ApplicationMaster container to be allocated.
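If only one queue needs more headroom for ApplicationMasters, the cap can also be overridden per queue instead of cluster wide. A minimal sketch, assuming a queue named default (adjust the queue path and value for your setup):

yarn.scheduler.capacity.root.default.maximum-am-resource-percent=0.2

After changing it in the Capacity Scheduler config, refresh the queues (or restart the ResourceManager through Ambari) for the new limit to take effect.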
12-20-2016
02:19 PM
2 Kudos
Running a Sqoop command from a Java program with the -verbose option can result in a race condition while obtaining the lock on the console appender. We can work around this with the help of the SSHXCUTE framework, which keeps the Java program and the Sqoop command in separate execution contexts.

ENV: HDP 2.4, Java version: JDK 8

Step 1: Download the sshxcute jar from https://sourceforge.net/projects/sshxcute/

Step 2: Create RunSqoopCommand.java

import net.neoremind.sshxcute.core.SSHExec;
import net.neoremind.sshxcute.core.ConnBean;
import net.neoremind.sshxcute.task.CustomTask;
import net.neoremind.sshxcute.task.impl.ExecCommand;

public class RunSqoopCommand {
    public static void main(String args[]) throws Exception {
        // Connection details for the host that will run the sqoop command: host, user, password
        ConnBean cb = new ConnBean("localhost", "root", "hadoop");
        SSHExec ssh = SSHExec.getInstance(cb);
        ssh.connect();
        // The sqoop import is built as one concatenated string so it stays a single command line
        CustomTask sqoopCommand = new ExecCommand("sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true"
                + " -Dmapred.job.name=test --connect jdbc:oracle:thin:@10.0.2.12:1521:XE"
                + " --table TEST_INCREMENTAL -m 1 --username system"
                + " --password oracle --target-dir /tmp/test26"
                + " --verbose");
        ssh.exec(sqoopCommand);
        ssh.disconnect();
    }
}

Step 3: Compile the program: javac -cp sshxcute-1.0.jar RunSqoopCommand.java

Step 4: Run the program: java -cp sshxcute-1.0.jar:. RunSqoopCommand
12-20-2016
01:59 PM
3 Kudos
These are the steps to build and run a Spark Streaming application; it was built and tested on an HDP 2.5 setup.

ENV: HDP 2.5, Scala 2.10.4, sbt 0.13.11

Create the project layout:

mkdir spark-streaming-example
cd spark-streaming-example/
mkdir -p src/main/scala
cd src/main/scala

Sample code (save it as SqlNetworkWordCount.scala):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{SQLContext, SaveMode}
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext, Time}
object SqlNetworkWordCount {
  def main(args: Array[String]) {
    val sparkConf = new SparkConf().setAppName("SqlNetworkWordCount")
    val ssc = new StreamingContext(sparkConf, Seconds(2))
    // Read lines from the socket passed on the command line as <hostname> <port>
    val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
    val words = lines.flatMap(_.split(" "))
    // Convert each RDD of the words DStream to a DataFrame and append it to a Parquet output
    words.foreachRDD((rdd: RDD[String], time: Time) => {
      val sqlContext = SQLContextSingleton.getInstance(rdd.sparkContext)
      import sqlContext.implicits._
      val wordsDataFrame = rdd.map(w => Record(w)).toDF()
      wordsDataFrame.write.mode(SaveMode.Append).parquet("/tmp/parquet")
    })
    ssc.start()
    ssc.awaitTermination()
  }
}

case class Record(word: String)

// Lazily instantiated singleton SQLContext, reused across batches
object SQLContextSingleton {
  @transient private var instance: SQLContext = _
  def getInstance(sparkContext: SparkContext): SQLContext = {
    if (instance == null) {
      instance = new SQLContext(sparkContext)
    }
    instance
  }
}
Go back to the project root and create build.sbt (spark-sql is needed because the code uses SQLContext and DataFrames):

cd -
vim build.sbt

name := "Spark Streaming Example"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.4.1",
  "org.apache.spark" %% "spark-streaming" % "1.4.1",
  "org.apache.spark" %% "spark-sql" % "1.4.1")

Now run sbt package from the project home; it will build a jar at target/scala-2.10/spark-streaming-example_2.10-1.0.jar.

Run this jar using spark-submit:

bin/spark-submit --class SqlNetworkWordCount target/scala-2.10/spark-streaming-example_2.10-1.0.jar `hostname` 6002

To test the program, open a different terminal and run:

nc -lk `hostname` 6002

Hit enter and type anything on that console; the streaming job will pick up the words and write them to /tmp/parquet.
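To confirm that the job is really writing data, you can read the Parquet output back from a separate spark-shell session; a small sketch, using the same /tmp/parquet path hard-coded in the example above:

sqlContext.read.parquet("/tmp/parquet").count()
sqlContext.read.parquet("/tmp/parquet").show(10)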
12-20-2016
12:12 PM
4 Kudos
@sathish jeganathan I suggest you don't install or configure Kafka manually. Install it using Ambari, which will take care of all the settings for you; after installation, once the broker comes online, you can then try changing the configuration.
12-20-2016
06:08 AM
1 Kudo
@Christian van den Heever could you please try with this repo http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.5.3.0/hdp.repo
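If you need to register it manually on the node, something like this should work (a sketch, run as root; adjust the path if your repo files live elsewhere):

wget -nv http://public-repo-1.hortonworks.com/HDP/centos7/2.x/updates/2.5.3.0/hdp.repo -O /etc/yum.repos.d/hdp.repo
yum repolist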
12-20-2016
05:23 AM
3 Kudos
@sathish jeganathan which HDP version are you running? There is no need to set advertised.listeners from Ambari unless you are running with multiple interfaces on the same host. In Ambari you should only have the listeners setting, like SASL_PLAINTEXT://localhost:6667. It looks like you modified server.properties manually, which is why you are seeing the advertised.listeners setting there.
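For comparison, the single entry you would keep in the Ambari Kafka broker config looks like this (the host and port are just the values from this thread):

listeners=SASL_PLAINTEXT://localhost:6667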
12-19-2016
05:07 PM
2 Kudos
@Dinesh Das try this wget http://hortonassets.s3.amazonaws.com/2.5/HDP_2.5_docker.tar.gz
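Once the download finishes, the image is typically loaded into Docker before starting the sandbox; a hedged sketch, assuming a standard Docker setup on the host:

docker load -i HDP_2.5_docker.tar.gz
docker images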