Member since: 03-02-2018
Posts: 4
Kudos Received: 0
Solutions: 0
10-09-2018
05:57 AM
Hi Sindhu, I have not made any configuration changes during installation; I followed the Hortonworks docs to install Druid. Can you tell me which configuration needs to change? Thanks.
10-04-2018
06:59 AM
logs.zip Hi all, I am new to Druid. I have completed the Ambari setup with the Druid service, and the setup finished successfully; the Coordinator and Overlord consoles are running. Kafka is my input data source. I created a job to run the Druid ingestion and verified that the Kafka topic is in a running state, but my Druid Overlord console does not show the job as running. I am sharing the log and a screenshot for reference. Please guide me on where to change the configuration.
Labels:
- Apache Ambari
04-11-2018
07:37 AM
I am trying to push data into an existing Hive table. I have already created an ORC table in Hive, but I am not able to push data into it. This code works if I copy-paste it into the Spark console, but it does not run via spark-submit.

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object TestCode {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("first example").setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    for (i <- 0 to 100 - 1) {
      // Sample values; in the real job this for loop is replaced by business logic.
      var fstring = "fstring" + i
      var cmd = "cmd" + i
      var idpath = "idpath" + i
      import sqlContext.implicits._
      val sDF = Seq((fstring, cmd, idpath)).toDF("t_als_s_path", "t_als_s_cmd", "t_als_s_pd")
      sDF.write.insertInto("l_sequence") // sDF.write.format("orc").saveAsTable("l_sequence")
      println("write data ==> " + i)
    }
  }
}

It gives the following error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Table or view not found: l_sequence;
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$lookupTableFromCatalog(Analyzer.scala:449)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$8.applyOrElse(Analyzer.scala:455)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$anonfun$apply$8.applyOrElse(Analyzer.scala:453)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$anonfun$resolveOperators$1.apply(LogicalPlan.scala:61)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$anonfun$resolveOperators$1.apply(LogicalPlan.scala:61)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:60)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:453)
at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:443)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1$anonfun$apply$1.apply(RuleExecutor.scala:85)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1$anonfun$apply$1.apply(RuleExecutor.scala:82)
at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
at scala.collection.immutable.List.foldLeft(List.scala:84)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1.apply(RuleExecutor.scala:82)
at org.apache.spark.sql.catalyst.rules.RuleExecutor$anonfun$execute$1.apply(RuleExecutor.scala:74)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:74)
at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:65)
at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:63)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:51)
at org.apache.spark.sql.execution.QueryExecution.withCachedData$lzycompute(QueryExecution.scala:69)
at org.apache.spark.sql.execution.QueryExecution.withCachedData(QueryExecution.scala:68)
at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:74)
at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:74)
at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:78)
at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:76)
at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:83)
at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:83)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
at org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:259)
at org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:239)
at com.hq.bds.Helloword$anonfun$main$1.apply$mcVI$sp(Helloword.scala:16)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at com.hq.bds.Helloword$.main(Helloword.scala:10)
at com.hq.bds.Helloword.main(Helloword.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
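For context, one common reason code like this works in the Spark shell but fails under spark-submit is that the shell already provides a Hive-aware session, while a plain SQLContext built inside the job only sees an in-memory catalog and therefore cannot resolve l_sequence. Below is a minimal sketch of the same loop using a Hive-enabled SparkSession; it assumes Spark 2.x and that hive-site.xml is visible to spark-submit, and it is only an illustration, not a confirmed fix for this cluster.

import org.apache.spark.sql.SparkSession

object TestCode {
  def main(args: Array[String]): Unit = {
    // Sketch only: the master is normally supplied via spark-submit rather than hard-coded.
    val spark = SparkSession.builder()
      .appName("first example")
      .enableHiveSupport() // use the Hive metastore catalog so l_sequence can be resolved
      .getOrCreate()
    import spark.implicits._

    for (i <- 0 until 100) {
      val fstring = "fstring" + i
      val cmd = "cmd" + i
      val idpath = "idpath" + i
      val sDF = Seq((fstring, cmd, idpath)).toDF("t_als_s_path", "t_als_s_cmd", "t_als_s_pd")
      sDF.write.insertInto("l_sequence") // the ORC table must already exist in Hive
      println("write data ==> " + i)
    }
    spark.stop()
  }
}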
Labels:
- Apache Hive
- Apache Spark
03-02-2018
02:41 PM
Hi all, I am processing Kafka data with Spark and pushing it into Hive tables. While inserting into the table I face an issue: in the warehouse location, a new part file is created for every insert command. Please share a solution to avoid this problem; because of it, a single select statement takes more than 30 minutes.

import spark.implicits._
// Every time, new data arrives from the Kafka consumer and is assigned to the jsonStr string.
val jsonStr = """{"b_s_isehp" : "false","event_id" : "4.0","l_bsid" : "88.0"}"""
val df = spark.read.json(Seq(jsonStr).toDS)
df.coalesce(1).write.mode("append").insertInto("tablename")
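As a rough illustration of one way to reduce the number of part files, the incoming JSON records could be buffered and written in batches instead of calling insertInto once per message, so each write covers many records. This is only a sketch in the same spark-shell style as the snippet above; the names onMessage, buffer, and batchSize are illustrative, and it assumes a Spark 2.2+ session with Hive support.

import scala.collection.mutable.ArrayBuffer

val buffer = ArrayBuffer[String]()
val batchSize = 1000 // assumed threshold; tune for your data volume

def onMessage(jsonStr: String): Unit = {
  buffer += jsonStr
  if (buffer.size >= batchSize) {
    // One write per batch produces one part file for batchSize records
    // instead of one part file per message.
    val df = spark.read.json(buffer.toSeq.toDS())
    df.coalesce(1).write.mode("append").insertInto("tablename")
    buffer.clear()
  }
}

Periodically compacting the ORC table on the Hive side (for example with ALTER TABLE ... CONCATENATE) is sometimes used as a complementary option.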
Labels:
- Apache Hive
- Apache Kafka
- Apache Spark