Member since: 02-06-2017
Posts: 20
Kudos Received: 4
Solutions: 2
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 22772 | 08-05-2018 07:38 AM |
|  | 4076 | 06-12-2016 05:15 PM |
06-07-2019
03:34 PM
I really love HDP, especially Ambari. Could somebody answer my question, since I've noticed no new HDP release for a while?
Labels:
- Hortonworks Data Platform (HDP)
01-11-2019
07:11 PM
My final solution was to install HBase and use that real HBase as the storage for both ATS and Ambari Metrics; the error cleared.
01-08-2019
12:36 AM
Hi Geoffrey, I tried, but the problem is still there. Though it's not a big problem for my YARN application.
01-07-2019
09:40 PM
Same thing here. Even after restarting everything, the "The HBase application reported a 'STARTED' state" message is still there.
08-05-2018
07:38 AM
1 Kudo
Hi Aditya, thank you for the response. The issue showed up when using Spark to write to Hive: I now have to provide the table format explicitly, as below.
df.write.format("orc").mode("overwrite").saveAsTable("tt")  # this runs fine
df.write.mode("overwrite").saveAsTable("tt")  # this command fails
I didn't change anything on the Hive tab after HDP 3.0 was installed.
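As a quick sanity check (a minimal sketch, assuming a running pyspark session named spark and the tt table from above), you can inspect how the metastore registered the table; with an explicit format, Spark records a Spark datasource table and sidesteps Hive's strict managed-table check:

```python
# Minimal sketch, assuming a running pyspark session `spark` and the table
# "tt" created above. DESCRIBE FORMATTED shows whether the metastore sees a
# Spark datasource table (Provider: orc) or a plain Hive managed table.
spark.sql("DESCRIBE FORMATTED tt").show(50, truncate=False)
```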
08-03-2018
09:19 PM
Hi, I was just doing some testing on the newly released HDP 3.0, and the bundled example failed. I tested the same script on the previous HDP platform and it works fine. Can someone advise whether this is a new Hive feature or something I have done wrong?
./bin/spark-submit examples/src/main/python/sql/hive.py
Hive Session ID = bf71304b-3435-46d5-93a9-09ef752b6c22
AnalysisException Traceback (most recent call last)
/usr/hdp/3.0.0.0-1634/spark2/examples/src/main/python/sql/hive.py in <module>()
44
45 # spark is an existing SparkSession
46 spark.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING) USING hive")
47 spark.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
48
/usr/hdp/3.0.0.0-1634/spark2/python/lib/pyspark.zip/pyspark/sql/session.py in sql(self, sqlQuery)
714 [Row(f1=1, f2=u'row1'), Row(f1=2, f2=u'row2'), Row(f1=3, f2=u'row3')]
715 """
716 return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
717
718 @since(2.0)
/usr/hdp/3.0.0.0-1634/spark2/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py in __call__(self, *args)
1255 answer = self.gateway_client.send_command(command)
1256 return_value = get_return_value(
-> 1257 answer, self.gateway_client, self.target_id, self.name)
1258
1259 for temp_arg in temp_args:
/usr/hdp/3.0.0.0-1634/spark2/python/lib/pyspark.zip/pyspark/sql/utils.py in deco(*a, **kw)
67 e.java_exception.getStackTrace()))
68 if s.startswith('org.apache.spark.sql.AnalysisException: '):
69 raise AnalysisException(s.split(': ', 1)[1], stackTrace)
70 if s.startswith('org.apache.spark.sql.catalyst.analysis'):
71 raise AnalysisException(s.split(': ', 1)[1], stackTrace)
AnalysisException: u'org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Table default.src failed strict managed table checks due to the following reason:
Table is marked as a managed table but is not transactional.);'
Much appreciated!
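For anyone hitting the same MetaException, here is a hedged workaround sketch (my own, not from this thread): in HDP 3.0, every Hive managed table must be transactional, but external tables are exempt from that check, so the example's table can be created as EXTERNAL instead. The HDFS location below is hypothetical.

```python
from pyspark.sql import SparkSession

# Hedged sketch: recreate the example's table as EXTERNAL so it passes
# HDP 3.0's strict managed-table (transactional) check.
spark = (SparkSession.builder
         .appName("hive-example-sketch")
         .enableHiveSupport()
         .getOrCreate())

spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS src (key INT, value STRING)
    STORED AS TEXTFILE
    LOCATION '/tmp/hive_src_ext'  -- hypothetical path; adjust for your cluster
""")
spark.sql("LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")
```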
07-07-2018
02:16 PM
Thank you very much; that's my bad. I had added some other jars to my classpath, which led to this error.
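For anyone else who hits the same NoSuchMethodError, a hedged diagnostic sketch (assuming a running pyspark session named spark): ask the JVM which jar actually supplied the lz4 class, so any stray older lz4 jar on the classpath shows up immediately.

```python
# Hedged diagnostic sketch: locate the jar providing the conflicting class.
# Spark 2.3 expects the LZ4BlockInputStream from a recent lz4-java; an older
# lz4 jar picked up first on the classpath triggers the NoSuchMethodError.
jvm = spark.sparkContext._jvm
cls = jvm.java.lang.Class.forName("net.jpountz.lz4.LZ4BlockInputStream")
print(cls.getProtectionDomain().getCodeSource().getLocation().toString())
```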
07-07-2018
04:27 AM
Hi,
I'm using the latest HDP, version 2.6.5.0-292; the Spark version is 2.3.0.
Whenever I try to run show() on any DataFrame, it always throws this error:
scala> spark.read.csv("/user/a.txt").show()
java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
at org.apache.spark.io.LZ4CompressionCodec.compressedInputStream(CompressionCodec.scala:122)
at org.apache.spark.sql.execution.SparkPlan.org$apache$spark$sql$execution$SparkPlan$decodeUnsafeRows(SparkPlan.scala:274)
at org.apache.spark.sql.execution.SparkPlan$anonfun$executeTake$1.apply(SparkPlan.scala:366)
at org.apache.spark.sql.execution.SparkPlan$anonfun$executeTake$1.apply(SparkPlan.scala:366)
at scala.collection.TraversableLike$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.mutable.ArrayOps$ofRef.flatMap(ArrayOps.scala:186)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:366)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$collectFromPlan(Dataset.scala:3272)
at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2484)
at org.apache.spark.sql.Dataset$anonfun$head$1.apply(Dataset.scala:2484)
at org.apache.spark.sql.Dataset$anonfun$52.apply(Dataset.scala:3253)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2484)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2698)
at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.infer(CSVDataSource.scala:148)
at org.apache.spark.sql.execution.datasources.csv.CSVDataSource.inferSchema(CSVDataSource.scala:63)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:57)
at org.apache.spark.sql.execution.datasources.DataSource$anonfun$8.apply(DataSource.scala:202)
at org.apache.spark.sql.execution.datasources.DataSource$anonfun$8.apply(DataSource.scala:202)
at scala.Option.orElse(Option.scala:289)
at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:201)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:392)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:596)
at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:473)
I've tried both pyspark and spark-shell on three sets of newly installed HDP 2.6.5.0-292. The DataFrame write functions work well; only show() throws the error. Has anyone encountered the same issue? How can I fix this problem?
Labels:
- Hortonworks Data Platform (HDP)
11-19-2017
01:15 AM
This is really great. I think the key point here is the -Dhdp.version setting; it still works for HDP version 2.6.3.0-235:
spark.driver.extraJavaOptions -Dhdp.version=2.5.0.0-817
spark.yarn.am.extraJavaOptions -Dhdp.version=2.5.0.0-817
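A hedged aside (not from the original answer): the same settings can also be passed per job on the spark-submit command line instead of spark-defaults.conf:
spark-submit --conf "spark.driver.extraJavaOptions=-Dhdp.version=2.5.0.0-817" --conf "spark.yarn.am.extraJavaOptions=-Dhdp.version=2.5.0.0-817" ...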
06-12-2016
06:43 PM
Just checked the pom.xml file for Phoenix 4.7: it's built against Hadoop 2.5.1, where a container ID looks like container_1465095377475_0007_02_000001, while in Hadoop 2.7.1 a container ID looks like container_e03_1465095377475_0007_02_000001. So the old version of the class org.apache.hadoop.yarn.util.ConverterUtils.toContainerId couldn't handle the new version's container IDs. I should raise this problem in the Phoenix community as well.
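To make the format difference concrete, here is a small hedged sketch (my own illustration, not Phoenix or Hadoop code) of a parser that accepts both layouts; the optional e<epoch> segment is exactly what the old ConverterUtils.toContainerId predates:

```python
import re

# Both container ID layouts from the post: Hadoop 2.5.1 has no epoch
# segment, while Hadoop 2.7.1 prefixes one (e.g. "e03").
OLD_ID = "container_1465095377475_0007_02_000001"
NEW_ID = "container_e03_1465095377475_0007_02_000001"

# One pattern tolerating both: an optional epoch, then
# clusterTimestamp_applicationId_attemptId_containerId.
CONTAINER_ID_RE = re.compile(
    r"^container_(?:e(?P<epoch>\d+)_)?"
    r"(?P<cluster_ts>\d+)_(?P<app_id>\d+)_(?P<attempt>\d+)_(?P<container>\d+)$"
)

for cid in (OLD_ID, NEW_ID):
    m = CONTAINER_ID_RE.match(cid)
    print(cid, "->", m.groupdict() if m else "unparsable")
```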