Member since: 04-24-2017
106 Posts · 13 Kudos Received · 7 Solutions
My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
|  | 1960 | 11-25-2019 12:49 AM |
|  | 3117 | 11-14-2018 10:45 AM |
|  | 2764 | 10-15-2018 03:44 PM |
|  | 2777 | 09-25-2018 01:54 PM |
|  | 2920 | 08-03-2018 09:47 AM |
			
    
	
		
		
11-25-2019 12:49 AM · 1 Kudo
To answer my own question: since I'm using multiple partitions for the Kafka topic, Spark uses more executors to process the data. Hive/Tez likewise creates as many worker containers as the topic has partitions.
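For reference, a minimal sketch of creating the topic with more partitions via the Kafka AdminClient; the broker list and topic name are taken from the question below, and the partition count of 8 is an illustrative choice, not a recommendation:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreatePartitionedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Broker list from the question; adjust for your cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                "server1:6667,server2:6667,server3:6667");
        try (AdminClient admin = AdminClient.create(props)) {
            // 8 partitions here is illustrative: roughly one partition per
            // consuming executor/Tez task you want to run in parallel.
            NewTopic topic = new NewTopic("testtopic", 8, (short) 3);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```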
						
					
			
    
	
		
		
11-24-2019 11:18 PM
I wrote a Kafka producer that sends some simulated data to a Kafka topic (replication factor 3, one partition).

Now I want to access this data using Hive and/or Spark Streaming.

First approach: an external Hive table with the KafkaStorageHandler:

```sql
CREATE EXTERNAL TABLE mydb.kafka_timeseriestest (
  description string,
  version int,
  ts timestamp,
  varname string,
  varvalue float
)
STORED BY 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
TBLPROPERTIES (
  "kafka.topic" = "testtopic",
  "kafka.bootstrap.servers" = "server1:6667,server2:6667,server3:6667"
);

-- e.g. SELECT max(varvalue) FROM mydb.kafka_timeseriestest;
-- takes too long, and only one Tez task is running
```

Second approach: a Spark Streaming app that subscribes to the Kafka topic:

```java
// Started with 10 executors, but only one executor is active.
...
JavaInputDStream<ConsumerRecord<String, String>> stream =
    KafkaUtils.createDirectStream(
        jssc,
        LocationStrategies.PreferConsistent(),
        ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams));
...
```

In both cases, only one Tez/Spark worker is active, so reading all the data (~500 million entries) takes a very long time. How can I increase the performance? Is the issue caused by the one-partition topic? If so, is there a rule of thumb for choosing the number of partitions?

I'm using an HDP 3.1 cluster, running Spark, Hive and Kafka on multiple nodes:

- dataNode1 - dataNode3: Hive + Spark + Kafka broker
- dataNode4 - dataNode8: Hive + Spark
						
					
    
	
		
		
08-29-2019 05:28 AM · 1 Kudo
I've upgraded to HDP 3.1 and now want to read a Hive external table in my Spark application. The following table shows the compatibilities: https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.1.0/integrating-hive/content/hive_configure_a_spark_hive_connection.html

I don't have LLAP activated, so it seems that I'm restricted in the Spark -> Hive access and vice versa, right?

But the compatibility table says that I can access external Hive tables from Spark without using the HWC (and also without LLAP), with the hint that the table must be defined in the Spark catalog. What do I have to do here?

I tried the following code, but it says "Table not found":

```java
SparkSession session = SparkSession.builder()
    .config("spark.executor.instances", "4")
    .master("yarn-client")
    .appName("Spark LetterCount")
    .config("hive.metastore.uris", "thrift://myhost.com:9083")
    .config("hive.metastore.warehouse.dir", "/warehouse/tablespace/managed/hive")
    .config("hive.metastore.warehouse.external.dir", "/warehouse/tablespace/external/hive")
    .config("spark.sql.warehouse.dir", new File("spark-warehouse").getAbsolutePath())
    .config("spark.sql.hive.hiveserver2.jdbc.url", "jdbc:hive2://localhost:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;user=student30")
    .enableHiveSupport()
    .getOrCreate();

Dataset<Row> dsRead = session.sql("SELECT * FROM hivedb.external_table");
System.out.println(dsRead.count());
```

The exception:

```
Exception in thread "main" org.apache.spark.sql.AnalysisException: Table or view not found: `hivedb`.`external_table`; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation `hivedb`.`external_table`
	at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:86)
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:84)
	at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:127)
	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
	at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:126)
	at scala.collection.immutable.List.foreach(List.scala:381)
	at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:126)
	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:84)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:92)
	at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:105)
	at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:57)
	at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:55)
	at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:47)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:74)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
	at main.SparkSQLExample.main(SparkSQLExample.java:41)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:904)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```

Can someone help me solve the issue? Thank you!
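One reading of the "table must be defined in Spark catalog" hint is to register a table in Spark's own catalog over the same HDFS location the Hive external table uses. A minimal sketch, with a hypothetical schema, storage format and path (take the real ones from `DESCRIBE FORMATTED hivedb.external_table` in Hive); this is an assumption, not a verified HDP recipe:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkCatalogOverExternalData {
    public static void main(String[] args) {
        SparkSession session = SparkSession.builder()
            .appName("Spark catalog over Hive external data")
            .enableHiveSupport()
            .getOrCreate();

        // Hypothetical schema/format/location -- placeholders only.
        session.sql("CREATE DATABASE IF NOT EXISTS hivedb");
        session.sql(
            "CREATE TABLE IF NOT EXISTS hivedb.external_table (id INT, name STRING) " +
            "USING ORC " +
            "LOCATION '/warehouse/tablespace/external/hive/hivedb.db/external_table'");

        // Once the table exists in Spark's catalog, the original query resolves.
        Dataset<Row> dsRead = session.sql("SELECT * FROM hivedb.external_table");
        System.out.println(dsRead.count());
    }
}
```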
						
					
			
    
	
		
		
11-14-2018 10:45 AM
I found the following Java-based solution for me: using the Dataset.filter method with a FilterFunction: https://spark.apache.org/docs/2.3.0/api/java/index.html?org/apache/spark/sql/Dataset.html

So my code now looks like this:

```java
Dataset<Row> dsResult = sqlC.read()
    .format("org.apache.phoenix.spark")
    .option("table", tableName)
    .option("zkUrl", hbaseUrl).load()
    .where("OTHER_COLUMN = " + inputId)
    .filter(row -> {
        // Compare the row's timestamp against the user-supplied bounds.
        long readTime = row.getTimestamp(row.fieldIndex("TABLE_TS_COL")).getTime();
        long tsFrom = new Timestamp(sdf.parse(dateFrom).getTime()).getTime();
        long tsTo = new Timestamp(sdf.parse(dateTo).getTime()).getTime();
        return readTime >= tsFrom && readTime <= tsTo;
    });
```
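Note: the post never shows how `sdf` is constructed. A plausible reconstruction, assuming the `yyyy-MM-dd HH:mm:ss` input format mentioned in the question further down ("2018-11-14 01:02:03"):

```java
import java.text.SimpleDateFormat;

// Hypothetical: not shown in the original post.
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
```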
 
						
					
			
    
	
		
		
11-14-2018 08:10 AM
I have a Phoenix table that I can access via SparkSQL (with the Phoenix Spark plugin). The table also has a Timestamp column.

I have to filter this Timestamp column by a user input, like 2018-11-14 01:02:03, so I want to filter my Dataset (which represents the read Phoenix table) with the where / filter methods. My current Java code looks like this:

```java
Timestamp t1 = new Timestamp(sdf.parse(dateFrom).getTime());
Timestamp t2 = new Timestamp(sdf.parse(dateTo).getTime());

Column c1 = new Column("TABLE_TS_COL").geq(t1);
Column c2 = new Column("TABLE_TS_COL").leq(t2);

Dataset<Row> dsResult = sqlContext.read()
  .format("org.apache.phoenix.spark")
  .option("table", tableName)
  .option("zkUrl", hbaseUrl).load()
  .where("OTHER_COLUMN = " + inputId) // This works
  .where(c1)  // Problem!
  .where(c2); // Problem!
```

But this leads to the following exception:

```
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.phoenix.exception.PhoenixParserException: ERROR 604 (42P00): Syntax error. Mismatched input. Expecting "RPAREN", got "06" at line 1, column 474.
```

My Spark History UI shows the following select statement:

```
...
18/11/14 08:54:58 INFO PhoenixInputFormat: Select Statement: SELECT "OTHER_COLUMN", "TABLE_TS_COL" FROM HBASE_TEST3 WHERE ( "OTHER_COLUMN" = 0 AND "OTHER_COLUMN" IS NOT NULL AND "TABLE_TS_COL" IS NOT NULL AND "TABLE_TS_COL" >= 2018-09-24 06:49:01.0 AND "TABLE_TS_COL" <= 2018-09-24 06:49:01.0)
```

To me it looks like the quotation marks are missing around the timestamp values (not sure about that)? How can I filter a Timestamp column by a user input in Java and SparkSQL?
						
					
Labels: Apache Phoenix, Apache Spark
    
	
		
		
10-15-2018 03:44 PM
Solved it - Phoenix arrays are 1-based, so the following query works:

```sql
SELECT REGEXP_SPLIT(ROWKEY, ':')[1] AS test, count(1)
FROM "my_view"
GROUP BY REGEXP_SPLIT(ROWKEY, ':')[1]
```
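For illustration (the row-key layout comes from the question below; the sample output comments are hypothetical), the 1-based indexing looks like this:

```sql
-- Given ROWKEY = '<hash>:<attributeA>_<attributeB>',
-- REGEXP_SPLIT(ROWKEY, ':') yields a two-element array.
-- Phoenix indexes arrays from 1; index [0] is what triggered the
-- ArrayIndexOutOfBoundsException seen in the question below.
SELECT REGEXP_SPLIT(ROWKEY, ':')[1] AS hash_part,  -- '<hash>'
       REGEXP_SPLIT(ROWKEY, ':')[2] AS attr_part   -- '<attributeA>_<attributeB>'
FROM "my_view" LIMIT 5;
```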
						
					
			
    
	
		
		
10-15-2018 03:40 PM
I have a Phoenix view with a row key column ROWKEY that has a layout like this: <hash>:<attributeA>_<attributeB>

I want to count the rows for each <hash> value of my table. Therefore I need to group my view by the <hash> value, which I get when I split the row key column. I tried to use the REGEXP_SPLIT function of Phoenix, but I get an exception:

```sql
%jdbc(phoenix)
SELECT REGEXP_SPLIT(ROWKEY, ':')[0] as test, count(1) FROM "my_view" GROUP BY REGEXP_SPLIT(ROWKEY, ':')[0]
```

The exception:

```
org.apache.phoenix.exception.PhoenixIOException: org.apache.phoenix.exception.PhoenixIOException: org.apache.hadoop.hbase.DoNotRetryIOException: my_view,000,1539582312877.93d8d6e785eae60fedac3c6088b4e556.: 32767
	at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:93)
	at org.apache.phoenix.util.ServerUtil.throwIOException(ServerUtil.java:59)
	at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:271)
	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$52.call(RegionCoprocessorHost.java:1301)
	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1660)
	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1734)
	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperationWithResult(RegionCoprocessorHost.java:1699)
	at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postScannerOpen(RegionCoprocessorHost.java:1296)
	at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2404)
	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32385)
	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 32767
	at org.apache.phoenix.schema.types.PArrayDataType.positionAtArrayElement(PArrayDataType.java:418)
	at org.apache.phoenix.schema.types.PArrayDataType.positionAtArrayElement(PArrayDataType.java:379)
	at org.apache.phoenix.expression.function.ArrayIndexFunction.evaluate(ArrayIndexFunction.java:64)
	at org.apache.phoenix.util.TupleUtil.getConcatenatedValue(TupleUtil.java:101)
	at org.apache.phoenix.coprocessor.GroupedAggregateRegionObserver.scanUnordered(GroupedAggregateRegionObserver.java:418)
	at org.apache.phoenix.coprocessor.GroupedAggregateRegionObserver.doPostScannerOpen(GroupedAggregateRegionObserver.java:162)
	at org.apache.phoenix.coprocessor.BaseScannerRegionObserver.postScannerOpen(BaseScannerRegionObserver.java:237)
	... 11 more
[the same DoNotRetryIOException -> ArrayIndexOutOfBoundsException: 32767 cause chain repeats verbatim through the client-side ExecutionException, TableResultIterator and HBase RPC layers]
```

What is the correct query?
						
					
Labels: Apache Phoenix
    
	
		
		
09-25-2018 01:54 PM · 1 Kudo
The problem was solved after changing the MySQL database URL from

```
jdbc:mysql://xxxx.yyyy/hive?createDatabaseIfNotExist=true
```

to

```
jdbc:mysql://xxxx.yyyy/hive?createDatabaseIfNotExist=true&serverTimezone=Europe/Berlin
```

I found the relevant information here: https://community.hortonworks.com/questions/218023/error-setting-up-hive-on-hdp-265timezone-on-mysql.html
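Alternatively (an assumption, not part of the original thread), the ambiguity can be fixed on the MySQL server itself, so the abbreviation 'CEST' never reaches the driver:

```sql
-- Assumes the MySQL time zone tables have been loaded
-- (mysql_tzinfo_to_sql) and the account may change global settings.
SET GLOBAL time_zone = 'Europe/Berlin';

-- Verify the setting:
SELECT @@global.time_zone;
```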
						
					
			
    
	
		
		
09-25-2018 01:27 PM
I'm just setting up a Hortonworks Data Platform 3.0 installation. When I want to start the services (for the first time), starting the Hive Metastore throws an exception:

```
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
Underlying cause: java.sql.SQLException : The server time zone value 'CEST' is unrecognized or represents more than one time zone. You must configure either the server or JDBC driver (via the serverTimezone configuration property) to use a more specifc time zone value if you want to utilize time zone support.
SQL Error code: 0
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
	at org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getConnectionToMetastore(HiveSchemaHelper.java:94)
	at org.apache.hive.beeline.HiveSchemaTool.getConnectionToMetastore(HiveSchemaTool.java:169)
	at org.apache.hive.beeline.HiveSchemaTool.testConnectionToMetastore(HiveSchemaTool.java:475)
	at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:581)
	at org.apache.hive.beeline.HiveSchemaTool.doInit(HiveSchemaTool.java:567)
	at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1539)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
Caused by: java.sql.SQLException: The server time zone value 'CEST' is unrecognized or represents more than one time zone. You must configure either the server or JDBC driver (via the serverTimezone configuration property) to use a more specifc time zone value if you want to utilize time zone support.
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:129)
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:89)
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:63)
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:73)
	at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:76)
	at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:832)
	at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:456)
	at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:240)
	at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:207)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:247)
	at org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getConnectionToMetastore(HiveSchemaHelper.java:88)
	... 11 more
```

On my CentOS system I set the timezone to Europe/Berlin:

```
ls -l /etc/localtime
lrwxrwxrwx. 1 root root 35 Sep 25 10:45 /etc/localtime -> ../usr/share/zoneinfo/Europe/Berlin

timedatectl | grep -i 'time zone'
      Time zone: Europe/Berlin (CEST, +0200)
```

Does anyone know how to solve this problem? Thank you!
						
					
			
    
	
		
		
09-14-2018 10:27 AM
					
@Felix Albani Thank you for your help! Without the LIMIT clause, the job works perfectly (and in parallel).
						
					