hive llap commands failing from spark-shell


Has anyone experienced similar issues?

We are using HDP 3.1.

Connection details:


spark-shell --master yarn --driver-memory 7g --executor-memory 7g \
  --conf spark.yarn.queue=SPARK \
  --conf spark.sql.hive.hiveserver2.jdbc.url="jdbc:hive2://<servername>;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-interactive" \
  --conf spark.datasource.hive.warehouse.metastoreUri=thrift://<servername>:9083 \
  --conf spark.hadoop.hive.zookeeper.quorum="<servername>:2181" \
  --conf spark.hadoop.hive.llap.daemon.service.hosts=@llap0 \
  --conf spark.security.credentials.hiveserver2.enabled=false \
  --jars /usr/hdp/3.1.0.0-78/hive_warehouse_connector/hive-warehouse-connector-assembly-1.0.0.3.1.0.0-78.jar


Commands:

scala> val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()

hive: com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl = com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl@194eae3e


scala> hive.showDatabases().show(100, false)

+------------------+

|database_name |

+------------------+

|default |

|information_schema|

|sandbox |

|sys |

+------------------+


The fact that showDatabases() works fine suggests that it can connect to the Hive metastore without any issues.


But the moment we try any aggregate function, as below, it fails. Perhaps YARN or permissions are playing a role here?


scala> val df = hive.executeQuery("select * from sandbox.pwr_iso_hourly_node_data_orc_sm")

df: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [iso_cd: string, node_cd: string ... 8 more fields]


scala> df.count()

res1: Long = 16481152


scala> val df2 = df.groupBy("iso_cd").count().orderBy("iso_cd")

df2: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [iso_cd: string, count: bigint]


scala> df2.count()

19/08/08 15:27:55 WARN TaskSetManager: Stage 2 contains a task of very large size (452 KB). The maximum recommended task size is 100 KB.

[Stage 2:> (0 + 1) / 1]19/08/08 15:27:56 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 101, hdpnedevbal802.subdomain.example.com, executor 1): java.io.IOException: Received reader event error: Received submission failed event for fragment ID attempt_8500257554648810195_0047_0_00_000000_0: java.lang.RuntimeException: Failed to submit: attempt_8500257554648810195_0047_0_00_000000_0

at org.apache.hadoop.hive.llap.LlapBaseRecordReader.failOnInterruption(LlapBaseRecordReader.java:178)

at org.apache.hadoop.hive.llap.LlapArrowBatchRecordReader.next(LlapArrowBatchRecordReader.java:79)

at org.apache.hadoop.hive.llap.LlapArrowBatchRecordReader.next(LlapArrowBatchRecordReader.java:37)

at com.hortonworks.spark.sql.hive.llap.HiveWarehouseDataReader.next(HiveWarehouseDataReader.java:75)

at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:49)

at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)

at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.datasourcev2scan_nextBatch_0$(Unknown Source)

at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.agg_doAggregateWithKeys_0$(Unknown Source)

at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)

at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)

at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)

at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)

at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)

at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)

at org.apache.spark.scheduler.Task.run(Task.scala:109)

at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

at java.lang.Thread.run(Thread.java:748)

Caused by: java.nio.channels.ClosedByInterruptException

at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)

at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:387)

at org.apache.arrow.vector.ipc.ReadChannel.readFully(ReadChannel.java:57)

at org.apache.arrow.vector.ipc.message.MessageChannelReader.readNextMessage(MessageChannelReader.java:56)

at org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeSchema(MessageSerializer.java:104)

at org.apache.arrow.vector.ipc.ArrowStreamReader.readSchema(ArrowStreamReader.java:128)

at org.apache.arrow.vector.ipc.ArrowReader.initialize(ArrowReader.java:181)

at org.apache.arrow.vector.ipc.ArrowReader.ensureInitialized(ArrowReader.java:172)

at org.apache.arrow.vector.ipc.ArrowReader.prepareLoadNextBatch(ArrowReader.java:211)

at org.apache.arrow.vector.ipc.ArrowStreamReader.loadNextBatch(ArrowStreamReader.java:103)

at org.apache.hadoop.hive.llap.LlapArrowBatchRecordReader.next(LlapArrowBatchRecordReader.java:63)

... 18 more


As suggested on another forum, we set the priority of the LLAP queue (1) higher than the rest of the queues (0), but no luck.
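For reference, the queue-priority change described above would typically be made in capacity-scheduler.xml (or through Ambari). This is only a sketch of what that change looks like; the queue names (llap, SPARK) are assumptions based on this post, and higher values mean higher priority:

```xml
<!-- capacity-scheduler.xml sketch; queue names llap/SPARK are assumed from the post -->
<property>
  <!-- Raise the LLAP queue above the default priority of 0 -->
  <name>yarn.scheduler.capacity.root.llap.priority</name>
  <value>1</value>
</property>
<property>
  <!-- Other queues, e.g. the SPARK queue, stay at the default priority -->
  <name>yarn.scheduler.capacity.root.SPARK.priority</name>
  <value>0</value>
</property>
```

After editing the file, the scheduler configuration must be refreshed (e.g. yarn rmadmin -refreshQueues) for the change to take effect.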


Any suggestions appreciated.

2 Replies


I am also getting the same issue when running a join in Spark SQL between two tables (loaded into DataFrames using the Hive Warehouse Connector and then registered as Spark temp tables).

Could you please let me know if there is any solution?
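The workflow described above can be sketched as follows in spark-shell. This assumes a session started with the HWC assembly jar and configuration as in the original post; the table and column names here are hypothetical, for illustration only:

```scala
// Sketch of the reply's workflow: load two tables via HWC, register as temp
// views, then join in Spark SQL. Table/column names are hypothetical.
import com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder

val hive = HiveWarehouseBuilder.session(spark).build()

// Load both tables through the Hive Warehouse Connector (LLAP read path)
val ordersDf    = hive.executeQuery("select * from sandbox.orders")
val customersDf = hive.executeQuery("select * from sandbox.customers")

// Register them as Spark temp views so they can be joined in Spark SQL
ordersDf.createOrReplaceTempView("orders")
customersDf.createOrReplaceTempView("customers")

val joined = spark.sql(
  """select c.customer_id, count(*) as order_cnt
    |from orders o
    |join customers c on o.customer_id = c.customer_id
    |group by c.customer_id""".stripMargin)
joined.show()
```

Note that, as in the original question, it is the shuffle/aggregation stage of such a join that triggers the LLAP Arrow reader error, not the initial executeQuery calls.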



Did you find any solution to this problem? I am facing a similar error:

java.io.IOException: Received reader event error: Received task killed event for task ID attempt_1919285194944295157_12206_0_00_000055_3

In the YARN logs:

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 55 in stage 1.0 failed 4 times, most recent failure:....
	at org.apache.hadoop.hive.llap.LlapBaseRecordReader.failOnInterruption(LlapBaseRecordReader.java:178)
	at org.apache.hadoop.hive.llap.LlapArrowBatchRecordReader.next(LlapArrowBatchRecordReader.java:79)