Member since
09-28-2017
34
Posts
2
Kudos Received
3
Solutions
11-13-2017
03:21 PM
@Abdelkrim Hadjidj I tried this new version, and when I load the Spark app using YARN cluster mode, I see that the Spark version is still 2.1.1.2.6.2.0-205 and not 2.2... What am I doing wrong?
10-25-2017
02:24 PM
Found the issue. Turns out that since I run my Spark application with --master yarn, I should remove this part from the SparkSession builder: .master(masterUrl)
10-25-2017
08:54 AM
@Aditya Sirna I suspect my issue is different. I have 3 healthy nodes; see attached yarn-cluster.jpg.
10-25-2017
08:21 AM
Diagnostics: AM container is launched, waiting for AM container to Register with RM
10-25-2017
08:12 AM
I am trying to run a Spark app in an HDP cluster.
My app keeps failing on: ERROR ApplicationMaster: SparkContext did not initialize after waiting for 100000 ms. Please check earlier log output for errors. Failing the application.
In the previous lines the only error (which I see a few times in the log) is:
WARN DataSource: Error while looking for metadata directory.
Any ideas?
- Tags:
- spark2
- yarn-container
10-25-2017
06:51 AM
@Aditya Sirna That didn't help. I now have 3 node managers; still, it seems stuck in the ACCEPTED state. Note: under http://my-node:8042/node/allApplications I do see the container running, and the logs show the application IS running. So it's even stranger that the app seems to be stuck in ACCEPTED mode for so long...
10-24-2017
02:21 PM
@Aditya Sirna Thanks. Is it a problem to install the node manager on all 3 nodes? I think the auto-install process installed it on one of the nodes automatically. One would think that if the auto-install only put it on one node, that's the way it should be, no? Are 3 nodes enough for sandboxing, or should we have at least 5, as recommended for ZooKeeper master election?
10-24-2017
02:13 PM
I installed HDP on 3 nodes, and it seems that YARN is running on only a single node; Spark applications also run on one node only, and work is not distributed across the nodes. Where can I look to understand the issue?
10-24-2017
12:57 PM
1 Kudo
I am trying to run a Spark application that runs fine in local mode. I am running it like this:
/usr/hdp/2.6.2.0-205/spark2/bin/spark-submit --class MyMain \
--master yarn \
--deploy-mode cluster \
--executor-memory 2G \
--num-executors 10 \
framework-1.0.0-0-all.jar
But it takes forever to start, and in the Hadoop application UI I see this status:
YarnApplicationState: ACCEPTED: waiting for AM container to be allocated, launched and register with RM
In the console I see this line every second for over 10 minutes:
17/10/24 15:57:30 INFO Client: Application report for application_1508848914801_0003 (state: ACCEPTED)
Any ideas?
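One common cause of an app parking in ACCEPTED - an assumption here, not something confirmed in this thread - is requesting more memory than the YARN queue can allocate, so the AM container is never granted. Rough arithmetic for the submit above, as a sketch:

```java
public class YarnRequestSize {
    public static void main(String[] args) {
        int numExecutors = 10;      // --num-executors 10
        double executorMemGb = 2.0; // --executor-memory 2G
        // In Spark 2.x on YARN, spark.yarn.executor.memoryOverhead defaults
        // to max(384 MB, 10% of executor memory).
        double overheadGb = Math.max(0.384, 0.10 * executorMemGb);
        double totalGb = numExecutors * (executorMemGb + overheadGb);
        // The AM container needs memory on top of this total.
        System.out.println("total executor memory requested ~ " + totalGb + " GB");
    }
}
```

If that total (plus the AM) exceeds what the queue offers, the report stays in ACCEPTED indefinitely; lowering --num-executors or --executor-memory is a quick way to test this theory.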
10-18-2017
09:13 AM
@bkosaraju I tried running this in the Hive UI. The only error in the log is:
2017-10-18 11:55:34,252 ERROR [Atlas Logger 3]: metadata.Hive (Hive.java:getTable(1215)) - Table values__tmp__table__1 not found: default.values__tmp__table__1 table not found
2017-10-18 11:55:34,252 ERROR [Atlas Logger 3]: hook.HiveHook (HiveHook.java:run(205)) - Atlas hook failed due to error
java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1884)
at org.apache.atlas.hive.hook.HiveHook$2.run(HiveHook.java:195)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.atlas.hook.AtlasHookException: HiveHook.registerProcess() failed.
at org.apache.atlas.hive.hook.HiveHook.registerProcess(HiveHook.java:701)
at org.apache.atlas.hive.hook.HiveHook.collect(HiveHook.java:268)
at org.apache.atlas.hive.hook.HiveHook.access$200(HiveHook.java:83)
at org.apache.atlas.hive.hook.HiveHook$2$1.run(HiveHook.java:198)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
... 6 more
Caused by: org.apache.atlas.hook.AtlasHookException: HiveHook.processHiveEntity() failed.
at org.apache.atlas.hive.hook.HiveHook.processHiveEntity(HiveHook.java:731)
at org.apache.atlas.hive.hook.HiveHook.registerProcess(HiveHook.java:668)
... 12 more
Caused by: org.apache.atlas.hook.AtlasHookException: HiveHook.createOrUpdateEntities() failed.
at org.apache.atlas.hive.hook.HiveHook.createOrUpdateEntities(HiveHook.java:597)
at org.apache.atlas.hive.hook.HiveHook.processHiveEntity(HiveHook.java:711)
... 13 more
Caused by: org.apache.atlas.hook.AtlasHookException: HiveHook.createOrUpdateEntities() failed.
at org.apache.atlas.hive.hook.HiveHook.createOrUpdateEntities(HiveHook.java:589)
at org.apache.atlas.hive.hook.HiveHook.createOrUpdateEntities(HiveHook.java:595)
Not sure it's related. In the Hive UI it runs much faster, so maybe this is an issue with how hive-jdbc interacts with the Hive server? I don't see anything wrong in the resource UI. In the Tez UI, for example, I see this query: SELECT COUNT(*) FROM sandbox.t_14 It took more than 4 minutes, and the table has only 1000 rows...
10-18-2017
06:37 AM
I have this table (copying the DDL from the Hive2 view):
CREATE TABLE `number_generator`(
`last_number` int)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://hdfs_host:8020/apps/hive/warehouse/sandbox.db/number_generator'
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='{\"BASIC_STATS\":\"true\"}',
'last_modified_by'='admin',
'last_modified_time'='1508308017',
'numFiles'='1',
'numRows'='1',
'rawDataSize'='2',
'totalSize'='3',
'transient_lastDdlTime'='1508308020')
Each time I want to use this table for number generation I do the following:
TRUNCATE TABLE sandbox.number_generator
This finishes quite quickly. After this I run:
INSERT INTO sandbox.number_generator (last_number) VALUES (4)
This takes over 10 minutes to complete. Any ideas why?
10-16-2017
08:24 AM
The comment helped, not the original answer - but I can't mark a comment as 'accepted'.
10-16-2017
07:55 AM
Turns out that the user/password property names sent to the Hive driver in connProps were wrong: Connection connection = hiveDriver.connect(serverURL, connProps); So the user and password info wasn't sent at all. Once fixed, the query works.
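For reference, a minimal sketch of the fix, assuming the standard JDBC property keys (the actual wrong key names aren't shown above): the driver reads "user" and "password" from the Properties object, and any other key names are silently ignored.

```java
import java.util.Properties;

public class HiveConnProps {
    // Build connection properties with the key names JDBC drivers actually
    // read: "user" and "password". Keys such as "username" or "pwd" would be
    // silently ignored, so the credentials never reach the server.
    static Properties buildConnProps(String user, String password) {
        Properties connProps = new Properties();
        connProps.setProperty("user", user);
        connProps.setProperty("password", password);
        return connProps;
    }

    public static void main(String[] args) {
        Properties connProps = buildConnProps("admin", "secret");
        // Would then be passed to the driver as in the post:
        // Connection connection = hiveDriver.connect(serverURL, connProps);
        System.out.println(connProps.stringPropertyNames());
    }
}
```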
10-16-2017
07:20 AM
@Aditya Sirna The thing is, I don't know where this 'anonymous' came from (as I am using the admin user), and I don't even have a folder named /user/anonymous.
10-16-2017
05:15 AM
I am running a simple SELECT query that fails on some strange permission issue. I am connecting with a specific user and password. See the attached log: hive-jdbc-permissions-error.txt
10-15-2017
01:33 PM
On the Hive node I don't have a jdbc folder. This is what I have:
drwxr-xr-x 7 root root 4096 Sep 27 13:31 ./
drwxr-xr-x 35 root root 4096 Sep 27 15:16 ../
drwxr-xr-x 3 root root 4096 Sep 27 13:28 bin/
lrwxrwxrwx 1 root root 23 Sep 27 13:31 conf -> /etc/hive/2.6.2.0-205/0/
drwxr-xr-x 3 root root 4096 Sep 27 13:28 doc/
-rw-r--r-- 1 root root 106117252 Aug 26 12:31 hive.tar.gz
drwxr-xr-x 5 root root 12288 Sep 27 15:25 lib/
drwxr-xr-x 2 root root 4096 Sep 27 13:28 man/
drwxr-xr-x 3 root root 4096 Sep 27 13:28 scripts/
10-15-2017
10:19 AM
Thanks. Is this for Simba specifically? My app is a JDBC-based app - I get the jars using Maven, just set the driver in config, and expect the code to work. From the zip for JDBC 4.1, it's not clear which version of hive-jdbc is used. For example, this zip uses the HS1Driver, while I thought I should use org.apache.hive.jdbc.HiveDriver.
10-15-2017
08:26 AM
I tried the latest I found in the Maven repo, but it didn't work. The only version I got working is hive-jdbc-2.0.0.
10-03-2017
12:58 PM
I have an existing app that interacts via JDBC. When connecting it to Hive2 (using hive-jdbc-2.0.0 and HDP 2.6.x), I get SQLException("Method not supported") on calling commit. What is a plausible workaround for this?
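One workaround sketch (an assumption, not an official recipe): Hive statements auto-commit, and the driver signals unsupported transaction operations with exactly that SQLException, so the existing app can swallow that specific error and rethrow everything else. The stand-in Connection built with a dynamic Proxy below is only for demonstration.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

public class SafeCommit {
    // Hive's JDBC driver rejects commit() with SQLException("Method not
    // supported") because every HiveQL statement auto-commits. Swallow
    // exactly that error; rethrow anything else.
    static void safeCommit(Connection conn) throws SQLException {
        try {
            conn.commit();
        } catch (SQLException e) {
            String msg = e.getMessage();
            if (msg == null || !msg.contains("Method not supported")) {
                throw e; // a real failure, not the unsupported-operation signal
            }
            // otherwise ignore: Hive has already committed each statement
        }
    }

    public static void main(String[] args) throws SQLException {
        // Stand-in Connection that mimics the Hive driver's behavior.
        Connection fake = (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[]{Connection.class},
                (proxy, method, margs) -> {
                    if (method.getName().equals("commit")) {
                        throw new SQLException("Method not supported");
                    }
                    return null;
                });
        safeCommit(fake); // completes without throwing
        System.out.println("unsupported commit ignored");
    }
}
```

Matching on the message string is brittle; if the driver exposes a proper SQLFeatureNotSupportedException subtype, catching that instead would be cleaner.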
10-03-2017
04:42 AM
@Shu Now that the initial load works for me - what are the Parquet format's pros and cons? Is creating a new table and copying data from the text-based table into it worth the trouble?
10-03-2017
04:36 AM
@Jay SenSharma @Shu So - now it works. I just connected as the admin user instead of the hive user, and now it all works fine. Thanks for your help.
10-02-2017
01:10 PM
@Shu hadoop fs -ls /apps/hive/ yields:
Found 1 items
drwxrwxrwx - hive hadoop 0 2017-10-02 14:00 /apps/hive/warehouse
You mean the LOAD statement and not the INSERT, right? Attached are the error and the last lines from the log.
10-02-2017
11:24 AM
@Geoffrey Shelton Okot The root password change was just something I had to do, as MySQL was not letting me in with the default credentials. The add-user-for-hive/change-privileges step, as they suggest in the above-mentioned link, really solved the issue.
10-02-2017
08:53 AM
@Shu The ls output:
Found 1 items
drwxrwxrwx - hive hadoop 0 2017-10-01 18:01 /apps/hive/warehouse/csvdemo
Which of these do you need?
drwxr-xr-x 2 hive hadoop 4096 Oct 2 00:00 ./
drwxrwxr-x 40 root syslog 4096 Oct 2 06:25 ../
-rw-r--r-- 1 hive hadoop 0 Oct 1 13:51 atlas_hook_failed_messages.log
-rw-r--r-- 1 hive hadoop 312 Oct 1 14:36 hive.err
-rw-r--r-- 1 hive hadoop 581226 Oct 2 11:51 hivemetastore.log
-rw-r--r-- 1 hive hadoop 290424 Sep 28 23:57 hivemetastore.log.2017-09-28
-rw-r--r-- 1 hive hadoop 1195294 Sep 29 23:57 hivemetastore.log.2017-09-29
-rw-r--r-- 1 hive hadoop 1173882 Sep 30 23:57 hivemetastore.log.2017-09-30
-rw-r--r-- 1 hive hadoop 1309208 Oct 1 23:57 hivemetastore.log.2017-10-01
-rw-r--r-- 1 hive hadoop 31 Oct 1 14:36 hive.out
-rw-r--r-- 1 hive hadoop 112944 Oct 1 18:02 hive-server2.err
-rw-r--r-- 1 hive hadoop 1189192 Oct 2 11:51 hiveserver2.log
-rw-r--r-- 1 hive hadoop 32095865 Sep 27 16:30 hiveserver2.log.2017-09-27
-rw-r--r-- 1 hive hadoop 88093181 Sep 28 23:57 hiveserver2.log.2017-09-28
-rw-r--r-- 1 hive hadoop 2368106 Sep 29 23:57 hiveserver2.log.2017-09-29
-rw-r--r-- 1 hive hadoop 2368598 Sep 30 23:57 hiveserver2.log.2017-09-30
-rw-r--r-- 1 hive hadoop 23657102 Oct 1 23:57 hiveserver2.log.2017-10-01
-rw-r--r-- 1 hive hadoop 0 Oct 1 14:37 hive-server2.out
10-01-2017
03:02 PM
@Shu I ran hadoop fs -chmod 777 /apps/hive/warehouse/csvdemo
Still the same error in my Java client:
java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
10-01-2017
02:07 PM
Thanks @Shu. Is the Parquet worth the double load? I mean, if I run just the first load it seems to work fine - so what am I losing here? Also, I have an issue running this load from the Java client (a permissions issue?):
java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:283) ~[hive-jdbc-2.0.0.jar:2.0.0]
10-01-2017
01:47 PM
I created a table using the Java client:
CREATE TABLE csvdemo (id Int, name String, email String) STORED AS PARQUET
I use the Java Hadoop FileSystem API to copy the CSV file from local into HDFS. When I run this load command it looks successful (running from Ambari):
load data inpath '/user/admin/MOCK_DATA.csv' into table csvdemo;
But when I try to read from it using:
select * from csvdemo limit 1;
I get this error:
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.RuntimeException: hdfs://my-host:8020/apps/hive/warehouse/csvdemo/MOCK_DATA.csv is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [103, 111, 118, 10]
org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.RuntimeException: hdfs://my-host:8020/apps/hive/warehouse/csvdemo/MOCK_DATA.csv is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [103, 111, 118, 10]
at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:264)
at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:250)
at org.apache.hive.jdbc.HiveQueryResultSet.next(HiveQueryResultSet.java:373)
at org.apache.ambari.view.hive20.actor.ResultSetIterator.getNext(ResultSetIterator.java:119)
at org.apache.ambari.view.hive20.actor.ResultSetIterator.handleMessage(ResultSetIterator.java:78)
at org.apache.ambari.view.hive20.actor.HiveActor.onReceive(HiveActor.java:38)
at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:167)
at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:97)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.RuntimeException: hdfs://my-host:8020/apps/hive/warehouse/csvdemo/MOCK_DATA.csv is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [103, 111, 118, 10]
at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:414)
at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:233)
at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:784)
at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
at com.sun.proxy.$Proxy29.fetchResults(Unknown Source)
at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:520)
at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:709)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1557)
at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1542)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.lang.RuntimeException: hdfs://my-host:8020/apps/hive/warehouse/csvdemo/MOCK_DATA.csv is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [103, 111, 118, 10]
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:520)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:427)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1765)
at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:409)
... 24 more
Caused by: java.lang.RuntimeException: hdfs://my-host:8020/apps/hive/warehouse/csvdemo/MOCK_DATA.csv is not a Parquet file. expected magic number at tail [80, 65, 82, 49] but found [103, 111, 118, 10]
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:423)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:386)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:372)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:255)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:97)
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:83)
at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:71)
at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:694)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:332)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:458)
... 28 more
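For what it's worth: LOAD DATA only moves the file into the table's directory - it does not convert the CSV to Parquet, which is why the reader chokes on the magic number. A hedged sketch of the usual two-step fix, issued from the same Java client (the staging table name csvdemo_staging is my invention):

```java
import java.sql.Connection;
import java.sql.Statement;

public class CsvToParquet {
    // HiveQL for the staging-table pattern: land the CSV in a TEXTFILE table
    // first, then have Hive rewrite the rows as Parquet via INSERT ... SELECT.
    static final String[] STEPS = {
        "CREATE TABLE csvdemo_staging (id INT, name STRING, email STRING) "
            + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE",
        "LOAD DATA INPATH '/user/admin/MOCK_DATA.csv' INTO TABLE csvdemo_staging",
        "INSERT INTO TABLE csvdemo SELECT id, name, email FROM csvdemo_staging"
    };

    // Issue the steps over an open JDBC connection to HiveServer2.
    static void run(Connection conn) throws Exception {
        try (Statement st = conn.createStatement()) {
            for (String hql : STEPS) {
                st.execute(hql); // each statement auto-commits on Hive
            }
        }
    }
}
```

The INSERT ... SELECT step is what actually writes Parquet files into csvdemo's directory, so the subsequent SELECT can read them.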
10-01-2017
01:06 PM
I am using Hortonworks HDP-2.6.2.0. I am trying to connect to Hive using a Java client. It fails on:
org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, configuration:{use:database=default})
at org.apache.thrift.TApplicationException.read(TApplicationException.java:111) ~[libthrift-0.9.3.jar!/:0.9.3]
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79) ~[libthrift-0.9.3.jar!/:0.9.3]
at org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:168) ~[hive-service-rpc-2.1.1.jar!/:2.1.1]
at org.apache.hive.service.rpc.thrift.TCLIService$Client.OpenSession(TCLIService.java:155) ~[hive-service-rpc-2.1.1.jar!/:2.1.1]
at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:576) ~[hive-jdbc-2.1.1.jar!/:2.1.1]
... 15 common frames omitted
The hive-jdbc version, as you can see, is 2.1.1. I read this may be a mismatch between client and server versions. I couldn't find anywhere what the correct mapping to my server version is. It seems to work with hive-jdbc 2.0.0 - but that's quite old. How can I use the latest?
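For reference, pinning the client to the version that actually worked here - a sketch of the Maven dependency (2.0.0 is simply the version reported working in this thread; a newer client needs a matching HiveServer2):

```xml
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-jdbc</artifactId>
  <version>2.0.0</version>
</dependency>
```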