Member since: 04-28-2016
Posts: 8
Kudos Received: 3
Solutions: 0
08-05-2017
10:08 AM
I have a single-node system just for doing minimal testing. I have this exact situation, but my laptop only has 16 GB to provide. How do I set/configure the container memory to enable a job to run (get past the ACCEPTED state)? Do I raise the container maximum to 16 GB, or do I raise it to a value the machine can never provide, like 64 GB?
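For what it's worth, here is a minimal sketch of the two YARN settings I would look at first, assuming a plain yarn-site.xml; the 8192 values are illustrative for a 16 GB laptop, not a recommendation:

<!-- yarn-site.xml: hypothetical values for a 16 GB single-node test box -->
<property>
  <!-- total memory the NodeManager may hand out to all containers -->
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>
<property>
  <!-- largest single container YARN will grant; keep it at or below the value above -->
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>8192</value>
</property>

A job stuck in the ACCEPTED state usually means YARN cannot find room for its ApplicationMaster container, so the fix is to make real memory available in containers that fit, not to advertise memory the machine does not have.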
07-19-2016
12:50 PM
Hello Vadim,

Thank you for identifying the newer release of Atlas 0.7. I can see that you folks have been very busy creating new features, especially in the area of taxonomy. Great job!

I have written Java code which will be highlighting our (IBM's) ability to pull metadata from various sources, and my code seems primed for working with this release. A few observations:

1) While the Apache Atlas interface recognizes newly added entities, the lineage display seems broken or disabled; it only shows a flat line. Any ideas why?

2) Regardless of the lineage display, my code to call the API with, say, http://192.168.1.133:21000/api/atlas/lineage/hive/table/default.anewsample@Sandbox/inputs/graph returns the expected JSON structure of edges and vertices, and for each item in the edges I can further query details of the entity using http://192.168.1.133:21000/api/atlas/entities/<guid>.

Question: how do I know the correct order of the lineage? Bear in mind that returned JSON results can never assure a given order. Any help in understanding the ordering problem?

Regards, Russ: rga@us.ibm.com
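As a point of reference, a minimal Java sketch of how the inputs/graph endpoint can be called; the class and method names are hypothetical, and only the URL pattern is taken from the post above:

// Hypothetical sketch: fetch the raw inputs/graph lineage JSON for a Hive table.
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

public class LineageFetch {
    public static String fetch(String table) throws IOException {
        URL url = new URL("http://192.168.1.133:21000/api/atlas/lineage/hive/table/"
                + table + "/inputs/graph");
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream()))) {
            for (String line; (line = in.readLine()) != null; ) {
                sb.append(line);
            }
        }
        // raw JSON containing "edges" and "vertices"; key order is not meaningful
        return sb.toString();
    }
}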
06-30-2016
05:07 PM
The response provided gives me a good head start on how to utilize the REST API, thank you! The problem I am still having is how to process the result of the call to inputs/graph; specifically, the JSON has the vertices and the edges:

{"results":{"typeName":"__tempQueryResultStruct401","values":{"edges":{"cb09a6f5-f76d-4208-8a62-17c3ca1e3e9d":["b0092981-6bcf-41a8-b0c9-34f3c5b6efd4"],"5591f85d-054a-4264-a8ee-c5ec250ccccb":["aff53ced-c48a-4b18-a321-4862a6ab84ac"],"6875ac4e-45a3-458a-b586-16e484567f9a":["cb09a6f5-f76d-4208-8a62-17c3ca1e3e9d"],"b0092981-6bcf-41a8-b0c9-34f3c5b6efd4":["5591f85d-054a-4264-a8ee-c5ec250ccccb"]},"vertices":{"aff53ced-c48a-4b18-a321-4862a6ab84ac":{"typeName":"__tempQueryResultStruct400","values":{"vertexId":{"typeName":"__IdType","values":{"guid":"aff53ced-c48a-4b18-a321-4862a6ab84ac","typeName":"sqoop_dbdatastore"},"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct"},"name":"mysql --url jdbc:mysql:\/\/localhost\/test?zeroDateTimeBehavior=convertToNull --table test_table_sqoop"},"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct"},"6875ac4e-45a3-458a-b586-16e484567f9a":{"typeName":"__tempQueryResultStruct400","values":{"vertexId":{"typeName":"__IdType","values":{"guid":"6875ac4e-45a3-458a-b586-16e484567f9a","typeName":"hive_table"},"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct"},"name":"default.russ@erietp"},"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct"},"b0092981-6bcf-41a8-b0c9-34f3c5b6efd4":{"typeName":"__tempQueryResultStruct400","values":{"vertexId":{"typeName":"__IdType","values":{"guid":"b0092981-6bcf-41a8-b0c9-34f3c5b6efd4","typeName":"hive_table"},"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct"},"name":"default.test_hive_table@erietp"},"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct"}}},"jsonClass":"org.apache.atlas.typesystem.json.InstanceSerialization$_Struct"},"requestId":"qtp1635546341-128 - c44ffa0a-4451-4661-846d-09a846a7bcfc","tableName":"default.russ@erietp"}

How do I process the vertices and the edges? In what order do I do this? This is not clear.
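One way to get a deterministic order, sketched below under the assumption that each key in "edges" points to the GUID(s) that node was derived from: walk the map from the queried table's GUID instead of relying on JSON key order. A hypothetical Java helper, assuming the edges object has already been parsed into a Map:

// Hypothetical sketch: recover lineage order by walking the "edges"
// adjacency map from the queried table's GUID; JSON key order is ignored.
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LineageOrder {
    public static List<String> orderFrom(String startGuid,
                                         Map<String, List<String>> edges) {
        List<String> ordered = new ArrayList<>();   // GUIDs, nearest-first
        Deque<String> stack = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();         // guards against cycles
        stack.push(startGuid);
        while (!stack.isEmpty()) {
            String guid = stack.pop();
            if (!seen.add(guid)) continue;          // already visited
            ordered.add(guid);
            // follow each edge to the node(s) this one was derived from
            for (String next : edges.getOrDefault(guid, Collections.<String>emptyList())) {
                stack.push(next);
            }
        }
        return ordered;
    }
}

Each GUID in the returned list can then be resolved against the vertices map, or the entities endpoint, for its details.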
05-05-2016
02:21 PM
Can someone give me a clue regarding the nature of the problem here with having a simple Hive query execute on Spark? Do I have to extend the Spark Job Monitor time? If so, how do I do that? Any suggestions out there?

16/05/05 10:15:00 INFO log.PerfLogger: <PERFLOG method=SparkSubmitToRunning from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
16/05/05 10:16:01 INFO status.SparkJobMonitor: Job hasn't been submitted after 61s. Aborting it.
16/05/05 10:16:01 ERROR status.SparkJobMonitor: Status: SENT
16/05/05 10:16:01 INFO log.PerfLogger: </PERFLOG method=SparkRunJob start=1462461300288 end=1462461361295 duration=61007 from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
16/05/05 10:16:01 ERROR exec.Task: Failed to execute spark task, with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
java.lang.IllegalStateException: RPC channel is closed.
at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
at org.apache.hive.spark.client.rpc.Rpc.call(Rpc.java:276)
at org.apache.hive.spark.client.rpc.Rpc.call(Rpc.java:259)
at org.apache.hive.spark.client.SparkClientImpl$ClientProtocol.cancel(SparkClientImpl.java:523)
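For anyone hitting the same 60-second abort: in the Hive versions I have checked, the monitor cutoff is governed by the hive.spark.job.monitor.timeout property (default 60s), which matches the "hasn't been submitted after 61s" message above; treat this as an assumption and verify against your release. A sketch for hive-site.xml follows, with an illustrative value; note that if the underlying cause is the closed RPC channel, a longer timeout alone may not fix it:

<!-- hive-site.xml: raise the Spark job-monitor timeout (value is illustrative) -->
<property>
  <name>hive.spark.job.monitor.timeout</name>
  <value>180s</value>
</property>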
04-29-2016
07:40 AM
I have installed a Cloudera 5.7 cluster and changed one single parameter, hive.execution.engine, to Spark. Then I tried to execute an example query, which resulted in the following error:

16/04/29 03:14:52 ERROR status.SparkJobMonitor: Status: SENT
16/04/29 03:14:52 INFO log.PerfLogger: </PERFLOG method=SparkRunJob start=1461917631790 end=1461917692802 duration=61012 from=org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor>
16/04/29 03:14:52 ERROR exec.Task: Failed to execute spark task, with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
java.lang.IllegalStateException: RPC channel is closed.
at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
at org.apache.hive.spark.client.rpc.Rpc.call(Rpc.java:276)
at org.apache.hive.spark.client.rpc.Rpc.call(Rpc.java:259)
at org.apache.hive.spark.client.SparkClientImpl$ClientProtocol.cancel(SparkClientImpl.java:523)
at org.apache.hive.spark.client.SparkClientImpl.cancel(SparkClientImpl.java:187)
at org.apache.hive.spark.client.JobHandleImpl.cancel(JobHandleImpl.java:62)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.cancelJob(RemoteSparkJobRef.java:54)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:119)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1774)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1531)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1311)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1113)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:178)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:72)
at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:232)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:245)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
16/04/29 03:14:52 ERROR exec.Task: Failed to execute spark task, with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
java.lang.IllegalStateException: RPC channel is closed.
at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
at org.apache.hive.spark.client.rpc.Rpc.call(Rpc.java:276)
at org.apache.hive.spark.client.rpc.Rpc.call(Rpc.java:259)
at org.apache.hive.spark.client.SparkClientImpl$ClientProtocol.cancel(SparkClientImpl.java:523)
at org.apache.hive.spark.client.SparkClientImpl.cancel(SparkClientImpl.java:187)
at org.apache.hive.spark.client.JobHandleImpl.cancel(JobHandleImpl.java:62)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.cancelJob(RemoteSparkJobRef.java:54)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:119)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1774)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1531)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1311)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1113)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:178)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:72)
at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:232)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:245)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
16/04/29 03:14:52 ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
16/04/29 03:14:52 INFO log.PerfLogger: </PERFLOG method=Driver.execute start=1461917618369 end=1461917692836 duration=74467 from=org.apache.hadoop.hive.ql.Driver>
16/04/29 03:14:52 INFO ql.Driver: Completed executing command(queryId=hive_20160429031313_ef0fd500-f203-4f36-a1db-49b7b3efaf71); Time taken: 74.467 seconds
16/04/29 03:14:52 INFO log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
16/04/29 03:14:52 INFO log.PerfLogger: </PERFLOG method=releaseLocks start=1461917692838 end=1461917692845 duration=7 from=org.apache.hadoop.hive.ql.Driver>
16/04/29 03:14:52 ERROR operation.Operation: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:374)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:180)
at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:72)
at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:232)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:245)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
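In case it is useful to others who flip only hive.execution.engine: Hive on Spark needs a few companion settings before the remote Spark driver can start; when the driver never comes up, the client's RPC channel closes and the monitor aborts exactly as in the log above. A sketch of the standard properties I would verify first, with illustrative values that are assumptions rather than CDH recommendations:

<!-- hive-site.xml: minimal Hive-on-Spark settings (illustrative values) -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<property>
  <!-- Hive on Spark typically runs against YARN -->
  <name>spark.master</name>
  <value>yarn-cluster</value>
</property>
<property>
  <!-- give the remote driver enough memory to launch -->
  <name>spark.driver.memory</name>
  <value>1g</value>
</property>
<property>
  <name>spark.executor.memory</name>
  <value>1g</value>
</property>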
Labels:
- Apache Hadoop
- Apache Hive
- Apache Spark
- Security
04-28-2016
11:00 AM
"Just need to update the hive-config.xml inside the sample itself, and it works now." Can you please explain exactly what you did? Please assume I am new to this. Thank you.