Member since: 08-15-2017
Posts: 31
Kudos Received: 28
Solutions: 3
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 896 | 01-30-2018 07:47 PM |
 | 288 | 08-31-2017 02:05 AM |
 | 223 | 08-25-2017 05:35 PM |
11-19-2018
07:08 PM
I have a streaming job that I run with sbt. Whenever I do "sbt run", I see the error below. It appears the workers are not able to get the required Kafka dependency.

build.sbt:

name := "MyAPP"
version := "0.5"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "2.3.1",
"org.apache.spark" %% "spark-sql" % "2.3.1",
"org.apache.spark" %% "spark-streaming" % "2.3.1",
"org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.3.1",
"org.apache.spark" %% "spark-sql-kafka-0-10" % "2.3.1",
"com.typesafe" % "config" % "1.3.2",
"org.apache.logging.log4j" % "log4j-api" % "2.11.0",
"org.apache.logging.log4j" % "log4j-core" % "2.11.0",
"org.apache.logging.log4j" %% "log4j-api-scala" % "11.0",
"org.scalatest" %% "scalatest" % "3.0.5" % "test",
"org.apache.kafka" % "kafka_2.11" % "0.10.2.2",
"org.apache.kafka" % "kafka-clients" % "0.10.2.2",
"ml.combust.mleap" %% "mleap-runtime" % "0.11.0",
"com.typesafe.play" % "play-json_2.11" % "2.6.10",
"com.fasterxml.jackson.module" % "jackson-module-scala_2.11" % "2.8.11",
"net.liftweb" %% "lift-json" % "3.3.0"
)
lazy val excludeJpountz = ExclusionRule(organization = "net.jpountz.lz4", name = "lz4")
lazy val kafkaClients = "org.apache.kafka" % "kafka-clients" % "0.10.2.2" excludeAll(excludeJpountz)
logBuffered in Test := false
fork in Test := true
// Don't run tests before assembling
test in assembly := {}
retrieveManaged := true
assemblyMergeStrategy in assembly := {
case "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister" => MergeStrategy.concat
case PathList("META-INF", xs@_*) => MergeStrategy.discard
case "log4j.properties" => MergeStrategy.discard
case x => MergeStrategy.first
}
unmanagedBase := baseDirectory.value / "lib"

Is there a way of passing the dependency jars along with the "sbt run" command?

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 13.0 failed 4 times, most recent failure: Lost task 1.3 in stage 13.0 (TID 227, 10.148.9.12, executor 1): java.lang.ClassNotFoundException: org.apache.spark.sql.kafka010.KafkaContinuousDataReaderFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
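One common approach (a sketch, not the poster's confirmed setup) is to stop distributing dependencies through "sbt run": mark the Spark runtime artifacts as "provided", keep spark-sql-kafka-0-10 in the assembly jar, build it with "sbt assembly", and launch it with spark-submit so the executors receive the org.apache.spark.sql.kafka010 classes. In build.sbt that would look roughly like:

// A hedged build.sbt sketch (not the poster's exact build): Spark runtime
// artifacts are marked "provided" so spark-submit supplies them, while
// spark-sql-kafka-0-10 stays in the assembly jar so executors can load the
// org.apache.spark.sql.kafka010 classes.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"           % "2.3.1" % "provided",
  "org.apache.spark" %% "spark-sql"            % "2.3.1" % "provided",
  "org.apache.spark" %% "spark-streaming"      % "2.3.1" % "provided",
  "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.3.1"   // bundled in the fat jar
)

Alternatively, spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1 pulls the connector onto the driver and executors without bundling it.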
04-04-2018
03:03 PM
The Spark consumer has to read topics with the same name from different bootstrap servers, so I need to create two JavaDStreams, perform a union, process the stream, and commit the offsets.

JavaInputDStream<ConsumerRecord<String, GenericRecord>> dStream = KafkaUtils.createDirectStream(...);

The problem is that JavaInputDStream doesn't support dStream.union(stream2). If I use

JavaDStream<ConsumerRecord<String, GenericRecord>> dStream = KafkaUtils.createDirectStream(...);

then JavaDStream doesn't support ...
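For reference, a minimal Scala sketch of the two-cluster pattern (the question uses the Java API); the broker addresses, topic name, and group id below are illustrative assumptions:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object DualClusterConsumer {
  // Kafka settings, parameterized by the bootstrap server of each cluster.
  def kafkaParams(bootstrap: String): Map[String, Object] = Map(
    "bootstrap.servers"  -> bootstrap,
    "key.deserializer"   -> classOf[StringDeserializer],
    "value.deserializer" -> classOf[StringDeserializer],
    "group.id"           -> "dual-cluster-group",        // assumed group id
    "auto.offset.reset"  -> "latest",
    "enable.auto.commit" -> (false: java.lang.Boolean)
  )

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("dual-cluster-consumer").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(10))

    val topics = Seq("my_topic")   // same topic name on both clusters
    val streamA = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](topics, kafkaParams("brokerA:9092")))
    val streamB = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](topics, kafkaParams("brokerB:9092")))

    // union() comes from the underlying DStream type; offsets still have to be
    // committed per source stream through its own CanCommitOffsets handle.
    val combined = streamA.union(streamB)
    combined.map(record => record.value).print()

    ssc.start()
    ssc.awaitTermination()
  }
}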
03-29-2018
03:50 PM
I'm trying to read and store messages from a Kafka topic using Spark Structured Streaming. The records read are in df. The code below shows zero records. If I replace the format with format("console"), I'm able to see the records being printed on the console.

StreamingQuery initDF = df.writeStream()
.outputMode("append")
.format("memory")
.queryName("initDF")
.trigger(Trigger.ProcessingTime(1000))
.start();
sparkSession.sql("select * from initDF").show();
initDF.awaitTermination();
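For comparison, a self-contained Scala sketch of the same memory-sink pattern (the question's code is Java); the rate source here just stands in for the Kafka stream. The point it illustrates is that the in-memory table is only populated once a micro-batch has completed, so it is queried after processAllAvailable() rather than immediately after start():

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

// Assumed local session; `df` stands in for the streaming DataFrame read from Kafka.
val spark = SparkSession.builder().appName("memory-sink-sketch").master("local[2]").getOrCreate()
val df = spark.readStream.format("rate").load()

val query = df.writeStream
  .outputMode("append")
  .format("memory")
  .queryName("initDF")                          // name of the in-memory table
  .trigger(Trigger.ProcessingTime("1 second"))
  .start()

// Query the table only after at least one micro-batch has completed; calling
// show() immediately after start() typically returns zero rows.
query.processAllAvailable()
spark.sql("select * from initDF").show()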
02-21-2018
08:01 PM
I'm reading data from Kafka topics using Spark Streaming. The JavaDStream is converted to RDDs and a custom method is called on each RDD, but the method myProcessor.executeValidation() is never called/executed.
inputData is a JavaDStream object and is not null.
if (inputData != null) {
inputData.foreachRDD(rdd -> myProcessor.executeValidation(rdd));
}
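For reference, a self-contained Scala sketch of the foreachRDD pattern (the question uses the Java API); the socket source and the executeValidation body are placeholders. Note that nothing inside foreachRDD runs until the StreamingContext has been started:

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ValidationJob {
  // Stand-in for the poster's custom validation method.
  def executeValidation(rdd: RDD[String]): Unit =
    rdd.foreach(record => println(s"validating: $record"))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("validation-stream").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(5))

    // socketTextStream stands in for the Kafka stream from the question.
    val inputData = ssc.socketTextStream("localhost", 9999)

    inputData.foreachRDD(rdd => executeValidation(rdd))

    ssc.start()             // foreachRDD closures only execute after start()
    ssc.awaitTermination()
  }
}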
01-30-2018
07:47 PM
@Viswa Add this configuration to your pom.xml under the build tag, rather than adding the jar in spark-submit.

<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
12-01-2017
06:39 PM
@Sai Sandeep Can you please share the HS2 logs from /var/log/hive/hiveserver2.log?
11-16-2017
02:59 PM
2 Kudos
@Abhay Singh The tutorial uses HDF-2.1.0, which is an older version. You can download it from the archive section on this page: https://hortonworks.com/downloads/#sandbox
11-13-2017
05:53 PM
@Dinesh Chitlangia That helped. Thank you..!!
11-13-2017
05:40 PM
1 Kudo
I am using Spark 1.6. I get the following result when I try to read a JSON file as per the Spark 1.6 documentation:

scala> val colors = sqlContext.read.json("C:/Downloads/colors.json");
colors: org.apache.spark.sql.DataFrame = [_corrupt_record: string]

1. Spark 1.6 itself works fine and I have been able to read other text files.
2. I have attached the JSON file as text (since uploading .json is blocked), and I have validated the JSON with jsoneditoronline to make sure it is well formed: colors.txt
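I cannot see the attached file, but one possible explanation (an assumption, not a confirmed diagnosis) is that Spark 1.6's JSON reader expects one complete JSON object per line, so a pretty-printed multi-line file comes back as a single _corrupt_record column. A sketch with a hypothetical line-delimited file and made-up field names:

// Hypothetical JSON Lines file, one record per line:
// {"color":"red","value":"#f00"}
// {"color":"green","value":"#0f0"}
val colors = sqlContext.read.json("C:/Downloads/colors_lines.json")
colors.printSchema()   // shows color and value instead of _corrupt_record
colors.show()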
11-07-2017
05:27 PM
In the HDPCD practice exam instance available on AWS, specifying --target-dir is supposed to be optional when using --hive-import in Sqoop, but this instance doesn't run the Sqoop query unless --target-dir is specified.

sqoop import --connect jdbc:mysql://namenode/flightinfo --query "SELECT * from weather WHERE precipitation!=0 AND \$CONDITIONS" --hive-import -m 1 --username root -P
Warning: /usr/hdp/2.2.0.0-2041/sqoop/sqoop/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
17/11/05 00:39:34 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5.2.2.0.0-2041
Enter password:
Must specify destination with --target-dir.
Try --help for usage instructions.
11-03-2017
07:42 PM
1 Kudo
@manikandan ayyasamy
The same scenario is solved here:
https://community.hortonworks.com/questions/134841/hive-convert-all-values-for-a-column-to-a-comma-se.html?childToView=134850#comment-134850
Hope this helps.
10-27-2017
03:52 AM
2 Kudos
I am following this link: https://cwiki.apache.org/confluence/display/AMBARI/Enabling+HDFS+per-user+Metrics However, I am still not able to view the metrics in Grafana. Is there a different set of settings we need to apply for an HA environment?
10-16-2017
10:45 PM
Thank you @Dinesh Chitlangia. This solved the issue.
10-16-2017
09:43 PM
@Dinesh Chitlangia I have verified all of the known issues listed in the above URL. The problem still exists.
10-16-2017
09:37 PM
2 Kudos
I just upgraded from HDP-2.6.1 to HDP-2.6.2. Both are kerberized clusters with doAs=True in Hive. In 2.6.1, the JDBC Hive interpreter was working fine. After upgrading, even simple queries like 'Show Databases' result in an "Error in doAs". My configurations are the same in both versions.

org.apache.zeppelin.interpreter.InterpreterException: Error in doAs
 at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:415)
 at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:633)
 at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:733)
 at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:101)
 at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:502)
 at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
 at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.UndeclaredThrowableException
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1884)
 at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:407)
 ... 13 more
Caused by: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: java.net.ConnectException: Connection refused
 at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:218)
 at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:156)
 at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
 at java.sql.DriverManager.getConnection(DriverManager.java:664)
 at java.sql.DriverManager.getConnection(DriverManager.java:208)
 at org.apache.commons.dbcp2.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:79)
 at org.apache.commons.dbcp2.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:205)
 at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
 at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
 at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
 at org.apache.commons.dbcp2.PoolingDriver.connect(PoolingDriver.java:129)
 at java.sql.DriverManager.getConnection(DriverManager.java:664)
 at java.sql.DriverManager.getConnection(DriverManager.java:270)
 at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnectionFromPool(JDBCInterpreter.java:362)
 at org.apache.zeppelin.jdbc.JDBCInterpreter.access$000(JDBCInterpreter.java:89)
 at org.apache.zeppelin.jdbc.JDBCInterpreter$1.run(JDBCInterpreter.java:410)
 at org.apache.zeppelin.jdbc.JDBCInterpreter$1.run(JDBCInterpreter.java:407)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
 ... 14 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
 at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
 at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
 at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
 at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
 at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssuming
10-16-2017
09:30 PM
@Shu Thank you for the explanation. I wanted to know about Hive queries (Hive SQL) where there is no reducer phase at all, only a mapper phase. Is there such an example?
10-14-2017
05:06 PM
I'm looking for Hive query scenarios that use only mappers or only reducers.
09-15-2017
09:50 PM
Thank you @Dinesh Chitlangia !! That helped.
09-15-2017
08:17 PM
3 Kudos
In the query below, I want to force a column value to NULL by casting it to the required datatype. This works in standard SQL, but Hive throws the exception shown.

Select * from (select driverid, name, ssn from drivers where driverid<15
UNION ALL
Select driverid,name, cast(null as bigint) from drivers where driverid BETWEEN 18 AND 21) T
;
SemanticException 3:48 Schema of both sides of union should match. T-subquery2 does not have the field ssn. Error encountered near token 'drivers'.
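One common way to satisfy Hive's requirement that both sides of the union expose the same column names is to alias the cast so the second branch also has an ssn field. A hedged sketch, wrapped in spark.sql only to keep the examples on this page in one language (the statement itself is plain HiveQL):

import org.apache.spark.sql.SparkSession

// Assumed Hive-enabled session so the query runs against the Hive metastore.
val spark = SparkSession.builder().appName("union-alias-sketch").enableHiveSupport().getOrCreate()

// Aliasing CAST(NULL AS BIGINT) as ssn gives both branches the schema (driverid, name, ssn).
val unioned = spark.sql(
  """SELECT * FROM (
    |  SELECT driverid, name, ssn FROM drivers WHERE driverid < 15
    |  UNION ALL
    |  SELECT driverid, name, CAST(NULL AS BIGINT) AS ssn FROM drivers WHERE driverid BETWEEN 18 AND 21
    |) t""".stripMargin)
unioned.show()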
08-31-2017
02:05 AM
3 Kudos
Just verified: HDP 2.6.1 has Hadoop/HDFS 2.7.3 (https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_release-notes/content/comp_versions.html). The issue you have reported is a bug that has been fixed in HDFS 2.8.0, per this Apache JIRA: https://issues.apache.org/jira/browse/HDFS-8805
08-28-2017
07:49 PM
Thank you @Nandish B Naidu..!! The solution worked.
08-27-2017
08:05 PM
Thank You @Sonu Sahi for the solution.
08-27-2017
08:04 PM
Thank you @Sindhu for the elaborate explanation and the solution..!!
08-25-2017
05:35 PM
3 Kudos
The way Ambari has been designed, during the upgrade it will back up all paths listed under dfs.namenode.name.dir. Thus, there is currently no feature that lets you select a particular file to back up instead of all of them. This could be a new feature!
08-24-2017
10:12 PM
Thank you..!! It worked.
08-24-2017
09:42 PM
1 Kudo
My dataset is as below:

700,Angus (1995),Comedy
702,Faces (1968),Drama
703,Boys (1996),Drama
704,"Quest, The (1996)",Action|Adventure
705,Cosi (1996),Comedy
747,"Stupids, The (1996)",Comedy

Create table movies(
movieid int,
title String,
genre string)
row format delimited
fields terminated by ','
;
Select title , genre from movies;
Since the rows contain comma-separated values, a record like 704,"Quest, The (1996)",Action|Adventure returns Quest as the title instead of Quest, The (1996), and the genre value comes back as The (1996) instead of Action|Adventure. How can I load such data correctly by escaping the delimiter inside the value?
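One approach worth considering (a hedged sketch, not the only option) is Hive's OpenCSVSerde, which honors double quotes around fields that contain the delimiter. It is shown here through spark.sql only to keep the examples on this page in one language; the DDL itself is plain HiveQL. Note that OpenCSVSerde exposes every column as STRING, so numeric columns need a cast when queried, and the table name movies_csv is illustrative:

import org.apache.spark.sql.SparkSession

// Assumed Hive-enabled session so the DDL is executed against the Hive metastore.
val spark = SparkSession.builder().appName("csv-serde-sketch").enableHiveSupport().getOrCreate()

spark.sql(
  """CREATE TABLE movies_csv (
    |  movieid STRING,
    |  title   STRING,
    |  genre   STRING)
    |ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    |WITH SERDEPROPERTIES ('separatorChar' = ',', 'quoteChar' = '"')
    |STORED AS TEXTFILE""".stripMargin)

// "Quest, The (1996)" now stays in the title column and Action|Adventure in genre.
spark.sql("SELECT title, genre FROM movies_csv").show(false)

The file can then be loaded with the usual LOAD DATA INPATH statement.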
08-24-2017
08:28 PM
4 Kudos
Select name, city from people;

The above query returns:

jon Atlanta
jon Newyork
snow LA
snow DC

But I want the result as a single row per name, as follows:

jon Atlanta,Newyork
snow LA,DC
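A hedged sketch of one way to do this with Hive's collect_list and concat_ws (collect_set would additionally remove duplicate cities), wrapped in spark.sql only to keep the examples on this page in one language:

import org.apache.spark.sql.SparkSession

// Assumed Hive-enabled session so the query runs against the existing people table.
val spark = SparkSession.builder().appName("group-concat-sketch").enableHiveSupport().getOrCreate()

// Fold all city values for each name into a single comma-separated string.
val grouped = spark.sql(
  """SELECT name, concat_ws(',', collect_list(city)) AS cities
    |FROM people
    |GROUP BY name""".stripMargin)
grouped.show(false)
// Expected shape: jon -> Atlanta,Newyork ; snow -> LA,DC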
08-16-2017
05:12 PM
@Andres Urrego Thank you..!! Yes. But when we use 'import' instead of 'import-all-tables', we can specify where to store the table with --target-dir instead of using the default HDFS path. With the command below, the data is imported into the specified target directory.

sqoop import --connect jdbc:mysql://localhost/my_db --table EMP --username root --target-dir 'sqoopdata/emp' -m 1
08-15-2017
11:23 PM
4 Kudos
While trying to use import-all-tables, if I specify --target-dir I get the error below.

$ sqoop import-all-tables --connect jdbc:mysql://localhost/db --username root --target-dir 'alltables/data' -m 1
17/08/14 14:08:07 ERROR tool.BaseSqoopTool: Error parsing arguments for import-all-tables:
17/08/14 14:08:07 ERROR tool.BaseSqoopTool: Unrecognized argument: --target-dir
17/08/14 14:08:07 ERROR tool.BaseSqoopTool: Unrecognized argument: alltables/data
17/08/14 14:08:07 ERROR tool.BaseSqoopTool: Unrecognized argument: -m
17/08/14 14:08:07 ERROR tool.BaseSqoopTool: Unrecognized argument: 1

Appreciate any help!