Member since: 08-15-2017
Posts: 31
Kudos Received: 29
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1841 | 01-30-2018 07:47 PM
 | 724 | 08-31-2017 02:05 AM
 | 616 | 08-25-2017 05:35 PM
03-29-2018
03:49 PM
I'm trying to read and store messages from a Kafka topic using Spark Structured Streaming. The records read from Kafka are in df. The code below shows zero records; if I replace the format with format("console"), I am able to see the records being printed on the console.

StreamingQuery initDF = df.writeStream()
        .outputMode("append")
        .format("memory")
        .queryName("initDF")
        .trigger(Trigger.ProcessingTime(1000))
        .start();
sparkSession.sql("select * from initDF").show();
initDF.awaitTermination();
Labels:
- Apache Spark
01-30-2018
07:47 PM
@Viswa Add this configuration in your pom.xml under the build tag, inside the maven-assembly-plugin's configuration, rather than adding the jar in spark-submit:

<configuration>
  <descriptorRefs>
    <descriptorRef>jar-with-dependencies</descriptorRef>
  </descriptorRefs>
</configuration>
12-01-2017
06:39 PM
@Sai Sandeep Can you please share the HS2 logs from /var/log/hive/hiveserver2.log?
11-16-2017
02:59 PM
2 Kudos
@Abhay Singh The tutorial is using HDF-2.1.0, which is an older version. You can download it from the archive section on this page: https://hortonworks.com/downloads/#sandbox
11-13-2017
05:53 PM
@Dinesh Chitlangia That helped. Thank you..!!
11-13-2017
05:40 PM
1 Kudo
I am using Spark 1.6. I get the following error when I try to read a JSON file as per the Spark 1.6 documentation:

scala> val colors = sqlContext.read.json("C:/Downloads/colors.json");
colors: org.apache.spark.sql.DataFrame = [_corrupt_record: string]

1. Spark 1.6 otherwise works fine, and I have been able to read other text files.
2. I have attached the JSON file as text (since uploading .json is blocked), and I have also validated the JSON with jsoneditoronline to ensure the file is well formed: colors.txt
Labels:
- Apache Spark
11-03-2017
07:42 PM
1 Kudo
@manikandan ayyasamy
The same scenario is solved here:
https://community.hortonworks.com/questions/134841/hive-convert-all-values-for-a-column-to-a-comma-se.html?childToView=134850#comment-134850
Hope this helps.
10-27-2017
03:52 AM
2 Kudos
I am following this link: https://cwiki.apache.org/confluence/display/AMBARI/Enabling+HDFS+per-user+Metrics However, I am still not able to view the metrics in Grafana. Is there a different set of settings needed for an HA environment?
Labels:
- Apache Hadoop
10-16-2017
10:45 PM
Thank you @Dinesh Chitlangia. This solved the issue.
10-16-2017
09:43 PM
@Dinesh Chitlangia I had verified all the known issues listed in the above URL. The problem still exists.
10-16-2017
09:37 PM
2 Kudos
I just upgraded from HDP-2.6.1 to HDP-2.6.2. Both are kerberized clusters, with doAs=true in Hive. In 2.6.1, the JDBC Hive interpreter was working fine. After upgrading, even simple queries like 'Show Databases' result in an 'Error in doAs'. My configurations are the same in both versions.

org.apache.zeppelin.interpreter.InterpreterException: Error in doAs
  at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:415)
  at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:633)
  at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:733)
  at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:101)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:502)
  at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
  at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.UndeclaredThrowableException
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1884)
  at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnection(JDBCInterpreter.java:407)
  ... 13 more
Caused by: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: java.net.ConnectException: Connection refused
  at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:218)
  at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:156)
  at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
  at java.sql.DriverManager.getConnection(DriverManager.java:664)
  at java.sql.DriverManager.getConnection(DriverManager.java:208)
  at org.apache.commons.dbcp2.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:79)
  at org.apache.commons.dbcp2.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:205)
  at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:861)
  at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:435)
  at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:363)
  at org.apache.commons.dbcp2.PoolingDriver.connect(PoolingDriver.java:129)
  at java.sql.DriverManager.getConnection(DriverManager.java:664)
  at java.sql.DriverManager.getConnection(DriverManager.java:270)
  at org.apache.zeppelin.jdbc.JDBCInterpreter.getConnectionFromPool(JDBCInterpreter.java:362)
  at org.apache.zeppelin.jdbc.JDBCInterpreter.access$000(JDBCInterpreter.java:89)
  at org.apache.zeppelin.jdbc.JDBCInterpreter$1.run(JDBCInterpreter.java:410)
  at org.apache.zeppelin.jdbc.JDBCInterpreter$1.run(JDBCInterpreter.java:407)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
  ... 14 more
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
  at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
  at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
  at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
  at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
  at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssuming
Labels:
- Apache Hive
- Apache Zeppelin
10-16-2017
09:30 PM
@Shu Thank you for the explanation. I wanted to know about Hive queries (HiveQL) that have no reducer phase at all, only a mapper phase. Is there such an example?
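For instance, would a plain filter like the first query below run map-only, while the GROUP BY variant forces a reduce phase? (A sketch on a hypothetical drivers table, just for illustration.)

-- Typically map-only: a row-level filter, no shuffle required.
select driverid, name from drivers where driverid < 15;

-- Typically needs reducers: GROUP BY requires a shuffle/aggregation phase.
select name, count(*) from drivers group by name;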
10-14-2017
05:06 PM
I'm looking for Hive query scenarios where only mappers, or only reducers, are used.
Labels:
- Apache Hadoop
- Apache Hive
09-15-2017
09:50 PM
Thank you @Dinesh Chitlangia !! That helped.
09-15-2017
08:17 PM
3 Kudos
In the query below, I want to force a column value to null by casting it to the required datatype. This works in regular SQL, but Hive throws an exception.

Select * from (select driverid, name, ssn from drivers where driverid < 15
UNION ALL
Select driverid, name, cast(null as bigint) from drivers where driverid BETWEEN 18 AND 21) T;

SemanticException 3:48 Schema of both sides of union should match. T-subquery2 does not have the field ssn. Error encountered near token 'drivers'.
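One variant I intend to try, in case Hive matches the two sides of the union by column name as well as position, is aliasing the cast column to ssn (a sketch, not yet verified on my setup):

Select * from (
  select driverid, name, ssn from drivers where driverid < 15
  UNION ALL
  select driverid, name, cast(null as bigint) as ssn from drivers where driverid BETWEEN 18 AND 21
) T;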
Labels:
- Apache Hive
08-31-2017
02:05 AM
3 Kudos
Just verified: HDP 2.6.1 ships Hadoop/HDFS 2.7.3 (https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_release-notes/content/comp_versions.html). The issue you have reported is a bug that has been fixed in HDFS 2.8.0, as per this Apache JIRA: https://issues.apache.org/jira/browse/HDFS-8805
08-28-2017
07:49 PM
Thank you @Nandish B Naidu..!! The solution worked.
Tags:
- thank you
- that worked
08-27-2017
08:05 PM
Thank you @Sonu Sahi for the solution.
08-27-2017
08:04 PM
Thank you @Sindhu for the elaborate explanation and the solution..!!
08-25-2017
05:35 PM
3 Kudos
Ambari is designed so that, during the upgrade, it backs up all paths listed under dfs.namenode.name.dir. Thus, there is currently no feature that lets you select a particular file to back up instead of all of them. This could be a new feature!
08-24-2017
10:12 PM
1 Kudo
Thank you..!! It worked.
08-24-2017
09:42 PM
1 Kudo
My dataset is as below:

700,Angus (1995),Comedy
702,Faces (1968),Drama
703,Boys (1996),Drama
704,"Quest, The (1996)",Action|Adventure
705,Cosi (1996),Comedy
747,"Stupids, The (1996)",Comedy

Create table movies(
  movieid int,
  title String,
  genre string)
row format delimited
fields terminated by ','
;

Select title, genre from movies;

Since the rows have comma-separated values, a record like 704,"Quest, The (1996)",Action|Adventure returns "Quest as the title instead of Quest, The (1996), and the genre value is shown as The (1996)" instead of Action|Adventure. How can such data be loaded correctly, escaping the delimiter inside the value?
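One approach I am considering (a sketch, assuming Hive's OpenCSVSerde is available in my version) is to declare the table with the CSV serde so that quoted fields are honored:

Create table movies(
  movieid int,
  title String,
  genre string)
row format serde 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
with serdeproperties (
  "separatorChar" = ",",
  "quoteChar" = "\"")
stored as textfile;

Note that OpenCSVSerde reads every column as a string, so movieid may need a cast when queried.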
Labels:
- Apache Hive
08-24-2017
08:28 PM
4 Kudos
Select name, city from people;

The above query returns:

jon Atlanta
jon Newyork
snow LA
snow DC

But I want the result as a single row per name, as follows:

jon Atlanta,Newyork
snow LA,DC
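From what I have read, collect_set with concat_ws may do this, though I am not sure it is the right approach (a sketch; collect_list would instead keep duplicates):

Select name, concat_ws(',', collect_set(city)) as cities
from people
group by name;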
Labels:
- Apache Hive
08-16-2017
05:12 PM
@Andres Urrego Thank you! Yes. But when we use 'import' instead of 'import-all-tables', we can specify where to store the table with --target-dir, instead of the default HDFS path. With the command below, the data is imported to the specified target directory:

sqoop import --connect jdbc:mysql://localhost/my_db --table EMP --username root --target-dir 'sqoopdata/emp' -m 1
08-15-2017
11:23 PM
4 Kudos
While trying to use import-all-tables, if I specify --target-dir I get the below error:

$ sqoop import-all-tables --connect jdbc:mysql://localhost/db --username root --target-dir 'alltables/data' -m 1
17/08/14 14:08:07 ERROR tool.BaseSqoopTool: Error parsing arguments for import-all-tables:
17/08/14 14:08:07 ERROR tool.BaseSqoopTool: Unrecognized argument: --target-dir
17/08/14 14:08:07 ERROR tool.BaseSqoopTool: Unrecognized argument: alltables/data
17/08/14 14:08:07 ERROR tool.BaseSqoopTool: Unrecognized argument: -m
17/08/14 14:08:07 ERROR tool.BaseSqoopTool: Unrecognized argument: 1

Appreciate any help!
Labels:
- Apache Sqoop
08-15-2017
11:23 PM
2 Kudos
I posted my first question and I'm getting the following message: "This post is currently awaiting moderation. If you believe this to be in error, contact a system administrator." How do I contact a system administrator? Appreciate any help!