Member since: 07-24-2017 · 8 Posts · 0 Kudos Received · 0 Solutions
01-04-2018 07:02 PM
Makes sense. Thanks @Eugene Koifman
01-04-2018 05:06 PM
Thanks for the response, Eugene Koifman. Also, is there a way Spark can access those tables in read-only mode, either by specifying a READ-only mode or by setting transactional to false on the Spark side?
12-29-2017 09:48 PM
Hi, I am having the problem below when connecting from a Spark program to Hive tables with transactional = true. When a SELECT * is executed on these tables, only the table metadata (the columns) is returned, but not the records. Is this a limitation in Spark? Is there a way to work around it by putting Spark in a read-only mode on these tables? Any suggestions welcome. Thanks in advance.

Note: The main reason for enabling transactional = true on the Hive tables was that NiFi's PutHiveStreaming processor requires the target table to be ACID-compliant in order to work. So NiFi now puts the data into Hive, but Spark is not able to read it.
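One workaround sometimes suggested for this situation is to read the table through HiveServer2 over JDBC instead of through Spark's native Hive reader: the query then executes inside Hive, which understands the ACID delta files, and Spark only receives the result set. Below is a minimal sketch of building the JDBC options for `spark.read.format("jdbc")`; the host, port, database, and table names are hypothetical placeholders, and the actual read (commented out) requires a running HiveServer2 and the Hive JDBC driver on the Spark classpath.

```python
def hive_jdbc_options(host, port, database, table, user="hive"):
    """Build the option map for reading a Hive table via JDBC.

    Routing the read through HiveServer2 means Hive itself resolves the
    ACID base/delta files, so the records come back even for tables
    created with transactional = true.
    """
    return {
        "url": f"jdbc:hive2://{host}:{port}/{database}",
        "driver": "org.apache.hive.jdbc.HiveDriver",
        "dbtable": table,
        "user": user,
    }

# Usage on a live SparkSession (names are placeholders):
#
# opts = hive_jdbc_options("hs2-host", 10000, "default", "acid_events")
# df = spark.read.format("jdbc").options(**opts).load()
# df.show()
```

This is read-only by construction, which also matches the follow-up question: Spark never touches the transactional files directly, it only issues a query through Hive.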
Labels:
- Apache Hive
- Apache Spark
07-25-2017 09:16 PM
Thanks, Marco. Since we do not want to go back to an RDBMS, could I use Vertica, HBase, or Presto as a structured/columnar store that holds the data after ETL processing in Spark? Would there be performance differences between these? Any suggestions?
07-24-2017 05:35 PM
Hi, I am new to Spark and would like to ask a question about the use case I am working on. The plan is to use Hadoop/Spark as a reporting solution: fetch data from an RDBMS (Oracle) source system, perform ETL, and execute report jobs using Spark SQL. The question is, can Spark be used for interactive report requests as well, for example a user requesting a report from a web application? Would the new Spark Structured Streaming be helpful in my case, or should I load the ETL output into a structured database for interactive reports? Please suggest. Thanks in advance.
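For the interactive piece, a common pattern is to land the ETL output in a queryable table and serve each web request as a Spark SQL query against it (for example through the Spark Thrift Server). As a minimal sketch, here is a hypothetical report-query builder; the `etl_output.sales` table and its columns are assumptions, not anything from a real schema, and real code should validate or whitelist the inputs rather than interpolating them.

```python
def sales_report_sql(region, start_date, end_date):
    """Build a Spark SQL report query for a hypothetical ETL output table.

    Inputs are interpolated directly here only for illustration; a web
    application should validate or parameterize them to avoid injection.
    """
    return (
        "SELECT product, SUM(amount) AS total "
        "FROM etl_output.sales "
        f"WHERE region = '{region}' "
        f"AND sale_date BETWEEN '{start_date}' AND '{end_date}' "
        "GROUP BY product ORDER BY total DESC"
    )

# Usage on a live SparkSession:
# df = spark.sql(sales_report_sql("EMEA", "2017-07-01", "2017-07-31"))
```

Whether this is fast enough for interactive use depends on data size and caching; if sub-second latency is needed, copying the ETL output into a serving database is the usual fallback, which is exactly the trade-off the question raises.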
Labels:
- Apache Spark