Member since
08-01-2017
19
Posts
0
Kudos Received
0
Solutions
07-27-2018
04:35 PM
I want a Spark API that can write data to an Excel file rather than a CSV file. What is the best way to write data directly to an Excel file?
Labels:
Apache Spark
08-01-2017
06:35 AM
Can somebody tell me what a real-world cluster configuration looks like? I have set up Hortonworks on my home system, but it is standalone. In a real project, what would the cluster configuration be: how many nodes, cluster memory and RAM, node memory and RAM, cluster backup, and so on? And when submitting a Spark job to YARN, how do we decide on the number of executors, their memory, and the related properties?
Labels:
Apache YARN
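For the executor question above, one widely used rule of thumb can be written out as plain arithmetic. This is a minimal sketch, not an official formula: the function name `size_executors`, the reservation of one core and 1 GB of RAM per node for the OS and Hadoop daemons, the default of 5 cores per executor, the one executor slot left for the YARN ApplicationMaster, and the 10% memory overhead are all assumptions that should be tuned for the actual cluster and workload.

```python
import math


def size_executors(nodes, cores_per_node, ram_gb_per_node,
                   cores_per_executor=5, overhead_fraction=0.10):
    """Rule-of-thumb Spark-on-YARN sizing (assumed heuristic, not a Spark API)."""
    # Assumption: leave 1 core and 1 GB per node for the OS and Hadoop daemons.
    usable_cores = cores_per_node - 1
    usable_ram_gb = ram_gb_per_node - 1

    executors_per_node = usable_cores // cores_per_executor
    if executors_per_node < 1:
        raise ValueError("not enough cores per node for one executor")

    # Assumption: keep one executor slot free for the YARN ApplicationMaster.
    num_executors = nodes * executors_per_node - 1
    if num_executors < 1:
        raise ValueError("cluster too small for this heuristic")

    # Split node RAM across executors, then subtract the off-heap overhead share.
    ram_per_executor_gb = usable_ram_gb / executors_per_node
    executor_memory_gb = math.floor(ram_per_executor_gb * (1 - overhead_fraction))

    return {
        "num_executors": num_executors,
        "executor_cores": cores_per_executor,
        "executor_memory_gb": executor_memory_gb,
    }


# Hypothetical cluster: 4 worker nodes, 16 cores and 64 GB RAM each.
print(size_executors(4, 16, 64))
```

For this hypothetical 4-node cluster the sketch yields 11 executors with 5 cores and 18 GB each, which would map onto the `spark-submit` options `--num-executors 11 --executor-cores 5 --executor-memory 18G`.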
07-10-2017
02:21 PM
First of all, thanks, Geoffrey, for your quick response; I hope I have addressed you by name correctly. Suppose I have one CSV file that I want to process with Spark: I submit the job to YARN and need the data loaded into Hive tables. In this case, where would I write my Spark code (I will write it in Eclipse, but on which machine?), how would I submit it to YARN, and how would I access my Hive tables? Would all the components be distributed, or would Spark and Hive be on the same node? If they are on the same node, why do we need the other 3 data nodes if one edge node can do everything?
07-10-2017
07:06 AM
Hello, I have trained myself on Hadoop and know how to work with MapReduce, Pig, Hive, Spark, Scala, Sqoop, and so on. However, I have only worked with these components on my personal system, in a single-node architecture. Now I need to know how a real-world live project works and how a multi-node setup works. If I am trying to process one CSV file, how do I access Spark, Hive, and the other components that are installed on different nodes? I need detailed documentation, or any article that anyone is aware of, that shows the complete steps and process for accessing the different components. I feel helpless, as nobody in my group or among my connections works on a real-world Hadoop ecosystem.
06-30-2017
05:07 PM
Thanks, it helped a lot to clear up my confusion.
06-29-2017
01:39 PM
I have gone through the URL below to understand how to load data into Hive using Spark in ORC format. I understood how to create a table in Hive using Spark; however, I have one question: how does Spark identify which database the table should be created in? And if the same table name exists in two different Hive databases, which table will Spark insert values into? The URL I went through: https://hortonworks.com/tutorial/using-hive-with-orc-from-apache-spark/
Labels:
Apache Hive
Apache Spark
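On the database question above: an unqualified table name is resolved against the session's current database (`default` unless changed, for example with a `USE` statement), so the usual way to remove the ambiguity is to qualify the name as `database.table`. The following is a minimal sketch of that idea and is not taken from the linked tutorial; the helper names `qualified_table` and `save_as_orc`, and the example names `sales_db` and `orders`, are hypothetical. The `df` argument is assumed to be a Spark DataFrame created from a Hive-enabled session.

```python
def qualified_table(database, table):
    """Build an explicit 'database.table' name (hypothetical helper)."""
    for part in (database, table):
        if not part or "." in part:
            raise ValueError(f"invalid identifier: {part!r}")
    return f"{database}.{table}"


def save_as_orc(df, database, table, mode="append"):
    """Write a Spark DataFrame to a specific Hive database in ORC format.

    Assumes `df` is a pyspark.sql.DataFrame from a Hive-enabled session.
    """
    target = qualified_table(database, table)
    # The qualified name tells Spark exactly which database to write to,
    # regardless of which database is current for the session.
    df.write.format("orc").mode(mode).saveAsTable(target)
    return target


# Example names are placeholders for illustration only.
print(qualified_table("sales_db", "orders"))
```

With a qualified name, two tables called `orders` in different databases are never confused, because the database is stated in every write rather than inferred from the session.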