Member since: 01-27-2016
Posts: 48
Kudos Received: 9
Solutions: 0
10-07-2016
08:10 AM
I am working on an AWS dataset (the Enron email dataset). I want to do a word count on all of the emails and find the average. The files are zipped (please see the attached screenshot, screen-shot-2016-10-07-at-090457.png, which shows what the actual dataset looks like). Could someone look at the screenshot and help me with how to do the word count processing using Spark (Scala preferably)? I would really appreciate it. Note: The actual data size is 210 GB. I am planning to run an EMR cluster and then perform the processing.
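One possible shape for the per-archive step, sketched in plain Python so it can be tried locally before moving to a cluster. The function names `count_words_in_zip` and `average_words` are hypothetical, and the sketch assumes that each zip holds one plain-text file per email, that "average" means words per email, and that a single archive fits in memory. None of these are confirmed by the post.

```python
import io
import zipfile


def count_words_in_zip(zip_bytes):
    """Return (email_count, word_count) for one zip archive passed as bytes.

    Assumption: every non-directory entry in the archive is one email
    stored as text; words are whitespace-separated tokens.
    """
    emails = 0
    words = 0
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip folder entries inside the archive
            text = archive.read(info).decode("utf-8", errors="replace")
            emails += 1
            words += len(text.split())
    return emails, words


def average_words(pairs):
    """Combine per-archive (email_count, word_count) pairs into one average."""
    pairs = list(pairs)
    total_emails = sum(e for e, _ in pairs)
    total_words = sum(w for _, w in pairs)
    return total_words / total_emails if total_emails else 0.0
```

On Spark, the same function could probably be applied to the (path, bytes) pairs returned by `sc.binaryFiles`, with the per-archive pairs summed in a reduce step. A Scala version would follow the same structure with `java.util.zip.ZipInputStream`.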
Labels: Apache Spark
09-06-2016
02:30 PM
@bpreachuk and @Constantin Stanca Thanks a lot for resolving this.
09-06-2016
08:21 AM
@Constantin Stanca: Thanks a lot for the help, but it is still giving a semantic exception: Invalid table alias or column reference 's': possible column names are emp_no, salary, from_date, to_date
09-02-2016
09:55 PM
@Daniel Kozlowski: Please see the attached screen grab, screen-shot-2016-09-02-at-225412.png
09-02-2016
09:53 PM
@Daniel Kozlowski: It still gives an exception: "Expression not in Group By key emp_no"
09-02-2016
09:34 PM
> Not sure whether I can use a select min(from_date) clause like this:
create table employees2_final stored as ORC tblproperties ('orc.compress'='SNAPPY')
AS
SELECT
e.emp_no, e.birth_date, e.first_name, e.last_name, e.gender, select min(s.from_date) from new2_salaries s
GROUP BY s.emp_no
WHERE s.emp_no = e.emp_no as hire_date_new from employees e.
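One common way to get a per-employee minimum date is to aggregate the salaries table in a subquery and join it to the employees table, so that no unaggregated column needs to appear in a GROUP BY. The sketch below checks that pattern with Python's built-in `sqlite3` module standing in for Hive. The table and column names come from this thread, the sample rows are made up, and the `create table ... stored as ORC` clause is left out because SQLite does not support it. This is not necessarily the fix that resolved the thread.

```python
import sqlite3

# Join employees to a grouped subquery that yields one row per emp_no.
QUERY = """
SELECT e.emp_no, e.birth_date, e.first_name, e.last_name, e.gender,
       s.hire_date_new
FROM employees e
JOIN (
    SELECT emp_no, MIN(from_date) AS hire_date_new
    FROM new2_salaries
    GROUP BY emp_no
) s ON s.emp_no = e.emp_no
ORDER BY e.emp_no
"""


def run_demo():
    """Build tiny in-memory tables (hypothetical rows) and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (
            emp_no INTEGER, birth_date TEXT, first_name TEXT,
            last_name TEXT, gender TEXT);
        CREATE TABLE new2_salaries (
            emp_no INTEGER, salary INTEGER, from_date TEXT, to_date TEXT);
        INSERT INTO employees VALUES
            (1, '1960-01-01', 'Ann', 'Lee', 'F'),
            (2, '1970-05-05', 'Bob', 'Ray', 'M');
        INSERT INTO new2_salaries VALUES
            (1, 100, '1990-03-01', '1991-03-01'),
            (1, 110, '1989-02-01', '1990-03-01'),
            (2, 90, '2000-07-15', '2001-07-15');
    """)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows
```

In Hive, the SELECT statement would follow the `create table employees2_final ... AS` clause from the post. Hive accepts subqueries in the FROM clause, which is why the aggregate is placed there and not in the select list.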
Labels: Apache Hive
04-27-2016
10:42 AM
It worked. Thanks.
04-27-2016
08:21 AM
@Alessio Ubaldi: Thanks, the query has started, but it's still failing. I guess I am still missing something. See screen-shot-2016-04-27-at-092033.png
04-27-2016
07:29 AM
Please find attached the screen grab of the error: screen-shot-2016-04-27-at-082448.png
Sqoop query: sqoop import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://sandbox.hortonworks.com/test --username root -P --m 1 --query "select empid, ename from emp WHERE '$CONDITIONS' AND eid > '200' " --target-dir /user/maria_dev
Labels: Apache Sqoop