05-11-2017
04:14 PM
Ran into the same problem, resolved by enabling 'Hive Service' in Spark2.
10-12-2016
06:00 PM
Great, I'm glad the UDF worked. As for the NumPy issue, I'm not familiar enough with using NumPy within Spark to offer any insight, but the workaround seems trivial enough. If you are looking for a more elegant solution, you may want to start a new thread and include the error message. You may also want to take a look at Spark's MLlib statistics functions [1], though they operate across rows rather than on the values within a single column.

[1] http://spark.apache.org/docs/latest/mllib-statistics.html
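To make that distinction concrete, here is a minimal plain-Python sketch. It does not use Spark, and the data and function names are made up for illustration. Column-wise summaries (the kind MLlib's `Statistics.colStats` produces) aggregate each position across all rows, whereas a per-row UDF summarizes the values stored inside one row.

```python
from statistics import mean

# Hypothetical data: each row holds a list of numeric values,
# similar to an array-typed column in a DataFrame.
rows = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

def column_means(data):
    """Aggregate across rows: one mean per position (column-wise)."""
    return [mean(col) for col in zip(*data)]

def row_means(data):
    """Aggregate within each row: one mean per row, as a per-row UDF would."""
    return [mean(row) for row in data]

across_rows = column_means(rows)  # one value per column
within_rows = row_means(rows)     # one value per row
```

If what you need is the second kind of result, a UDF is likely still the simpler route.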