Member since: 06-19-2018
Posts: 1
Kudos Received: 0
Solutions: 0
06-19-2018 02:00 AM
1 Kudo
You did not mention the CDH version. But I think the problem is that Spark launches many executors to read the table, and those executors are not co-located with the Kudu tablet servers, so the scanned data has to travel over the network. I don't know whether you are only reading and filtering the data or also writing it out to Parquet; how much this matters depends on how the Spark job is executed. I also noticed that running multiple Spark jobs against the same table (each on different partitions) did not help either.
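One rough way to check the co-location guess is to compare the hosts your executors ran on with the hosts running tablet servers. The snippet below is only a sketch: it assumes you have copied both host lists by hand (for example from the Spark UI and the Kudu master web UI), and the function name and host names are made up for illustration.

```python
def locality_report(executor_hosts, tablet_server_hosts):
    """Split executor hosts into those that share a host with a
    tablet server ("local") and those that do not ("remote")."""
    # Normalize names so "Worker1 " and "worker1" compare equal.
    tservers = {h.strip().lower() for h in tablet_server_hosts}
    local, remote = [], []
    for host in executor_hosts:
        if host.strip().lower() in tservers:
            local.append(host)
        else:
            remote.append(host)
    return local, remote


# Hypothetical host names, not taken from any real cluster.
executors = ["worker1.example.com", "worker2.example.com", "worker5.example.com"]
tservers = ["worker1.example.com", "worker3.example.com", "worker4.example.com"]

local, remote = locality_report(executors, tservers)
print(f"{len(local)} of {len(executors)} executors share a host with a tablet server")
print("executors with no tablet server on their host:", remote)
```

Keep in mind this only tells you which executors cannot read locally at all. An executor that does sit on a tablet server host may still end up scanning a tablet stored on another node.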