Member since 05-18-2016 | 2 Posts | 0 Kudos Received | 0 Solutions
06-23-2016 05:09 PM
We had a similar issue in the past when we doubled the size of a cluster by adding new nodes. We were stuck on an older Hadoop version without the balancer optimizations. We solved it by temporarily doubling the file replication factor, which triggered the aggressive under-replication threshold. We did it in multiple steps during off-peak times. After replication completed, we restored the previous replication factors and the cluster was magically balanced 🙂
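A minimal sketch of that procedure, for illustration only: it builds the `hdfs dfs -setrep` commands for each step without running them. The batch paths, the helper names, and the single original replication factor of 3 are assumptions and not details from our cluster; if files under a path have different replication factors, each one has to be restored separately.

```python
def setrep_command(path, replication, wait=True):
    """Build an `hdfs dfs -setrep` command as an argument list."""
    cmd = ["hdfs", "dfs", "-setrep"]
    if wait:
        # -w makes the command block until the target replication is reached
        cmd.append("-w")
    cmd += [str(replication), path]
    return cmd


def rebalance_plan(batches, original_rf=3):
    """For each batch path: double the replication factor, then restore it.

    `batches` are hypothetical HDFS paths, one per off-peak step.
    `original_rf` is assumed to apply to every file under each path.
    """
    plan = []
    for path in batches:
        # Step 1: raise replication and wait for the extra replicas to be
        # written, which is what spreads blocks onto the new nodes
        plan.append(setrep_command(path, original_rf * 2, wait=True))
        # Step 2: restore the original factor; the NameNode then removes
        # the excess replicas
        plan.append(setrep_command(path, original_rf, wait=False))
    return plan


if __name__ == "__main__":
    # Print the commands for review instead of executing them
    for cmd in rebalance_plan(["/data/batch1", "/data/batch2"]):
        print(" ".join(cmd))
```

Printing the plan first makes it easy to review each step and run one batch per off-peak window. Doubling replication temporarily needs enough free disk space for the extra replicas, so check capacity before each step.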
05-18-2016 12:31 PM
Thanks, Timothy, for this great article. Are you aware of any method to go directly from the RDD of KafkaAvroDecoder output objects to a Spark DataFrame or Dataset, specifying an Avro read schema? Stefano