I have a relatively large CSV (~80GB) that I need to transform into multiple JSON documents/records. I'm using a ConvertRecord processor with a CSVReader and an AvroRecordSetWriter, and that's where my CSV gets stuck. What's the best approach: break up the CSV before converting it, or try to get more horsepower on the server?
- Server Mem: 16GB
- Cores: 4
- Maximum Timer Driven Thread Count: 16
- Java Min/Max Heap: 2GB/10GB
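
For the "break up the CSV prior to converting it" option, here is a minimal sketch of what pre-splitting outside the flow could look like. It assumes the file has a single header row and that a standalone Python script is acceptable. The function name `split_csv`, the `chunk_NNNNN.csv` naming, and the default chunk size are hypothetical choices for illustration, not anything from the setup above.

```python
import csv
import os


def split_csv(src_path, out_dir, rows_per_chunk=1_000_000):
    """Stream src_path row by row and write smaller CSV files.

    Each output file repeats the header row, so it can be read on its own.
    Only one row is held in memory at a time.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return chunk_paths  # empty input: nothing to split

        out = None
        writer = None
        rows_in_chunk = 0
        try:
            for row in reader:
                # Start a new chunk file on the first row or when the
                # current chunk has reached its row limit.
                if out is None or rows_in_chunk >= rows_per_chunk:
                    if out is not None:
                        out.close()
                    path = os.path.join(
                        out_dir, f"chunk_{len(chunk_paths):05d}.csv"
                    )
                    out = open(path, "w", newline="")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    chunk_paths.append(path)
                    rows_in_chunk = 0
                writer.writerow(row)
                rows_in_chunk += 1
        finally:
            if out is not None:
                out.close()
    return chunk_paths
```

Splitting by parsed rows rather than raw lines keeps quoted fields that contain newlines intact, at the cost of some extra parsing time on a file this size.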