Find Fields in Noise with Spark

Master Guru

We have 100 different spreadsheets in CSV format, each with 20 fields. The fields are roughly standard, but the naming varies: some people use First Name, some use Name or firstname, and some use a single name field. The values vary too: some use M and F for gender, others use 0 and 1.

We want to convert all of these CSV variants into one gold standard with standard field names, types, and ranges.
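
To make the goal concrete, here is a minimal sketch of the kind of mapping described above, written in plain Python rather than Spark so it runs anywhere. The alias table, the canonical names (`first_name`, `gender`), and especially the 0/1 gender coding are assumptions for illustration; the real spreadsheets would need their own tables.

```python
import csv
import io
import re

# Hypothetical alias table: canonical field name -> header spellings to accept.
FIELD_ALIASES = {
    "first_name": {"first name", "firstname", "name"},
    "gender": {"gender", "sex"},
}

# Hypothetical value table. ASSUMPTION: 0 means male and 1 means female;
# each source file would have to confirm its own coding.
GENDER_VALUES = {"m": "M", "male": "M", "0": "M",
                 "f": "F", "female": "F", "1": "F"}


def _key(header):
    """Lowercase and collapse punctuation/whitespace: 'First_Name' -> 'first name'."""
    return re.sub(r"[^a-z0-9]+", " ", header.lower()).strip()


# Flatten the alias table into a direct lookup from normalized header to canonical name.
LOOKUP = {_key(alias): canon
          for canon, aliases in FIELD_ALIASES.items()
          for alias in aliases}


def canonical_field(header):
    """Return the canonical name for a header; unknown headers pass through unchanged."""
    return LOOKUP.get(_key(header), header)


def normalize_row(row):
    """Rename the fields of one CSV row and standardize the gender value."""
    out = {canonical_field(k): (v or "").strip()
           for k, v in row.items() if k is not None}
    if "gender" in out:
        out["gender"] = GENDER_VALUES.get(out["gender"].lower(), out["gender"])
    return out


def normalize_csv(text):
    """Parse CSV text with a header line and return a list of normalized rows."""
    return [normalize_row(r) for r in csv.DictReader(io.StringIO(text))]
```

For example, `normalize_csv("firstname,sex\nBob,0\n")` yields a row keyed by `first_name` and `gender`. In Spark, the same lookup could drive column renames through `DataFrame.withColumnRenamed`; a single combined name field would still need separate splitting logic that this sketch does not attempt.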

1 ACCEPTED SOLUTION

Master Guru

This problem has been solved!

3 REPLIES

Master Guru

I am wondering whether there is a fully open source solution for Master Data Management.

Master Guru

That would be interesting to see. There seem to be a couple of data quality tools in the open source community, such as Mural/Mosaic, but the last update in the repository appears to have been 4 years ago, so I'm not sure how useful that is.

https://java.net/projects/mosaic