Is there any tool in Hadoop which can do the language translation on my data?

I have data in a SQL Server RDBMS. The data is in French, and I need to store it in HDFS. I also need it translated into English.

1 ACCEPTED SOLUTION

Hadoop itself does not ship a translation tool, but a number of online translation services can do this. Most of them are exposed as REST APIs, which you can call from your ingestion process: during real-time ingest with something like Storm, or as post-processing through a custom UDF or an Oozie workflow.
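
As a rough illustration of calling a translation REST API during ingestion, here is a minimal Python sketch. The endpoint URL, the request payload, and the `{"translation": ...}` response shape are all assumptions made for the example, not any real provider's API; the function names `http_post_json` and `translate_rows` are hypothetical as well. Check your provider's documentation for the actual contract.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the URL documented by your translation provider.
TRANSLATE_URL = "https://translation.example.com/v1/translate"


def http_post_json(url, payload, timeout=10):
    """POST a JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def translate_rows(rows, column, post=http_post_json, source="fr", target="en"):
    """Return copies of the rows with an extra '<column>_<target>' field."""
    translated = []
    for row in rows:
        # Assumed request shape; real services differ.
        payload = {"text": row[column], "source": source, "target": target}
        reply = post(TRANSLATE_URL, payload)
        # Assumed response shape: {"translation": "..."}
        translated.append({**row, f"{column}_{target}": reply["translation"]})
    return translated
```

The `post` parameter is injectable so the same function can be exercised without network access, or wrapped in a UDF that supplies its own HTTP client.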

One option worth looking at is the YandexTranslate processor in Hortonworks DataFlow (HDF). For example, you could use the ExecuteSQL processor to pull the data out of SQL Server, translate the content with the YandexTranslate processor, and then use PutHDFS to store the result in HDP.
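
The extract, translate, store ordering of that flow can be sketched outside of HDF as three small Python stages. This is only an illustration of the data flow, not NiFi code: `sqlite3` stands in for SQL Server, a file-like object stands in for an HDFS file, and the function names `extract`, `translate`, and `load` are hypothetical.

```python
import csv
import io
import sqlite3


def extract(connection, query):
    """Yield each result row as a dict (plays the role of ExecuteSQL)."""
    cursor = connection.execute(query)
    names = [description[0] for description in cursor.description]
    for values in cursor:
        yield dict(zip(names, values))


def translate(rows, column, translate_text):
    """Replace the text in `column` with its translation (the translation step)."""
    for row in rows:
        yield {**row, column: translate_text(row[column])}


def load(rows, fieldnames, sink):
    """Write the rows as CSV to a file-like sink (plays the role of PutHDFS)."""
    writer = csv.DictWriter(sink, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```

A possible usage, with an in-memory database and a dictionary lookup as a toy translator:

```python
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER, body TEXT)")
conn.execute("INSERT INTO notes VALUES (1, 'bonjour')")

sink = io.StringIO()
rows = extract(conn, "SELECT id, body FROM notes")
load(translate(rows, "body", {"bonjour": "hello"}.get), ["id", "body"], sink)
```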

2 REPLIES

@bandhu gupta, has this been resolved? Could you accept the best answer or post your own solution?