
Related: ApacheCon 2018 in Montreal

With my talk in a dual-language city, I thought perhaps I should give it in French. My college French is very rusty and my accent is very New Jersey. The two don't mix well. So let's have Apache NiFi do it for me. After publicly debating this on Twitter, I decided to see if I could implement a solution.

Secret:

Most of the heavy lifting is done by Python: the TextBlob library calls the Google Translate API under the covers, automagically.

My presentation is here: https://www.slideshare.net/bunkertor/apache-deep-learning-101-apachecon-montreal-2018-v031

Flow to Extract French

92595-nififlowconverttofrench.png

Apache Tika extracts the text from the PDF or PPTX and converts it to plain text or HTML. (I chose text.)
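To make the extraction step concrete, here is a minimal standard-library sketch of pulling text out of a PPTX. This is not Apache Tika and not what the flow runs; it is a simplified stand-in that relies only on the fact that a .pptx file is a zip archive whose slides store their text in `<a:t>` elements. The function name `extract_pptx_text` is hypothetical, and slides are ordered by file number, which usually but not always matches presentation order.

```python
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace used for text runs (<a:t>) inside slide XML.
A_NS = "{http://schemas.openxmlformats.org/drawingml/2006/main}"


def _slide_number(name):
    # "ppt/slides/slide12.xml" -> 12
    return int(name[len("ppt/slides/slide"):-len(".xml")])


def extract_pptx_text(source):
    """Return one string of text per slide, ordered by slide file number.

    `source` is a path or a binary file-like object holding a .pptx file.
    """
    slides = []
    with zipfile.ZipFile(source) as deck:
        names = [
            n for n in deck.namelist()
            if n.startswith("ppt/slides/slide") and n.endswith(".xml")
        ]
        for name in sorted(names, key=_slide_number):
            root = ET.fromstring(deck.read(name))
            runs = [t.text for t in root.iter(A_NS + "t") if t.text]
            slides.append(" ".join(runs))
    return slides
```

Tika handles far more formats (including PDF) and edge cases, so this is only meant to show the kind of text the rest of the flow receives.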

Run the translate Python script with a sentence extracted from the PDF or PPTX.

92596-runtranslate.png
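Outside of NiFi, the same step can be sketched in a few lines of Python. How the flow itself splits the extracted text into sentences is not shown here, so the regex-based `split_sentences` below is an assumption, as is the `./runtranslate.sh` path; `translate_sentence` simply passes one sentence as the script's single argument and returns whatever the script prints.

```python
import re
import subprocess


def split_sentences(text):
    """Naively split text into sentences on '.', '!' or '?' followed by whitespace."""
    # Collapse newlines and repeated spaces left over from extraction.
    normalized = " ".join(text.split())
    parts = re.split(r"(?<=[.!?])\s+", normalized)
    return [p for p in parts if p]


def translate_sentence(sentence, script="./runtranslate.sh"):
    """Run the wrapper script (hypothetical path) with one sentence as its only argument."""
    result = subprocess.run(
        [script, sentence], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

A driver would then call `translate_sentence(s)` for each `s` in `split_sentences(extracted_text)`. The naive split will trip on abbreviations such as "e.g.", which is fine for slide text but not for general prose.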

Let's send that translated French line to its own Slack channel.

92597-slackfrenchsend.png

Let's send the English to another.

92598-slackengsend.png
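For anyone who wants to try the Slack step without NiFi, here is a rough sketch that posts a line of text to a Slack incoming webhook using only the standard library. The flow above uses NiFi processors rather than this code; the function names are hypothetical, and the webhook URL is something you create in your own Slack workspace, one per channel.

```python
import json
import urllib.request


def build_slack_request(webhook_url, message):
    """Build a POST request carrying the message as a Slack webhook JSON payload."""
    payload = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_to_slack(webhook_url, message):
    """Send the message to the webhook and return the HTTP status code."""
    request = build_slack_request(webhook_url, message)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

With one webhook per channel, the French line and the English line would each go through `send_to_slack` with a different URL.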

And there it is:

92599-englishslackresult.png

92600-frenchslackresults.png


runtranslate.sh

python3.6 -W ignore /Volumes/TSPANN/2018/talks/IOT/translate.py "$1" 2>/dev/null

translate.py

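import sys

from textblob import TextBlob

# Join all command-line arguments into a single string to translate.
text = " ".join(sys.argv[1:])
# Alternative: read the text from standard input instead.
# text = sys.stdin.read()

blob = TextBlob(text)

# Optional: print the sentiment polarity of each sentence.
# for sentence in blob.sentences:
#     print(sentence.sentiment.polarity)

# Translate to French (TextBlob calls the Google Translate API).
print(blob.translate(to="fr"))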

NiFi Flow

make-it-french.xml

Last update: 08-17-2019 06:22 AM