Created on 10-27-2017 06:15 PM - edited 08-17-2019 10:35 AM
If you have not attended a DataWorks Summit, I highly recommend it. It is an amazing event held in three locations each year and is a great community experience. The content is deep and highly technical, and you will learn about the current state of the art and what is coming next. It's not just Big Data, but AI, Streaming, Microservices, Containers, Cloud and many other topics that startups and enterprises alike need to know.
My topic was a simple talk on using Apache NiFi to ingest and transform various data types.
There is a small group forming around my quickly released Inception V3 TensorFlow Apache NiFi Processor. I encourage you to try it and provide feedback, pull requests, bug reports, documentation, unit tests, examples and more. The Java API for TensorFlow is new, so the processor is still really basic. Thanks to @Simon Elliston Ball for a major cleanup on it.
https://github.com/tspannhw/nifi-tensorflow-processor
What do we want to do?
• TensorFlow (C++, Python, Java) via ExecuteStreamCommand
• TensorFlow NiFi Java Custom Processor
• TensorFlow Running on Edge Nodes (MiNiFi)
• TensorFlow Mobile (iOS, Android, RPi)
• TensorFlow on Spark (Yahoo) via Livy, S2S, Kafka
• TensorFlow Running in Containers in YARN 3.0 on Hadoop
• (NiFi 1.4) gRPC Call to TensorFlow Serving
python classify_image.py --image_file /dir/solarroofpanel.jpg
solar dish, solar collector, solar furnace (score = 0.98316)
window screen (score = 0.00196)
manhole cover (score = 0.00070)
radiator (score = 0.00041)
doormat, welcome mat (score = 0.00041)
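When classify_image.py is run from NiFi via ExecuteStreamCommand, its console output arrives as plain text. Downstream processors are usually easier to wire up if that text is turned into structured records first. The following is a minimal sketch of one way to do that, assuming the output keeps the "label (score = N)" shape shown above. The function name parse_classifications and the regular expression are illustrative and are not part of the TensorFlow example or the NiFi processor.

```python
import json
import re

# Matches lines such as "window screen (score = 0.00196)".
# Assumption: every result line ends with "(score = <decimal>)".
LINE_PATTERN = re.compile(r"^(?P<label>.+?) \(score = (?P<score>[0-9]*\.?[0-9]+)\)$")


def parse_classifications(output):
    """Turn classify_image.py console output into a list of dicts.

    Lines that do not look like a result (for example the command
    itself or log noise) are skipped.
    """
    results = []
    for line in output.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            results.append({
                "label": match.group("label"),
                "score": float(match.group("score")),
            })
    return results


sample = """solar dish, solar collector, solar furnace (score = 0.98316)
window screen (score = 0.00196)
manhole cover (score = 0.00070)
radiator (score = 0.00041)
doormat, welcome mat (score = 0.00041)"""

records = parse_classifications(sample)
# Emit one JSON array so the result can become flow file content.
print(json.dumps(records, indent=2))
```

The labels themselves contain commas, so splitting on commas would not work here; anchoring on the trailing "(score = ...)" keeps each label intact.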
Python Uses
pip install -U textblob
python -m textblob.download_corpora
pip install -U spacy
python -m spacy.en.download all
pip install -U nltk
pip install -U numpy
run.sh
python sentiment.py "$@"
sentiment.py
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import sys

sid = SentimentIntensityAnalyzer()
ss = sid.polarity_scores(sys.argv[1])
print('Compound {0} Negative {1} Neutral {2} Positive {3} '.format(
    ss['compound'], ss['neg'], ss['neu'], ss['pos']))
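The script above prints a free-form string. If the result is going to be routed or stored by later NiFi processors, JSON output may be more convenient. Here is a hedged sketch of that variation. The helper names scores_to_json and analyze and the JSON field names are my own choices for illustration. The sketch also assumes the VADER lexicon has already been fetched with nltk.download('vader_lexicon').

```python
import json


def scores_to_json(text, scores):
    """Package VADER polarity scores as one JSON document.

    `scores` is the dict returned by
    SentimentIntensityAnalyzer.polarity_scores(), which carries the
    keys 'compound', 'neg', 'neu' and 'pos'.
    """
    return json.dumps({
        "text": text,
        "compound": scores["compound"],
        "negative": scores["neg"],
        "neutral": scores["neu"],
        "positive": scores["pos"],
    }, sort_keys=True)


def analyze(text):
    """Score `text` with VADER and return the JSON string."""
    # Imported lazily so scores_to_json can be used (and tested)
    # on machines where NLTK is not installed.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer
    sid = SentimentIntensityAnalyzer()
    return scores_to_json(text, sid.polarity_scores(text))
```

To use it from run.sh in the same way as the original script, sentiment.py would end with import sys followed by print(analyze(sys.argv[1])).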
These are good Python libraries for text and NLP work. I recommend Python 3.x unless you are stuck with 2.6 or 2.7.
I have also created two processors for working with text/NLP, one for Apache OpenNLP and one for Stanford CoreNLP. Both are listed below.
Please comment in HCC (here), check out GitHub and send pull requests (https://github.com/tspannhw), and come to a meetup (https://www.meetup.com/futureofdata-princeton/).
References: