Member since: 02-26-2017
Posts: 12
Kudos Received: 3
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
1940 | 03-20-2017 12:10 PM |
04-04-2017
06:16 AM
Hi Matt, I created a simple implementation of a NiFi OutputStreamCallback in a Python ExecuteScript and successfully transferred data to the next processor in my flow. However, when I try to enrich the code and add the desired logic, the data is simply not transferred. I don't see any compilation error, and the logs suggest that ExecuteScript ran without any error. I would appreciate your help. Below is the snippet, which simply calls a function and writes data to the flow file.

import urllib2
import json
import datetime
import csv
import time
import sys
import traceback
from org.apache.nifi.processor.io import OutputStreamCallback
from org.python.core.util import StringUtil

class WriteContentCallback(OutputStreamCallback):
    def __init__(self, content):
        self.content_text = content

    def process(self, outputStream):
        try:
            outputStream.write(StringUtil.toBytes(self.content_text))
        except:
            traceback.print_exc(file=sys.stdout)
            raise

page_id = "dsssssss"
access_token = "sdfsdfsf%sdfsdf"

def scrapeFacebookPageFeedStatus(page_id, access_token):
    flowFile = session.create()
    flowFile = session.write(flowFile, WriteContentCallback("Hello there this is my data"))
    flowFile = session.write()
    session.transfer(flowFile, REL_SUCCESS)
    has_next_page = False
    num_processed = 0  # keep a count on how many we've processed
    scrape_starttime = datetime.datetime.now()
    while has_next_page:
        print "Scraping %s Page: %s\n" % (page_id, scrape_starttime)
        has_next_page = False
    print "\nDone!\n%s Statuses Processed in %s" % \
        (num_processed, datetime.datetime.now() - scrape_starttime)

if __name__ == '__main__':
    scrapeFacebookPageFeedStatus(page_id, access_token)
    flowFile = session.create()
    flowFile = session.write(flowFile, WriteContentCallback("and your data"))
    session.transfer(flowFile, REL_SUCCESS)
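For comparison, here is a minimal sketch of the create/write/transfer sequence that can be run outside NiFi. It is written for standard Python 3 rather than the Jython engine, and `FakeSession`, `emit`, and the string value of `REL_SUCCESS` are hypothetical stand-ins for the objects that ExecuteScript injects. In NiFi the callback would extend `OutputStreamCallback` as in the snippet above. The sketch assumes each flow file is written through a callback exactly once before it is transferred. It has no counterpart to the argument-less `session.write()` call, because `ProcessSession.write` expects at least the flow file.

```python
import io

REL_SUCCESS = "success"  # hypothetical stand-in for NiFi's REL_SUCCESS


class WriteContentCallback(object):
    """Same shape as the callback in the post, minus the NiFi base class."""

    def __init__(self, content):
        self.content_text = content

    def process(self, output_stream):
        # Encode the text and write it to whatever stream the session provides.
        output_stream.write(self.content_text.encode("utf-8"))


class FakeSession(object):
    """Hypothetical in-memory stand-in for NiFi's ProcessSession."""

    def __init__(self):
        self.transferred = []  # (content bytes, relationship) pairs

    def create(self):
        return b""  # an "empty flow file"

    def write(self, flow_file, callback):
        buffer = io.BytesIO()
        callback.process(buffer)
        return buffer.getvalue()  # the "updated flow file"

    def transfer(self, flow_file, relationship):
        self.transferred.append((flow_file, relationship))


def emit(session, text):
    """Create one flow file, write its content once, then transfer it."""
    flow_file = session.create()
    flow_file = session.write(flow_file, WriteContentCallback(text))
    session.transfer(flow_file, REL_SUCCESS)


session = FakeSession()
emit(session, "Hello there this is my data")
emit(session, "and your data")
```

Keeping the three calls together in one helper makes it easier to see that every created flow file is written and transferred exactly once.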
03-23-2017
09:30 AM
Hi @Vasilis Vagias, great article. I am scraping Facebook page contents using Python and want to use the ExecuteScript processor to get all the posts returned by a Python function and pass them on to a Solr processor for indexing. Currently I am writing the contents returned by Facebook to a file; I want to put those contents on the output stream instead and pass them on to the next processor. Can you please share the steps, with reference to the example given in your article? Can I use the outputStream object in any Python function and use it for writing records? I don't think creating an inner class is mandatory. Also, does it allow writerow()-style functionality? I would appreciate you showing me the way here. Thanks.
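One way to get `writerow()`-style output onto a stream is to let `csv.writer` fill an in-memory buffer and then write the resulting bytes in the callback. The sketch below is an illustration only, written for standard Python 3. The names `rows_to_csv_bytes` and `CsvWriteCallback` and the sample columns are hypothetical, and `io.BytesIO` stands in for the output stream NiFi would pass in. Under Jython 2.7 the buffer handling would likely need `StringIO.StringIO` instead, which is an assumption not tested here.

```python
import csv
import io


def rows_to_csv_bytes(rows):
    """Serialize rows with csv.writer and return the result as UTF-8 bytes."""
    text_buffer = io.StringIO(newline="")
    writer = csv.writer(text_buffer)
    for row in rows:
        writer.writerow(row)
    return text_buffer.getvalue().encode("utf-8")


class CsvWriteCallback(object):
    """Callback-shaped object; in NiFi it would extend OutputStreamCallback."""

    def __init__(self, rows):
        self.rows = rows

    def process(self, output_stream):
        # A single write keeps the callback simple; rows are built beforehand.
        output_stream.write(rows_to_csv_bytes(self.rows))


# Hypothetical sample records standing in for scraped posts.
posts = [["status_id", "num_likes"], ["1", "10"], ["2", "25"]]
sink = io.BytesIO()  # stand-in for the stream supplied by session.write
CsvWriteCallback(posts).process(sink)
```

Building the CSV text first and writing it in one call keeps the callback small and avoids mixing text and byte streams.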
03-20-2017
12:10 PM
Solved. I defined a single-valued field and ran the terms query on that. Use schema.xml to define the fields explicitly.
03-15-2017
09:41 AM
Hi Team, I am trying to run an aggregation on Solr data using the Banana dashboard, but the panel shows the error "stats can only run on single valued column not multivalued". The data I am trying to aggregate consists of Facebook posts about certain companies (I have three different companies), with the number of likes, shares, etc. available at the post/status level (almost 2,000 posts per company). I want to compute and graph the total likes and shares per company, as I would in SQL: select company_name, sum(num_likes) from solrdata group by company_name. I followed the Banana dashboard documentation, according to which I need facets, stats, and a TERMS panel. I created a TERMS panel on the dashboard, provided the company_name column as the facet field, and set the stats field to 'num_of_likes' (total likes per post). I still get the same error. Below are the snapshots and, of course, the error.
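The SQL-style aggregation described above corresponds roughly to a Solr StatsComponent request with `stats.field` and `stats.facet`. The sketch below only builds such a request URL in standard Python 3 and does not contact a server. The host, port, and core name `facebook_posts` are hypothetical, and the field names are taken from the post. Whether Banana issues exactly this query is not confirmed here.

```python
from urllib.parse import urlencode


def build_stats_query(base_url, group_field, sum_field):
    """Build a Solr /select URL that asks for stats on sum_field per group_field."""
    params = [
        ("q", "*:*"),      # match all documents
        ("rows", "0"),     # only the stats are needed, not the documents
        ("wt", "json"),
        ("stats", "true"),
        ("stats.field", sum_field),
        ("stats.facet", group_field),
    ]
    return base_url + "/select?" + urlencode(params)


# Hypothetical core URL; replace with the actual Solr core.
url = build_stats_query(
    "http://localhost:8983/solr/facebook_posts", "company_name", "num_of_likes"
)
```

The `sum` entry under each facet value in the stats response would then play the role of `sum(num_likes)` per company, provided the fields involved are single-valued.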
Labels: Apache Hadoop, Apache Solr
03-08-2017
05:59 AM
1 Kudo
Hi @Timothy Spann, could you please share some details on how to handle paging through NiFi? For example, which processors to use, how to modify the query to fetch the next page of the feed, and how to get all the comments for a particular feed item. Thanks. Omer
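The paging logic itself can be sketched independently of the processors involved. The example below assumes the response shape commonly returned by the Facebook Graph API, with items under `data` and the next page URL under `paging.next`. The function `iter_feed_items`, the injected `fetch_page` callable, and the fake pages are hypothetical. It does not address which NiFi processors to use and only illustrates the loop, for instance inside an ExecuteScript.

```python
def iter_feed_items(first_url, fetch_page):
    """Yield every item across pages, following paging.next until it is absent.

    fetch_page is any callable that takes a URL and returns the parsed JSON
    response as a dict (hypothetical; supply an HTTP client in real use).
    """
    url = first_url
    while url:
        page = fetch_page(url)
        for item in page.get("data", []):
            yield item
        # A missing "paging" or "next" key ends the loop.
        url = page.get("paging", {}).get("next")


# Fake two-page feed used instead of real HTTP calls.
fake_pages = {
    "page1": {"data": [{"id": "a"}, {"id": "b"}], "paging": {"next": "page2"}},
    "page2": {"data": [{"id": "c"}]},
}
items = list(iter_feed_items("page1", fake_pages.get))
```

Passing the fetch function in as a parameter keeps the loop testable without network access.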
03-02-2017
06:31 PM
Hi @Aldrin Piri, great! It worked. Thanks a lot for your support 🙂 Cheers, Omer
03-02-2017
07:51 AM
Hi @Aldrin Piri, I am facing the same challenge. I configured the SSL context service after adding the Facebook certificate to the default Java cacerts truststore, but my GetHTTP processor reports an IllegalArgumentException for the URL. Below is the screenshot. I would appreciate your help with this. Regards, Omer
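The screenshot is not available here, so the cause of the exception is unknown. One possibility worth ruling out is a query parameter containing a character that is not permitted unencoded in a URI, such as `|`. The sketch below, in standard Python 3, shows percent-encoding of the token before it is placed in the URL. The page ID and token values are placeholders, and `build_feed_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_feed_url(page_id, access_token):
    """Return a Graph API feed URL with the access token percent-encoded."""
    query = urlencode({"access_token": access_token})
    return "https://graph.facebook.com/%s/feed?%s" % (page_id, query)


# Placeholder values; a real token would come from the Facebook app settings.
url = build_feed_url("examplepage", "12345|abcde")
```

A browser typically encodes such characters silently, which could explain a URL that works when pasted into the address bar but is rejected elsewhere.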
03-02-2017
06:57 AM
Hi @Andy LoPresto, I am still struggling with this. I added the certificate to the truststore as you mentioned in your posts; however, GetHTTP is still not working. It shows an error for the access token, which works fine if I put it in the browser. I am using the template provided on GitHub, and the SSL context service is enabled. I would highly appreciate your support. Thanks.