1958 Posts
1211 Kudos Received
121 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 341 | 08-02-2023 07:30 AM |
| | 693 | 03-29-2023 01:22 PM |
| | 2581 | 06-03-2021 07:11 AM |
| | 1426 | 06-01-2021 10:05 AM |
| | 1414 | 05-24-2021 11:58 AM |
04-03-2023
08:12 AM
1 Kudo
CODE + COMMUNITY

This week in FLaNK Stack Weekly, we have some events going on, a meetup in preparation, and a lot of interesting new tools to explore.

Please join my meetup groups. We will be hybrid, so if you are remote you can still watch via Zoom or YouTube. For those in the Princeton, New York City, or Philadelphia area, we will be in person as well.

https://www.meetup.com/futureofdata-princeton/
https://www.meetup.com/futureofdata-newyork/
https://www.meetup.com/futureofdata-philadelphia

I have an in-person meetup at our San Francisco office. I will also be speaking at the Real-Time Analytics Summit that week.

https://www.meetup.com/futureofdata-sanfrancisco/events/292453316/

This is Issue #77. If you wish to look at our back issues, check them out on GitHub. They are available in a few different formats.

https://github.com/tspannhw/FLiPStackWeekly

I travel the world spreading the word of streaming. Please join us.

https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Videos

These were the most interesting streaming videos of the week. Check them out on YouTube.

https://www.youtube.com/watch?v=iT60STl-Wuk
https://www.youtube.com/watch?v=4X5Yky3CT6I&t=13s
https://www.youtube.com/watch?v=V_DpqTo4bQ0
https://www.youtube.com/watch?v=p9-Y1PRYDn4&t=2s
https://www.youtube.com/watch?v=s80sz3NWwHo

Articles

These were the most interesting articles of the week.

https://community.cloudera.com/t5/What-s-New-Cloudera/Cloudera-DataFlow-Designer-for-self-service-data-flow/ba-p/366039
https://posthog.com/blog/dev-marketing-for-startups#its-ok-for-other-companies-to-be-much-better-than-you-at-social-media
https://developerrelations.com/devrel-roundtable/looking-ahead-to-conference-season
https://ossinsight.io/collections/chat-gpt-alternatives/
https://robertsahlin.substack.com/p/the-data-engineer-is-dead-long-live
https://blogs.oracle.com/javamagazine/
https://www.infoq.com/articles/billions-messages-minute/?
https://technology.amis.nl/big-data-database/apache-nifi-automating-tasks-using-nipyapi/
https://blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm
https://www.linkedin.com/posts/michael-kohs-27a17525_snowpipe-snowflake-nifi-activity-7047694779786084352-ArdL/
https://thenewstack.io/linkedin-unifies-stream-and-batch-processing-with-apache-beam/

Recent Talks

My most recent talk was at the Trenton Computer Festival, where I spoke on streaming.

Trenton Computer Festival Pro
https://www.slideshare.net/bunkertor/itpc-building-modern-data-streaming-apps
https://www.youtube.com/watch?v=iT60STl-Wuk&list=PLIJGKvnQWB-u0SPXIwozegOWCG2V85WGe&index=12

Events

I have a number of events coming up soon. Check them out if you can.

https://www.cloudera.com/about/events/evolve.html
https://web.cvent.com/event/7598f981-2f7e-4915-b662-bd7be9b5f48d/summary?RefId=homepage_impact24

April 4-6, 2023: DevNexus: Atlanta, GA. In-Person.
https://devnexus.com/

April 24-26, 2023: Real-Time Analytics Summit: San Francisco, CA. In-Person.
https://rtasummit.com/

April 25, 2023: Future of Data Meetup: San Francisco, CA. In-Person.
https://www.meetup.com/futureofdata-princeton/
https://www.meetup.com/futureofdata-sanfrancisco/events/292453316/

May 9, 2023: Garden State Java User Group: New Jersey. In-Person.
https://gsjug.org/

May 10-12, 2023: Open Source Summit North America. Virtual.
https://events.linuxfoundation.org/open-source-summit-north-america/

May 23, 2023: Pulsar Summit Europe. Virtual.
https://pulsar-summit.org/

Cloudera Events
https://www.cloudera.com/about/events.html

More Events:
https://www.linkedin.com/pulse/schedule-2023-tim-spann-/

Code

There are a couple of good demos with source code available. Check them out.

https://github.com/pdefusco/Oozie2CDE_Migration
https://github.com/SuperEllipse/edge2ai_pred_maint
https://github.com/tspannhw/FLaNK-AllTheStreams
https://github.com/tspannhw/CloudDemo2023

Tools

I have found a lot of open source tools to be very helpful.

https://github.com/bencgreenberg/stackexchange-tutorial-themes
https://github.com/jaymody/picoGPT
https://regex.ai/
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.20.0/org.apache.nifi.processors.standard.JoinEnrichment/additionalDetails.html
https://clickhouse.com/docs/en/integrations/nifi
https://github.com/TheoKanning/openai-java
https://pyscript.net/
https://tmate.io/
https://github.com/dylanaraps/neofetch
https://github.com/jesseduffield/lazydocker
https://github.com/httpie/httpie
https://github.com/GothenburgBitFactory/taskwarrior
https://github.com/newsboat/newsboat
https://github.com/jarun/ddgr
https://github.com/cointop-sh/cointop
https://github.com/Byron/dua-cli
https://nicolargo.github.io/glances/
https://github.com/aristocratos/bpytop
https://github.com/hacker1024/coretemp
https://github.com/bcicen/ctop
https://github.com/imsnif/bandwhich
https://github.com/jbruchon/jdupes
https://exiftool.org/
https://github.com/aria2/aria2
https://github.com/muesli/duf
https://github.com/ajeetdsouza/zoxide
https://github.com/PrefectHQ/marvin
https://github.com/libAudioFlux/audioFlux
https://github.com/jamesturk/scrapeghost/
https://gut-cli.dev/
https://yakgpt.vercel.app/
https://github.com/HamburgChimps/apple-notes-liberator
https://www.cursor.so/
https://orbstack.dev/
https://a16z.com/2023/03/30/b2b-generative-ai-synthai/
https://github.com/twitter/the-algorithm
https://github.com/twitter/the-algorithm-ml
https://github.com/fipso/ccurl.sh
https://donuts-are-good.github.io/shhhbb/

Thanks for reading, same time next week!

© 2023 Tim Spann
03-30-2023
05:13 AM
If it does come up again, post here and we can put in a JIRA.
03-29-2023
01:22 PM
I haven't seen anyone use Avro for a key; generally you want a simple key. Why Avro as a key? Is this common for some use cases?
07-23-2021
05:42 AM
If you read a binary file, it should pass into NiFi with no issue.
07-22-2021
10:00 AM
You can use QueryDatabaseTableRecord to watch for changes and have that trigger your process.

You may also want to try Debezium with Cloudera Kafka or with Cloudera Flink SQL.

https://dev.to/tspannhw/simple-change-data-capture-cdc-with-sql-selects-via-apache-nifi-flank-19m4

See:
https://github.com/tspannhw/EverythingApacheNiFi
https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/replicate-track-change-data-capture-always-on-availability?view=sql-server-ver15
https://debezium.io/documentation/reference/connectors/sqlserver.html
https://sandeepkattepogu.medium.com/streaming-data-from-microsoft-sql-server-into-apache-kafka-2fb53282115f
https://www.linkedin.com/pulse/achieving-incremental-fetch-change-data-capture-via-apache-rajpal/
https://www.datainmotion.dev/2021/02/using-apache-nifi-in-openshift-and.html
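
For anyone unfamiliar with the polling style of change capture mentioned above, here is a minimal sketch of the idea in Python: remember the highest value seen in a tracking column and only select rows above it on the next poll. The `customers` table, its columns, and the `fetch_new_rows` helper are hypothetical and use an in-memory SQLite database purely for illustration; this is not how NiFi implements it internally.

```python
import sqlite3

def fetch_new_rows(conn, last_seen_id):
    """Return rows with id > last_seen_id, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name FROM customers WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    # If nothing new arrived, keep the previous mark.
    new_mark = rows[-1][0] if rows else last_seen_id
    return rows, new_mark

# Hypothetical table used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers (name) VALUES (?)", [("ann",), ("bob",)])

# First poll picks up everything; the second only sees the row added since.
first_batch, mark = fetch_new_rows(conn, 0)
conn.execute("INSERT INTO customers (name) VALUES (?)", ("cy",))
second_batch, mark = fetch_new_rows(conn, mark)

print(first_batch)
print(second_batch)
```

Polling on an increasing column like this only notices inserts (or updates, if the tracking column is a last-modified timestamp). It will not see deletes, which is one reason to look at a log-based tool such as Debezium.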
07-21-2021
01:59 PM
Livy and the Spark interactive connector aren't stable at this point. It only works with Scala code and a jar, and it's hacky. I recommend you call Cloudera's CDE environment instead.
06-16-2021
09:50 AM
How are you ingesting the RPM? You need to get it into a FlowFile as binary and then send it as the body.
06-03-2021
07:11 AM
Use a regex to remove those bad characters:

https://community.cloudera.com/t5/Support-Questions/nifi-regex-replace-special-characters/td-p/103404
https://community.cloudera.com/t5/Support-Questions/How-do-I-enter-an-unprintable-byte-into-a-nifi-property-that/td-p/203936
https://community.cloudera.com/t5/Support-Questions/Remove-from-a-flow-file-in-Nifi/td-p/109503
https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
http://apache-nifi-users-list.2361937.n4.nabble.com/ReplaceText-and-special-characters-td480.html
https://community.cloudera.com/t5/Support-Questions/ReplaceText-quot-processor-does-not-replace-special/td-p/171544
https://community.cloudera.com/t5/Support-Questions/remove-special-characters-from-xml-text-node-using-nifi/td-p/241008
https://community.cloudera.com/t5/Support-Questions/Regex-Special-Character-Escape/m-p/239556#M201365

You could also use UpdateRecord on JSON with an inferred schema and the replace or replaceRegex RecordPath functions:

https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html
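
To try out a pattern before putting it in a flow, a quick Python sketch like the one below can help. Which characters count as "bad" is an assumption here (ASCII control characters other than tab, newline, and carriage return); adjust the character class to your data. The `strip_control_chars` name and the sample string are made up for illustration.

```python
import re

# Assumed definition of "bad": ASCII control characters except
# tab (\x09), newline (\x0A), and carriage return (\x0D).
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]")

def strip_control_chars(text: str) -> str:
    """Remove every character matched by CONTROL_CHARS."""
    return CONTROL_CHARS.sub("", text)

# Hypothetical sample with a NUL and a BEL embedded in it.
sample = "id=42\x00,name=Al\x07ice\r\n"
cleaned = strip_control_chars(sample)
print(repr(cleaned))
```

A similar character class should work as the Search Value in ReplaceText with Regex Replace and an empty Replacement Value, but test it against a sample FlowFile first since escaping rules in processor properties can differ.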
06-01-2021
02:29 PM
1 Kudo
CountText will count lines (\r\n). QueryRecord will count the number of records, even if there are two records on one line.
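
The difference is easy to see outside NiFi. The Python sketch below is only an imitation of the two behaviors, not the processors themselves: it counts line breaks in a payload and then counts the JSON objects in it. The sample payload and both helper functions are hypothetical.

```python
import json

# Hypothetical payload: two JSON records on the first line, one on the second.
payload = '{"id": 1}{"id": 2}\r\n{"id": 3}\r\n'

def count_lines(text: str) -> int:
    """Count lines the way a line-oriented counter would."""
    return len(text.splitlines())

def count_records(text: str) -> int:
    """Count concatenated JSON values, ignoring where line breaks fall."""
    decoder = json.JSONDecoder()
    pos, count = 0, 0
    while pos < len(text):
        # Skip whitespace (including line breaks) between records.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended.
        _, pos = decoder.raw_decode(text, pos)
        count += 1
    return count

lines = count_lines(payload)
records = count_records(payload)
print(lines, records)
```

In QueryRecord, a query along the lines of SELECT COUNT(*) AS record_count FROM FLOWFILE gives the record count, assuming the configured reader can parse the content.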