11-07-2016
04:21 PM
1 Kudo
As a follow-up to my article on downloading and playing MIDI files (https://community.hortonworks.com/content/kbentry/65154/nifi-1x-for-automatic-music-playing-pipelines.html), this quick article highlights how you can support both MP3 and MIDI. As you can see, it is very easy in NiFi to have multiple input channels, multiple processing paths, and many types of inputs and files.

So how do we add MP3 playback to our jukebox? Add another GetHTTP processor:

http://192.168.1.2:8080/extract/url?url=http://www.amclassical.com/piano//&type=mp3

I am still using my microservice to convert web pages into links in JSON, this time filtering out MP3s. Fortunately there are a few web pages out there with free classical MP3s.

Next, after UpdateAttribute, I check the file extension in a RouteOnAttribute processor:

${filename:contains('midi')}

If it's MIDI, call the same MIDI player. I added a second PutFile to save to /opt/demo/mp3 so that I keep my music files separately. Otherwise, call OS X's command-line MP3 player: in my second ExecuteStreamCommand I call /usr/bin/afplay to play the newly downloaded MP3s. One file will play, and once it completes, the next song will play.

I don't recommend feeding in both MIDI and MP3 pages at the same time. It's best to pick one type, let it load up a lot of files in the queues, and play those. I keep my GetHTTP processors stopped once I have the page in, as I don't want more. It is very easy to feed in a list of pages to load, or to add a scheduler or other feed logic at the start to control your experience. I like to run this part manually so I stay in control. You could also trigger it by the presence of a file, or maybe when a Jenkins build fails. It's limited only by your imagination and over 180 processors.

Another thing that can be added to the flow is some audio processing via Simon Elliston Ball's audio processors, which you can easily add to your NiFi: https://community.hortonworks.com/content/repo/47306/nifi-audio-processors.html
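The RouteOnAttribute decision above can be sketched outside NiFi. A minimal shell sketch, assuming the file extension is all you route on; the file names and the "unmatched" fallback label are made up for illustration:

```shell
#!/bin/sh
# Mirror the flow's routing: MIDI files go to the timidity branch,
# MP3s go to the afplay branch, anything else is unmatched.
route_by_ext() {
  case "$1" in
    *.mid|*.midi) echo "timidity"  ;;  # MIDI branch of the flow
    *.mp3)        echo "afplay"    ;;  # MP3 branch (OS X player)
    *)            echo "unmatched" ;;  # would go to an unmatched relationship
  esac
}

route_by_ext "clairdelune.mid"   # prints: timidity
route_by_ext "sonata.mp3"        # prints: afplay
```

In the real flow the two branches end in separate ExecuteStreamCommand processors rather than echoed labels.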
11-07-2016
04:36 AM
2 Kudos
Use Case

Before meetups, I wanted to play some music. NiFi seemed like a great choice for streaming free music through my Mac.
MIDI Command-Line Player for OS X

brew install timidity

Then you can simply play MIDI files with timidity file.mid.

Microservice to Extract Links from Web Pages

Java 8 source code: https://github.com/tspannhw/linkextractor

The Spring Boot REST API accepts a URL, parses out .mid files, and returns JSON containing links and descriptions.

Example REST Call to Service

curl -G -v "http://<url>:8080/extract/url?url=http://www.midiworld.com/classic.htm/&type=mid"

Run the Microservice

java -Xms512m -Xmx2048m -Djava.net.preferIPv4Stack=true -jar target/linkextractor-0.0.1-SNAPSHOT.jar

Java Snippet

Using JSoup to extract links from a URL (HTML):

PrintableLink pLink = new PrintableLink();
pLink.setLink(link.attr("abs:href"));
pLink.setDescr(trim(link.text(), 100));
linksReturned.add(pLink);

Output

hdfs dfs -ls /music/*.mid
-rw-r--r-- 3 tspann hdfs 87 2016-11-07 03:51 /music/2_ase.mid
-rw-r--r-- 3 tspann hdfs 99 2016-11-07 03:51 /music/4_mtking.mid
-rw-r--r-- 3 tspann hdfs 105 2016-11-07 03:50 /music/EspanjaCaphriccoCatalan.mid
-rw-r--r-- 3 tspann hdfs 87 2016-11-07 03:50 /music/EspanjaPrelude.mid
-rw-r--r-- 3 tspann hdfs 162 2016-11-07 03:50 /music/J_M_Bach_Auf_lasst_uns_den_Herren_loben.mid
-rw-r--r-- 3 tspann hdfs 93 2016-11-07 03:52 /music/adelina.mid
-rw-r--r-- 3 tspann hdfs 95 2016-11-07 03:52 /music/aida_ii2.mid
-rw-r--r-- 3 tspann hdfs 89 2016-11-07 03:50 /music/al_adagi.mid
-rw-r--r-- 3 tspann hdfs 95 2016-11-07 03:52 /music/alborada.mid
-rw-r--r-- 3 tspann hdfs 82 2016-11-07 03:52 /music/aquarium.mid
-rw-r--r-- 3 tspann hdfs 105 2016-11-07 03:52 /music/barbero.mid
-rw-r--r-- 3 tspann hdfs 101 2016-11-07 03:51 /music/barimyst.mid
-rw-r--r-- 3 tspann hdfs 80 2016-11-07 03:52 /music/beevar2.mid
-rw-r--r-- 3 tspann hdfs 111 2016-11-07 03:50 /music/biz_arls.mid
-rw-r--r-- 3 tspann hdfs 94 2016-11-07 03:51 /music/blas1.mid
-rw-r--r-- 3 tspann hdfs 114 2016-11-07 03:50 /music/boccher.mid
-rw-r--r-- 3 tspann hdfs 78 2016-11-07 03:52 /music/bolero.mid
-rw-r--r-- 3 tspann hdfs 100 2016-11-07 03:51 /music/cantique.mid
-rw-r--r-- 3 tspann hdfs 88 2016-11-07 03:51 /music/carminab.mid
-rw-r--r-- 3 tspann hdfs 96 2016-11-07 03:51 /music/clairdelune.mid
-rw-r--r-- 3 tspann hdfs 87 2016-11-07 03:51 /music/cmveder.mid
-rw-r--r-- 3 tspann hdfs 108 2016-11-07 03:52 /music/coucou.mid
-rw-r--r-- 3 tspann hdfs 99 2016-11-07 03:51 /music/coup8a.mid
-rw-r--r-- 3 tspann hdfs 97 2016-11-07 03:51 /music/cpf-bird.mid
Microservice Log

2016-11-06 22:57:20.095 ERROR 28694 --- [nio-8080-exec-1] com.dataflowdeveloper.DataController : Query:http://www.midiworld.com/classic.htm/ mid,IP:192.168.1.2 Browser:nifi-agent
2016-11-06 22:57:20.313 ERROR 28694 --- [nio-8080-exec-3] com.dataflowdeveloper.DataController : Query:http://www.midiworld.com/classic.htm/ mid,IP:192.168.1.2 Browser:nifi-agent
2016-11-06 22:57:20.500 ERROR 28694 --- [nio-8080-exec-5] com.dataflowdeveloper.DataController : Query:http://www.midiworld.com/classic.htm/ mid,IP:192.168.1.2 Browser:nifi-agent
ls -lt /opt/demo/midi | more
total 20456
-rw-r--r-- 1 tspann staff 117731 Nov 6 22:58 appspg13.mid
-rw-r--r-- 1 tspann staff 13449 Nov 6 22:58 intrlude.mid
-rw-r--r-- 1 tspann staff 8777 Nov 6 22:58 latalant.mid
-rw-r--r-- 1 tspann staff 1911 Nov 6 22:58 lbvar2.mid
-rw-r--r-- 1 tspann staff 2230 Nov 6 22:58 lbvar4.mid
-rw-r--r-- 1 tspann staff 1458 Nov 6 22:58 lbvar6ep.mid
NiFi Flow

GetHTTP: Call the JSoup microservice that converts an HTML page full of MIDI links into a JSON file of links and descriptions of MIDI files.
SplitJSON: Split that big JSON file into individual link/description pairs for working with individual songs.
EvaluateJSONPath: Use JSONPath to pull out the link and description as attributes.
InvokeHTTP: Download the MIDI file from the link.
UpdateAttribute: Give it a good file name. I just want the file name from the link (for example, test.mid from http://sdfsdf:8080/test.mid).
PutFile: Store the MIDI on the OS X filesystem.
ExecuteStreamCommand: Run the timidity CLI to play the MIDI file, passing it the path to the stored MIDI file.
PutHDFS: Store the MIDI on HDP 2.5 HDFS.
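The UpdateAttribute step only needs the last path segment of the download link. A minimal shell sketch of that extraction, using the example URL from the flow notes and plain POSIX parameter expansion:

```shell
# Keep only the file name from the download link, the way the
# UpdateAttribute step derives a good filename for PutFile.
url="http://sdfsdf:8080/test.mid"   # example link from the flow notes
filename="${url##*/}"               # strip everything through the last '/'
echo "$filename"                    # prints: test.mid
```

In the flow itself this is done with NiFi Expression Language against the link attribute rather than in a shell.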
References

http://jsonpath.com/
http://macappstore.org/timidity/
11-04-2016
08:22 PM
You cannot install NiFi on the same cluster as HDP 2.5. NiFi as HDF needs its own cluster and its own Ambari. The Sandbox has a workaround, but you shouldn't mix HDP Ambari and HDF Ambari; it will corrupt Ambari. You can install HDF via Ambari onto its own fresh cluster with a fresh Ambari. Follow these instructions on a clean cluster with no Ambari and no HDP, just root access and a nice Linux such as CentOS 7.2: http://docs.hortonworks.com/HDPDocuments/HDF2/HDF-2.0.1/bk_ambari-installation/content/index.html
11-03-2016
08:08 PM
1 Kudo
Starting My Hadoop Tools

NiFi can interface directly with Hive, HDFS, HBase, Flume, and Phoenix. I can also trigger Spark and Flink through Kafka and Site-to-Site. Sometimes I need to run some Pig scripts. Apache Pig is very stable and has a lot of functions and tools that make for some smart processing. You can easily add this piece to a larger pipeline or use it as part of the process.

Pig Setup

I like to use Ambari to install the HDP 2.5 clients on my NiFi box so I have access to all the tools I may need. Then I can just do:

yum install pig

Pig to Apache NiFi 1.0.0

ExecuteProcess: we call a shell script that wraps the Pig script. The output of the script is stored to HDFS:

hdfs dfs -ls /nifi-logs
Shell Script

export JAVA_HOME=/opt/jdk1.8.0_101/
pig -x local -l /tmp/pig.log -f /opt/demo/pigscripts/test.pig
You can run Pig in different modes such as local, mapreduce, and tez. You can also pass parameters to the script.

Pig Script

messages = LOAD '/opt/demo/HDF/centos7/tars/nifi/nifi-1.0.0.2.0.0.0-579/logs/nifi-app.log';
warns = FILTER messages BY $0 MATCHES '.*WARN+.*';
DUMP warns;
STORE warns INTO 'warns.out';
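As a quick sanity check outside Pig, grep matches the same WARN lines the FILTER keeps. A sketch using a made-up sample log in place of the real nifi-app.log:

```shell
# Build a tiny sample log, then count the WARN lines that the Pig
# FILTER pattern '.*WARN+.*' would keep. The log content is made up.
cat > /tmp/sample-nifi-app.log <<'EOF'
2016-11-03 19:50:01 INFO  flow started
2016-11-03 19:50:02 WARN  queue is filling up
2016-11-03 19:50:03 WARN  back pressure applied
EOF
grep -c 'WARN' /tmp/sample-nifi-app.log   # prints: 2
```

This is only a spot check; the Pig job is what actually filters the full log at scale.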
This is a basic example from the internet, with the NiFi 1.0 log used as the source. As an aside, I run a daily script on the schedule 1 * * * * ? to clean up my logs. Simply: /bin/rm -rf /opt/demo/HDF/centos7/tars/nifi/nifi-1.0.0.2.0.0.0-579/logs/*2016*

PutHDFS

Hadoop Configuration: /etc/hadoop/conf/core-site.xml. Pick a directory and store away.

Results

HadoopVersion: 2.7.3.2.5.0.0-1245
PigVersion: 0.16.0.2.5.0.0-1245
UserId: root
StartedAt: 2016-11-03 19:53:57
FinishedAt: 2016-11-03 19:53:59
Features: FILTER
Success!
Job Stats (time in seconds):
JobId: job_local72884441_0001
Maps: 1
Reduces: 0
MaxMapTime/MinMapTime/AvgMapTime/MedianMapTime: n/a
MaxReduceTime/MinReduceTime/AvgReduceTime/MedianReduceTime: 0
Alias: messages,warns
Feature: MAP_ONLY
Outputs: file:/tmp/temp1540654561/tmp-600070101,
Input(s):
Successfully read 30469 records from: "/opt/demo/HDF/centos7/tars/nifi/nifi-1.0.0.2.0.0.0-579/logs/nifi-app.log"
Output(s):
Successfully stored 1347 records in: "file:/tmp/temp1540654561/tmp-600070101"
Counters:
Total records written : 1347
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local72884441_0001

References

http://hortonworks.com/hadoop-tutorial/hello-world-an-introduction-to-hadoop-hcatalog-hive-and-pig/#section_5
http://hortonworks.com/apache/pig/#section_2
http://hortonworks.com/blog/jsonize-anything-in-pig-with-tojson/
https://github.com/dbist/pig
https://github.com/sudar/pig-samples
http://hortonworks.com/hadoop-tutorial/how-to-use-basic-pig-commands/
http://hadooptutorial.info/built-in-load-store-functions-in-pig/
https://cwiki.apache.org/confluence/display/PIG/PigTutorial
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.3/bk_installing_manually_book/content/validate_the_installation_pig.html
http://pig.apache.org/docs/r0.16.0/start.html
http://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-pig
https://github.com/alanfgates/programmingpig/tree/master/examples/ch2
11-03-2016
05:35 PM
There is no SQL join available across two different SQL sources, so we have to merge the resulting files together and then use routing to keep just the ones that match. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MergeContent/index.html

See these parameters:

Attribute Strategy (allowed values: Keep Only Common Attributes, Keep All Unique Attributes; default: Keep Only Common Attributes): Determines which FlowFile attributes should be added to the bundle. If 'Keep All Unique Attributes' is selected, any attribute on any FlowFile that gets bundled will be kept unless its value conflicts with the value from another FlowFile. If 'Keep Only Common Attributes' is selected, only the attributes that exist on all FlowFiles in the bundle, with the same value, will be preserved.

Correlation Attribute Name (supports Expression Language): If specified, like FlowFiles will be binned together, where 'like FlowFiles' means FlowFiles that have the same value for this attribute. If not specified, FlowFiles are bundled in the order in which they are pulled from the queue.

Correlation Attribute Name could be used for joining. This really needs to be pushed down to the source, or you can ingest both data sources into HDFS and then run the SQL query there.
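For contrast, once both result sets have been exported to disk as sorted, key-delimited files, a plain `join` performs the actual match. A sketch with made-up file names and data:

```shell
# Two "SQL sources" exported as sorted CSVs keyed on column 1 (made-up data).
printf '1,alice\n2,bob\n'   > /tmp/src_a.csv
printf '1,engineer\n3,qa\n' > /tmp/src_b.csv
# Inner join on the key: only ids present in both files survive,
# which is the matching that MergeContent alone cannot do.
join -t, /tmp/src_a.csv /tmp/src_b.csv   # prints: 1,alice,engineer
```

Note that `join` requires both inputs sorted on the join field, which is why pushing the join to the source or to Hive on HDFS scales better.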
11-03-2016
03:45 PM
Did you run kinit (Kerberos init)? Can you access the Phoenix command line?

/usr/hdp/current/phoenix-client/bin/sqlline.py "santhosh-blueprint-test-11.XXX:2181:/hbase-secure"
11-03-2016
01:44 PM
Are you running anything on that port on your PC? You can't share ports. NiFi must have its port changed away from 8080; lots of things like Ambari want to run on port 8080. You cannot install NiFi with Ambari on a machine that already has Ambari + HDP. Networking between a sandbox and Raspberry Pis is probably going to be tricky and messy, as VMs running on a PC don't want to be used as servers.
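Moving NiFi off 8080 means editing `nifi.web.http.port` in conf/nifi.properties. A hedged sketch that edits a stand-in copy of the file: the property name is NiFi's real one, but the file path and contents here are illustrative.

```shell
# Create a stand-in nifi.properties, then flip the HTTP port off 8080.
printf 'nifi.web.http.host=\nnifi.web.http.port=8080\n' > /tmp/nifi.properties
sed -i.bak 's/^nifi.web.http.port=8080$/nifi.web.http.port=8081/' /tmp/nifi.properties
grep '^nifi.web.http.port=' /tmp/nifi.properties   # prints: nifi.web.http.port=8081
```

Restart NiFi after the change so it binds to the new port.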
11-03-2016
01:23 PM
I'll pass this on to the documentation team.
11-03-2016
01:22 PM
Is the Kerberos client enabled? You are accessing hbase-secure; if you use hbase-unsecure, you can just log in. It's a login issue.
11-03-2016
01:17 PM
From the NiFi user group mailing list, by @jwitt:

Split with grouping: Take a look at RouteText. This allows you to efficiently split up line-oriented data into groups based on matching values, rather than SplitText, which does a line-for-line split.

Merge grouped data: The MergeContent processor will do the trick, and you can use the correlation feature to align only those FlowFiles that come from the same group/pattern.

Write to destination: You can write directly to HDFS using PutHDFS, or you can prepare the data and write to Hive.
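The split-with-grouping idea can be sketched outside NiFi with awk, writing each line to a file named after its grouping value. The field layout, sample data, and file names here are made up:

```shell
# Group CSV lines by their first field, one output file per group --
# roughly what RouteText's grouping does for line-oriented data.
printf 'red,apple\ngreen,kiwi\nred,cherry\n' > /tmp/fruit.csv
awk -F, '{ print > ("/tmp/group_" $1 ".txt") }' /tmp/fruit.csv
grep -c '' /tmp/group_red.txt   # prints: 2  (two lines landed in the red group)
```

In the flow, the matching step in MergeContent (the correlation attribute) then reassembles each group downstream.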