06-29-2016
02:08 PM
.NET SDK for Hadoop: https://hadoopsdk.codeplex.com/wikipage?title=Simple%20Linq%20To%20Hive%20Query&referringTitle=Home
.NET Driver for Phoenix: https://github.com/Azure/hdinsight-phoenix-sharp and https://www.nuget.org/packages/Microsoft.Phoenix.Client/1.0.0-preview
.NET for Kafka: https://cwiki.apache.org/confluence/display/KAFKA/Clients#Clients-.net
Mobius, a C# binding for Spark: https://github.com/Microsoft/Mobius
Spark for .NET Developers introduction: https://msdn.microsoft.com/en-us/magazine/mt595756.aspx
You can also look at https://github.com/MSRCCS/Prajna
06-28-2016
03:05 PM
4 Kudos
@Timothy Spann The best way to accomplish this is with the GetTwitter processor and the MergeContent processor. GetTwitter connects to a Twitter developer account and pulls tweets (you can even filter them). You can then use MergeContent to batch the tweets into manageable pieces based on record count, file size, or a timeout value; a sketch of the relevant settings follows.
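As a rough illustration, the two processors might be configured as below. The property names are actual GetTwitter/MergeContent properties, but every value (keys, filter terms, counts, sizes) is a placeholder to replace with your own:
GetTwitter
  Twitter Endpoint     Filter Endpoint
  Consumer Key         <your consumer key>
  Consumer Secret      <your consumer secret>
  Access Token         <your access token>
  Access Token Secret  <your access token secret>
  Terms to Filter On   hadoop,nifi
MergeContent
  Merge Strategy             Bin-Packing Algorithm
  Minimum Number of Entries  1000   <- record-count threshold
  Minimum Group Size         1 MB   <- size threshold
  Max Bin Age                5 min  <- timeout: flush the bin even if thresholds are unmet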
10-04-2017
03:57 PM
Hi Bryan, do you know when this processor will be available? I also think support for the GetHBase processor has since been dropped, because the NiFi documentation (https://nifi.apache.org/docs.html) returns a 404 page when clicking on the GetHBase processor. Is there any alternative way to get data from HBase using NiFi without filtering on timestamp? Can you please reply?
06-27-2016
09:48 PM
You can find a ton of HBase coprocessors under: https://github.com/apache/phoenix
06-28-2016
05:51 AM
1 Kudo
Hi Kishore, if it's a cluster, you will create your flows in the NCM (NiFi Cluster Manager) UI, and they run on all the nodes in the cluster. Since you have only two nodes in the cluster (perhaps just one worker node plus the NCM), you may not have much to load balance there. Still, you can simulate a load balancer with the NiFi site-to-site protocol. You can get more info on site-to-site protocol load balancing here: https://community.hortonworks.com/questions/509/site-to-site-protocol-load-balancing.html Thanks!
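For reference, site-to-site is enabled per node in nifi.properties. A minimal sketch, where the hostname and port are example values only:
nifi.remote.input.socket.host=node1.example.com
nifi.remote.input.socket.port=10000
nifi.remote.input.secure=false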
06-19-2016
10:29 AM
Pig on Spark still appears to be under development, tracked as PIG-4059, with more than 80% of its sub-tasks completed; the source code is linked there. Spork, on the other hand, appears to be abandoned.
06-17-2016
03:30 PM
4 Kudos
Hello Timothy, there are multiple ways to integrate these three services. As a starting point, NiFi will probably be your ingestion flow. During this flow you could:
- put your data on Kafka and have Spark read from it (see the sketch after this list)
- push your NiFi data to Spark directly: https://blogs.apache.org/nifi/entry/stream_processing_nifi_and_spark
- use an ExecuteScript processor and start a Pig job
In summary, you can have a push-and-forget connection, a push-to-one-service-and-pick-up-in-the-next-flow approach, or, as a corner case, execution inside a processor. Hope this shares some insight.
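For the first option, here is a minimal Spark Streaming sketch (Scala, Spark 1.x Kafka direct-stream API). It assumes NiFi publishes tweets to a Kafka topic via a PutKafka processor; the broker address and topic name are placeholders:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object NifiKafkaSpark {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("NifiKafkaSpark")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder broker and topic: point these at the topic NiFi writes to.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val topics = Set("tweets")

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // Each record is a (key, value) pair; here we just count records per batch.
    stream.map(_._2).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}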
06-17-2016
12:42 PM
Lipstick Installation
Resources:
http://www.graphviz.org/Download_linux_rhel.php
https://github.com/Netflix/Lipstick/wiki/Getting-Started
Commands:
sudo yum list available 'graphviz*'
sudo yum -y install 'graphviz*'
./gradlew assemble
(I always like to rename gradlew to avengers.) Then:
./gradlew run-app
Point your browser at http://localhost:9292/ to view it. Make sure you open that port in the firewall, etc. Sample run:
2016-06-17 02:36:44,558 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion  PigVersion  UserId  StartedAt            FinishedAt           Features
2.4.0          root  2016-06-17 02:36:40  2016-06-17 02:36:44  HASH_JOIN,FILTER,LIMIT
Success!
Job Stats (time in seconds):
JobId                     Maps  Reduces  MaxMapTime  MinMapTime  AvgMapTime  MedianMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  MedianReduceTime  Alias                                  Feature    Outputs
job_local2036219587_0001  2     1        n/a         n/a         n/a         n/a            n/a            n/a            n/a            n/a               fruit_names_join,fruits,limited,names  HASH_JOIN
job_local406327028_0002   1     1        n/a         n/a         n/a         n/a            n/a            n/a            n/a            n/a               fruit_names                                        file:/tmp/temp195796189/tmp-2027262369,
Input(s):
Successfully read 3 records from: "file:///opt/demo/certification/pig/Lipstick/quickstart/1.dat"
Successfully read 3 records from: "file:///opt/demo/certification/pig/Lipstick/quickstart/2.dat"
Output(s):
Successfully stored 1 records in: "file:/tmp/temp195796189/tmp-2027262369"
Counters:
Total records written : 1
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local2036219587_0001->job_local406327028_0002,
job_local406327028_0002
2016-06-17 02:36:44,568 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2016-06-17 02:36:44,571 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2016-06-17 02:36:44,582 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-06-17 02:36:44,583 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(orange,ORANGE)
It's a very nice-looking visualization.
06-17-2016
02:21 AM
1 Kudo
Resources:
http://www.graphviz.org/Download_linux_rhel.php
https://github.com/Netflix/Lipstick/wiki/Getting-Started
yum list available 'graphviz*'
yum install 'graphviz*'
./gradlew assemble
(I always like to rename gradlew to avengers.) Then:
./gradlew run-app
Browse to http://localhost:9292/ (make sure you open that port in the firewall, etc.). That worked!
2016-06-17 02:36:44,558 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion  PigVersion  UserId  StartedAt            FinishedAt           Features
2.4.0          root  2016-06-17 02:36:40  2016-06-17 02:36:44  HASH_JOIN,FILTER,LIMIT
Success!
Job Stats (time in seconds):
JobId                     Maps  Reduces  MaxMapTime  MinMapTime  AvgMapTime  MedianMapTime  MaxReduceTime  MinReduceTime  AvgReduceTime  MedianReduceTime  Alias                                  Feature    Outputs
job_local2036219587_0001  2     1        n/a         n/a         n/a         n/a            n/a            n/a            n/a            n/a               fruit_names_join,fruits,limited,names  HASH_JOIN
job_local406327028_0002   1     1        n/a         n/a         n/a         n/a            n/a            n/a            n/a            n/a               fruit_names                                        file:/tmp/temp195796189/tmp-2027262369,
Input(s):
Successfully read 3 records from: "file:///opt/demo/certification/pig/Lipstick/quickstart/1.dat"
Successfully read 3 records from: "file:///opt/demo/certification/pig/Lipstick/quickstart/2.dat"
Output(s):
Successfully stored 1 records in: "file:/tmp/temp195796189/tmp-2027262369"
Counters:
Total records written : 1
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local2036219587_0001->job_local406327028_0002,
job_local406327028_0002
2016-06-17 02:36:44,568 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2016-06-17 02:36:44,571 [main] WARN org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2016-06-17 02:36:44,582 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-06-17 02:36:44,583 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(orange,ORANGE)
06-16-2016
11:41 PM
Sometimes it's easy to share files: https://forums.virtualbox.org/viewtopic.php?t=15679 You just pick a directory, set Auto-mount to "Yes" and Access to "Full", and hit OK. For some of us, though, depending on the versions of VirtualBox, the VM, and the host operating system, things might not work, and sharing can also break when the host operating system or the VM updates. Log into the VM as root and try this (if it's the HDP sandbox or another VM running CentOS):
cd /opt/VBoxGuestAdditions-*/init
sudo ./vboxadd setup
modprobe -a vboxguest vboxsf vboxvideo
rm -rf /media/sf_Downloads
mkdir /media/sf_Downloads
mount -t vboxsf Downloads /media/sf_Downloads
For me that worked: my Downloads directory was shared, so I could move files onto and off of my Sandbox for development. There are some other things you can try, and rebooting everything certainly helps.