Member since: 10-17-2016
Posts: 93
Kudos Received: 10
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
| 4885 | 09-28-2017 04:38 PM
| 7335 | 08-24-2017 06:12 PM
| 1900 | 07-03-2017 12:20 PM
06-08-2017
03:35 PM
Hi George, I am not using the sandbox; I have a standalone installation of Spark and NiFi on my PC. I am using Apache NiFi 1.2.0 and have followed the entire tutorial. I get an error on import org.apache.nifi.events._:

<console>:38: error: object events is not a member of package org.apache.nifi
import org.apache.nifi.events._

I have included all the relevant jars that you mentioned: nifi-site-to-site-client-1.2.0.jar, nifi-spark-receiver-1.2.0.jar, nifi-api-1.2.0.jar, nifi-utils-1.2.0.jar, nifi-client-dto-1.2.0.jar. I opened all the jars and, sure enough, there is no org.apache.nifi.events package in any of them. How can I find this missing import?

I also tried to run the code in IntelliJ; there I don't get any errors, but I do get the following warning:

17/06/08 18:16:14 INFO ReceiverSupervisorImpl: Stopping receiver with message: Registered unsuccessfully because Driver refused to start receiver 0

I copied the following code into IntelliJ (I commented out the last line):

// Import all the libraries required
import org.apache.nifi._
import java.nio.charset._
import org.apache.nifi.spark._
import org.apache.nifi.remote.client._
import org.apache.spark._
import org.apache.nifi.events._
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._
import org.apache.nifi.remote._
import org.apache.nifi.remote.client._
import org.apache.nifi.remote.protocol._
import org.apache.spark.storage._
import org.apache.spark.streaming.receiver._
import java.io._
import org.apache.spark.serializer._
object SparkNiFiAttribute {
def main(args: Array[String]) {
/*
import java.util
val additionalJars = new util.ArrayList[String]
additionalJars.add("/home/arsalan/NiFiSparkJars/nifi-site-to-site-1.2.0.jar")
*/
val config = new SparkConf().setAppName("Nifi_Spark_Data")
// .set("spark.driver.extraClassPath","/home/arsalan/NiFiSparkJars/nifi-site-to-site-client-1.2.0.jar:/home/arsalan/NiFiSparkJars/nifi-spark-receiver-1.2.0.jar:/home/arsalan/nifi-1.2.0/lib/nifi-api-1.2.0.jar:/home/arsalan/nifi-1.2.0/lib/bootstrap/nifi-utils-1.2.0.jar:/home/arsalan/nifi-1.2.0/work/nar/framework/nifi-framework-nar-1.2.0.nar-unpacked/META-INF/bundled-dependencies/nifi-client-dto-1.2.0.jar")
.set("spark.driver.allowMultipleContexts", "true")
.setMaster("local[*]")
// Build a Site-to-site client config with NiFi web url and output port name[spark created in step 6c]
val conf = new SiteToSiteClient.Builder().url("http://localhost:8080/nifi").portName("Data_to_Spark").buildConfig()
// Create a StreamingContext
val ssc = new StreamingContext(config, Seconds(1))
ssc.sparkContext.getConf.getAll.foreach(println)
// Create a DStream using a NiFi receiver so that we can pull data from specified Port
val lines = ssc.receiverStream(new NiFiReceiver(conf, StorageLevel.MEMORY_ONLY))
// Map the data from NiFi to text, ignoring the attributes
val text = lines.map(dataPacket => new String(dataPacket.getContent, StandardCharsets.UTF_8))
// Print the first ten elements of each RDD generated
text.print()
// Start the computation and wait for it to terminate
ssc.start()
ssc.awaitTermination()
}
}
//SparkNiFiAttribute.main(Array())
06-05-2017
09:16 AM
Thank you for your reply. I see that you are using version 1.3.0, which I do not have. I tried to import the template but I get an error saying the UpdateRecord processor is not known. Is the nar file available?
06-04-2017
04:50 PM
Hi, I have multiple CSV files where each file contains the values of one attribute over time. There are 60 files in total (60 different attributes). These are basically Spark's metric dump. The file name is the name of the application followed by the attribute name. For the example below, the application name is local-1495979652246 and the attribute for the first file is BlockManager.disk.diskSpaceUsed_MB:

local-1495979652246.driver.BlockManager.disk.diskSpaceUsed_MB.csv
local-1495979652246.driver.BlockManager.memory.maxMem_MB.csv
local-1495979652246.driver.BlockManager.memory.memUsed_MB.csv

Each file contains values like:

t value
1496588167 0.003329809088456
1496588168 0.00428465362778284

The file name specifies the name of the attribute. The first thing I need to do is rename the CSV header field called value to the attribute name taken from the file name:

t BlockManager.disk.diskSpaceUsed_MB
1496588167 0.003329809088456

The next step would be to merge all files for the same application on the value of the field t, so that eventually I have one CSV file per application containing the values of all the attributes, like:

t BlockManager.disk.diskSpaceUsed_MB BlockManager.memory.maxMem_MB BlockManager.memory.memUsed_MB more attributes...
1496588167 0.003329809088456 some value some value some value
1496588168 0.00428465362778284 some value some value ...

Any suggestions?
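For reference, here is a minimal sketch of one possible Spark approach to this merge, assuming Spark 2.x, comma-separated files with a header row, and a hypothetical local directory metrics/ that holds the CSV files (none of these names come from the original post):

// Sketch only: read each metric CSV, rename the "value" column to the attribute
// name encoded in the file name, then join everything on the timestamp column "t".
import java.io.File
import org.apache.spark.sql.{DataFrame, SparkSession}

object MergeMetricCsvFiles {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("Merge_Spark_Metric_CSVs")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical input directory containing files such as
    // local-1495979652246.driver.BlockManager.disk.diskSpaceUsed_MB.csv
    val files = new File("metrics").listFiles().filter(_.getName.endsWith(".csv"))

    // For each file: read it, then rename the generic "value" column to the
    // attribute name, i.e. everything after ".driver." in the file name.
    val frames: Seq[DataFrame] = files.toSeq.map { f =>
      val attribute = f.getName.stripSuffix(".csv").split("\\.driver\\.").last
      spark.read.option("header", "true").csv(f.getPath)
        .withColumnRenamed("value", attribute)
    }

    // Join all attribute frames on "t" to get one wide table per application.
    val merged = frames.reduce((a, b) => a.join(b, Seq("t"), "outer"))
    merged.write.option("header", "true").csv("merged_metrics")

    spark.stop()
  }
}

An outer join keeps a row even when some attributes are missing a sample for a given timestamp; an inner join would drop those rows instead.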
Labels:
- Apache NiFi
05-14-2017
06:55 PM
Hi, I do know there are a number of threads posted about how to run a Spark job from NiFi, but most of them explain a setup on HDP. I am using Windows and have Spark and NiFi installed locally. Can anyone explain how to configure the ExecuteProcess processor to run the following command (which works when I run it on the command line)?

spark-submit2.cmd --class "SimpleApp" --master local[4] file:///C:/Simple_Project/target/scala-2.10/simple-project_2.10-1.0.jar
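For reference, one possible (untested) ExecuteProcess configuration on Windows might look like the following sketch; the C:\spark\bin path is an assumption and should point at the local Spark installation:

Command:            cmd.exe
Command Arguments:  /c C:\spark\bin\spark-submit2.cmd --class "SimpleApp" --master local[4] file:///C:/Simple_Project/target/scala-2.10/simple-project_2.10-1.0.jar
Working Directory:  C:\spark\bin

Wrapping the call in cmd.exe /c is one way to make sure the .cmd script is run by the Windows shell rather than invoked directly.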
Labels:
- Apache NiFi
- Apache Spark
01-16-2017
12:14 PM
2 Kudos
Hi,

1. Firstly, where can I find out what features will be added in the next NiFi version relating to data provenance?
2. What are the current limitations/bottlenecks when it comes to provenance?
3. What about provenance beyond NiFi? After making some transformations I send the data to Hadoop for a MapReduce job, and the processed data is later ingested back in. As far as I know, at the moment this will be treated as a new ingestion (a RECEIVE event) from NiFi's perspective. Is there a way to know that the data I am ingesting is the same as the data that was sent out?
4. Are there any other limitations or problems for which a solution may be required? Also, how secure is the provenance information? I can see the files inside the provenance_repository folder; can these files simply be modified to compromise the provenance information?
5. Is it possible to add additional information to the existing provenance information? Is it extendable?

I intend to research data provenance at my university. This could include a performance improvement, security, a possible solution for a limitation, or a desired feature. Any help is highly appreciated. You can also point me to any link which can help me better understand the workings of the current system. Regards, Arsalan
Labels:
- Apache NiFi
12-16-2016
10:40 AM
Where can I see the files on the sandbox which I put using the PutHDFS processor? I understand they may not be directly visible in the file system (by directly logging into the terminal on the sandbox) as they are on the DataNode, but how can I see the files that are present on the DataNode? The Ambari File View displays the files on the DataNode, I guess. I can also navigate to view the files at sandbox.hortonworks.com:50070/explorer.html#/. I have also added a GNOME GUI to the sandbox. The files that I see in "Computer" are different from the ones I see when I do a "dir" in the terminal, which is a bit confusing. Lastly, I put a file using PutHDFS into a directory called NiFi; where can I see this? I don't see it on the DataNode either.

01 image: shows the files when connected to the sandbox via PuTTY
02 image: shows the "Computer" files in the sandbox
03 image: shows the files by navigating to HDFS > Quick Links > NameNode UI > Utilities > Browse the file system (same as the File View in Ambari); I guess this shows the NameNode and not the DataNode
04 image: PutHDFS configuration which puts the file to the NiFi directory; a GetHDFS also fetched the file

Lastly, how do I get a file from my local PC into NiFi, which is now running inside the sandbox?
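For reference, a minimal sketch of checking the HDFS contents from the sandbox terminal, assuming the PutHDFS Directory property was set to /NiFi (the actual path depends on the processor configuration):

hdfs dfs -ls /
hdfs dfs -ls /NiFi
hdfs dfs -cat /NiFi/<file name>

These commands list paths in HDFS, which is separate from the local Linux file system you see with "dir" in the terminal or in the GNOME file browser.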
12-16-2016
09:07 AM
Thanks again. As you can see in the pictures below, I used PuTTY and then SSH via the sandbox terminal, and I got the file transfer part working using PSCP.
12-15-2016
12:07 PM
@Sonu Sahi @Matt Foley Thank you for your detailed reply. Unfortunately, the SSH connection to transfer the file does not work and I keep getting a connection refused error. I have tried everything mentioned here (SCP Conn Refused) but still no luck. The last thing I would like to clarify is why, when I connect via PuTTY to root@127.0.0.1 on port 2222 in Windows and run a "dir" command, I see different folders than when I run "dir" in the VirtualBox terminal.
12-14-2016
12:49 PM
Now that I am using NiFi inside HDP, and say I am using the GetHDFS processor: what should I specify as the path to the HDFS configuration files, and how? Where is the core-site.xml file located?
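For reference, a minimal sketch of the relevant GetHDFS setting, assuming the standard HDP configuration directory inside the sandbox (the exact location can vary between installations):

Hadoop Configuration Resources: /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml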