Member since: 01-30-2017
Posts: 49
Kudos Received: 3
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 3642 | 02-23-2017 07:54 AM |
02-02-2022
08:11 AM
Sorry, is there a way to disable the polling option? If I want to load only some files, for example 3 files and only those, and then load more on a later rerun of the job, how can I disable the polling function so that the processor is not always waiting or listening? Regards, Daniele
04-07-2017
02:06 PM
@Pradhuman Gupta Apache Spark cannot do that out of the box, but what you might be looking for is some middleware that can interface with Apache Spark and submit, run, and manage jobs for you:
- Livy - a REST server with extensive language support (Python, R, Scala), the ability to maintain interactive sessions, and object sharing.
- spark-jobserver - a simple Spark-as-a-Service which supports object sharing using so-called named objects. JVM only.
- Mist - a service for exposing Spark analytical jobs and machine learning models as realtime, batch or reactive web services.
- Apache Toree - IPython protocol based middleware for interactive applications.
Hortonworks recommends Livy. Also, read the last comment at https://issues.apache.org/jira/browse/SPARK-2243: allowMultipleContexts is intended for tests only and can lead to the error you see.
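As a rough sketch (not from the original answer), submitting a pre-built Spark application through Livy's /batches REST endpoint from plain Java could look like the following; the host, port, jar path and class name are placeholder assumptions for your environment:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class LivyBatchSubmit {
    public static void main(String[] args) throws Exception {
        // Placeholder Livy endpoint; adjust host and port (8998 is the Livy default) for your cluster
        URL url = new URL("http://livy-host:8998/batches");
        // Placeholder jar location and main class of the Spark job to run
        String payload = "{\"file\": \"hdfs:///apps/my-spark-app.jar\", "
                       + "\"className\": \"com.example.MySparkJob\"}";

        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);

        // Send the batch definition; Livy replies with the batch id and state as JSON
        try (OutputStream os = conn.getOutputStream()) {
            os.write(payload.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("Livy responded with HTTP " + conn.getResponseCode());
        // The batch can then be tracked with GET /batches/{id} until it reports success or dead
    }
}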
04-06-2017
11:52 PM
2 Kudos
A simple Spark 1 Java application that shows the list of tables in the Hive Metastore is as follows:

import org.apache.spark.SparkContext;
import org.apache.spark.SparkConf;
import org.apache.spark.sql.hive.HiveContext;
import org.apache.spark.sql.DataFrame;

public class SparkHiveExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("SparkHive Example");
        SparkContext sc = new SparkContext(conf);
        HiveContext hiveContext = new HiveContext(sc);
        DataFrame df = hiveContext.sql("show tables");
        df.show();
    }
}

Note that Spark pulls metadata from the Hive metastore and uses HiveQL for parsing queries, but query execution itself happens in the Spark execution engine.
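As a follow-up sketch (not part of the original answer), the Spark 2 equivalent drops HiveContext in favour of a SparkSession built with Hive support; in the Java API, DataFrame becomes Dataset<Row>:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkSessionHiveExample {
    public static void main(String[] args) {
        // enableHiveSupport() connects the session to the Hive metastore
        // (expects hive-site.xml on the classpath, e.g. when launched via spark-submit)
        SparkSession spark = SparkSession.builder()
                .appName("SparkHive Example")
                .enableHiveSupport()
                .getOrCreate();

        // "show tables" returns a Dataset<Row>, the Java equivalent of the old DataFrame
        Dataset<Row> df = spark.sql("show tables");
        df.show();

        spark.stop();
    }
}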
03-30-2017
12:51 AM
2 Kudos
Similar to https://community.hortonworks.com/questions/19897/apache-spark-error-unable-to-load-native-hadoop-li.html
03-03-2017
02:54 PM
@Matt Clarke Hi Matt, thank you, I understand it now; I will go through the documentation to learn more about this. Thanks
02-24-2017
01:51 PM
Can you take a thread dump and provide the output here?
./bin/nifi.sh dump /path/to/output/dump.txt
02-24-2017
11:36 AM
Thanks Arun, you are right. I did it using TailFile. @Arun A K
02-22-2017
01:36 PM
2 Kudos
@Pradhuman Gupta Which "Protocol" are you using in your PutSplunk processor? There is no assurance of delivery if you are using UDP; with the TCP protocol there is confirmed delivery. You could use NiFi's provenance to track the FlowFiles processed by the PutSplunk processor. This will allow you to get the details on FlowFiles that have "SEND" provenance events associated with them. Thanks, Matt
02-23-2017
07:54 AM
Thanks @Bryan Bende @Timothy Spann @ozhurakousky for your replies. It was a configuration issue. While trying to put files into Splunk, I was using the web port of Splunk (8081 in my case) in the PutSplunk configuration. When I pointed my PutSplunk configuration to the TCP port of Splunk (in Splunk settings, go to Data Inputs -> click on TCP and enter the details as instructed to create a new TCP input port for Splunk), it started working properly.