Member since: 12-09-2017
Posts: 10
Kudos Received: 2
Solutions: 0
02-26-2018
08:17 AM
Still looking for some help on this. I need the ScrollElasticsearchHttp processor to run once a day and fetch more than 10,000 records.
02-07-2018
05:31 PM
Hi @Shu, thanks for the response. I tried increasing the back pressure threshold on the success queue; however, I still get just one FlowFile of 10,000 records. This seems to happen because the ScrollElasticsearchHttp processor reads the Elasticsearch index in batches of 10,000 records, so it reads only one batch each time the cron schedule triggers it.
02-07-2018
12:47 PM
I am currently using the ScrollElasticsearchHttp processor in NiFi to read the previous day's data based on a timestamp, and I want to schedule this activity once a day. The problem is that when I set a cron interval, the processor fetches data only once, i.e. 10,000 records (the batch limit), and does not continue fetching the remaining records that match the input query.
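For context on why one cron trigger yields one batch: Elasticsearch's scroll API returns at most `size` hits per call and hands back a scroll_id that must be replayed to get the next batch. Below is a minimal Scala sketch of that loop; the local cluster URL, the index name myindex, the timestamp field, and the Jackson dependency are all assumptions for illustration, not details from this flow.
import java.net.{HttpURLConnection, URL}
import scala.io.Source
import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper}

val mapper = new ObjectMapper()

// POST a JSON body and parse the JSON response.
def post(url: String, body: String): JsonNode = {
  val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
  conn.setRequestMethod("POST")
  conn.setRequestProperty("Content-Type", "application/json")
  conn.setDoOutput(true)
  conn.getOutputStream.write(body.getBytes("UTF-8"))
  val resp = Source.fromInputStream(conn.getInputStream, "UTF-8").mkString
  conn.disconnect()
  mapper.readTree(resp)
}

// First call: open a scroll window and get the first batch (up to 10,000 hits)
// for the previous day. Cluster URL and index name are hypothetical.
var page = post("http://localhost:9200/myindex/_search?scroll=1m",
  """{"size":10000,"query":{"range":{"timestamp":{"gte":"now-1d/d","lt":"now/d"}}}}""")

// Subsequent calls replay the scroll_id until a batch comes back empty;
// a single scheduled onTrigger that runs only the first call explains
// why exactly one 10,000-record FlowFile appears per day.
while (page.path("hits").path("hits").size() > 0) {
  // ... handle one batch of records here ...
  page = post("http://localhost:9200/_search/scroll",
    s"""{"scroll":"1m","scroll_id":"${page.path("_scroll_id").asText()}"}""")
}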
Labels:
- Apache NiFi
12-13-2017
01:30 PM
@bkosaraju I did try without the Kryo serializer. The issue is that I am able to load the Oracle table, but the moment I use show or count, it goes into a never-ending run.
12-12-2017
08:57 PM
I am trying to load an Oracle DB table into Spark through Zeppelin. I am using the code below to load the table:
val df = spark.read.format("jdbc")
.option("url","jdbc:oracle:thin:****/***@hostip/appname")
.option("driver", "oracle.jdbc.OracleDriver")
.option("dbtable", "schema.tablename")
.load()
The above code executes successfully in Zeppelin, displaying the message:
df: org.apache.spark.sql.DataFrame = [fieldname1: int, fieldname2: string ... 92 more fields]
However, when I try to materialize the above df through df.count or df.collect, I get the error below. I get the same error if I do:
val df_count = spark.sql("select count (*) from count")
df_count.show
org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.util.NoSuchElementException: key not found: -1024
java.util.NoSuchElementException: key not found: -1024
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:59)
at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
I am using the code below to set up the Spark session:
import org.apache.spark.sql._
import org.apache.spark.sql.SparkSession
val spark = SparkSession
  .builder
  .appName("test_app")
  .master("local[*]")
  .config("spark.kryo.registrator", "org.bdgenomics.adam.serialization.ADAMKryoRegistrator")
  .getOrCreate()
import spark.implicits._ // implicits must come from the session value just created
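For reference, spark.kryo.registrator is only consulted when Kryo is actually the active serializer, so the two settings are normally paired. Note also that in Zeppelin a SparkSession already exists and getOrCreate() returns it, so SparkContext-level options passed to the builder at that point may not take effect (Spark logs a warning about this); such options generally belong in the interpreter settings. A minimal sketch of the paired configuration, offered as a possibility rather than a confirmed fix for this error:
import org.apache.spark.sql.SparkSession

// Sketch only: pairs the ADAM registrator with the Kryo serializer it depends on.
val sparkKryo = SparkSession
  .builder
  .appName("test_app")
  .master("local[*]")
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("spark.kryo.registrator", "org.bdgenomics.adam.serialization.ADAMKryoRegistrator")
  .getOrCreate()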
Labels:
- Apache Spark
- Apache Zeppelin
12-11-2017
08:06 PM
1 Kudo
I am not able to move beyond step 6. After creating the remote process group, I am getting the error: 'http://127.0.0.1:8080/nifi' does not have any input ports.
12-09-2017
05:13 PM
1 Kudo
@shu Thanks. It worked.
12-09-2017
03:16 PM
Sample Data:
1,Michael,Jackson
2,Jim,Morrisson
3,John,Lennon
4,Freddie,Mercury
5,Elton,John
(refer image CSVtoJSON; JSON created successfully)
Result after ConvertRecord (CSVtoJSON):
[ {
"id" : 1,
"firstName" : "Michael",
"lastName" : "Jackson"
}, {
"id" : 2,
"firstName" : "Jim",
"lastName" : "Morrisson"
}, {
"id" : 3,
"firstName" : "John",
"lastName" : "Lennon"
}, {
"id" : 4,
"firstName" : "Freddie",
"lastName" : "Mercury"
}, {
"id" : 5,
"firstName" : "Elton",
"lastName" : "John"
} ]
Applying SplitJson with JsonPath expression $.* to the ConvertRecord output creates 10,000 splits in the queue (refer image SplitJSON10000splits). I need to split the JSON array into individual JSON records and apply some transformations to these records.
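For illustration, here is a rough Scala equivalent of what SplitJson does with $.*; the Jackson dependency (com.fasterxml.jackson.databind) is an assumption for this sketch, not something the flow itself requires:
import com.fasterxml.jackson.databind.ObjectMapper

val mapper = new ObjectMapper()

// Two records from the sample array, for brevity.
val json = """[{"id":1,"firstName":"Michael","lastName":"Jackson"},
              |{"id":2,"firstName":"Jim","lastName":"Morrisson"}]""".stripMargin

// JsonPath $.* selects every element of the top-level array; SplitJson
// emits one FlowFile per element, hence one split per record in the queue.
val it = mapper.readTree(json).elements()
while (it.hasNext) println(mapper.writeValueAsString(it.next()))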