Member since: 11-16-2015
Posts: 905
Kudos Received: 665
Solutions: 249
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 435 | 09-30-2025 05:23 AM |
|  | 768 | 06-26-2025 01:21 PM |
|  | 665 | 06-19-2025 02:48 PM |
|  | 857 | 05-30-2025 01:53 PM |
|  | 11400 | 02-22-2024 12:38 PM |
12-08-2016
04:11 PM
1 Kudo
Your original URL "jdbc://hive2://localhost:10000/default" has slashes between jdbc: and hive2; it should instead be jdbc:hive2://localhost:10000/default. The ZooKeeper version of the URL hits a known issue (NIFI-2575), so I would recommend correcting the original URL and using that (port 10000 should already be opened/forwarded on the sandbox).
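For reference, here is a minimal Groovy sketch using the corrected URL; it assumes the Hive JDBC driver is on the classpath and uses placeholder credentials ("hive" with an empty password), so adjust those for your environment:

```groovy
import java.sql.DriverManager

// Corrected URL: no slashes between jdbc: and hive2
def url = 'jdbc:hive2://localhost:10000/default'

// Assumes the Hive JDBC driver jar is on the classpath
Class.forName('org.apache.hive.jdbc.HiveDriver')

// Placeholder credentials -- adjust for your environment
def conn = DriverManager.getConnection(url, 'hive', '')
def rs = conn.createStatement().executeQuery('SHOW TABLES')
while (rs.next()) {
    println rs.getString(1)
}
conn.close()
```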
12-07-2016
09:56 PM
There is a table of JDBC type values here; for your data types they are as follows:

| Type | Value |
|---|---|
| INTEGER | 4 |
| TIMESTAMP | 93 |
| DOUBLE | 8 |
| LONGVARCHAR | -1 |
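Those numeric codes come straight from the java.sql.Types constants, so you can always double-check them with a quick Groovy snippet:

```groovy
import java.sql.Types

// The values in the table above are the java.sql.Types constants
println Types.INTEGER      // 4
println Types.TIMESTAMP    // 93
println Types.DOUBLE       // 8
println Types.LONGVARCHAR  // -1
```

If you are setting these for PutSQL, they are the integers that go into the sql.args.N.type attributes.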
12-06-2016
09:31 PM
Each processor is responsible for reading and writing whichever attributes it wants to for the purposes of its processing, and those attributes are listed in each processor's documentation. SplitJson, for example, writes the following attributes to each output flow file:

| Name | Description |
|---|---|
| fragment.identifier | All split FlowFiles produced from the same parent FlowFile will have the same randomly generated UUID added for this attribute |
| fragment.index | A one-up number that indicates the ordering of the split FlowFiles that were created from a single parent FlowFile |
| fragment.count | The number of split FlowFiles generated from the parent FlowFile |
| segment.original.filename | The filename of the parent FlowFile |

These were added in NiFi 1.0.0 (HDF 2.0) under NIFI-2632, so if you are using a version of NiFi/HDF before that, that's why you won't see these attributes populated by SplitJson.
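Downstream, those attributes are readable like any other flow file attribute, for example via Expression Language (${fragment.index}) or from a script. Here is a minimal ExecuteScript (Groovy) sketch that just logs them, assuming the standard session/log/REL_SUCCESS bindings:

```groovy
// Log the split-related attributes written by SplitJson, then pass the flow file along
def flowFile = session.get()
if (flowFile != null) {
    def index    = flowFile.getAttribute('fragment.index')
    def count    = flowFile.getAttribute('fragment.count')
    def original = flowFile.getAttribute('segment.original.filename')
    log.info("Fragment ${index} of ${count} split from ${original}".toString())
    session.transfer(flowFile, REL_SUCCESS)
}
```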
12-05-2016
02:26 PM
@Arsalan Siddiqi PutElasticsearch and FetchElasticsearch support Elasticsearch 2.x clusters, but it looks like you are trying to use ES 5.x. NiFi 1.1.0 and HDF 2.1.0 will have Elasticsearch 5 processors (PutElasticsearch5 and FetchElasticsearch5). In the meantime, the Http processors should connect to both ES 2.x and 5.x clusters.
12-05-2016
02:23 PM
1 Kudo
Both PutElasticsearch and FetchElasticsearch use the transport client. There are Http versions of both (PutElasticsearchHttp and FetchElasticsearchHttp) that use the REST API.
12-02-2016
06:57 PM
What error(s) are you seeing? If the error mentions Avro and your column names are in Chinese, it is likely that Avro does not accept them. This may be alleviated in NiFi 1.1.0 by NIFI-2262, but that change just replaces non-Avro-compatible characters with underscores, so you may then hit a "duplicate field" exception. In that case you would need column aliases in your SELECT statement so the columns use Avro-compatible names.
11-30-2016
12:56 PM
2 Kudos
Depending on how fast the flow files are coming through, using the timestamp might result in the same filename as well. You could use ${uuid}, which is the UUID of the flow file (guaranteed to be unique), or ${nextInt()}, which returns an auto-incrementing value.
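For example, you could use UpdateAttribute to set the filename attribute to something like ${uuid}.csv (the .csv extension is just for illustration). If you happen to be renaming from ExecuteScript instead, the same idea as a Groovy sketch, assuming the standard session/REL_SUCCESS bindings:

```groovy
// Give each flow file a unique filename based on its flow file UUID
def flowFile = session.get()
if (flowFile != null) {
    def unique = flowFile.getAttribute('uuid')   // every flow file carries a unique 'uuid' attribute
    flowFile = session.putAttribute(flowFile, 'filename', "${unique}.csv".toString())
    session.transfer(flowFile, REL_SUCCESS)
}
```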
11-28-2016
01:54 PM
2 Kudos
To follow up on your first question above: Groovy has many features and supports multiple paradigms, such as object-oriented programming (like Java), functional programming, etc. It can also be used as a scripting language, which is why you don't often see explicitly defined classes and methods in the example scripts (both in @Artem Ervits' answer and the examples on my blog). For ExecuteScript, the Script Body is treated as a script, so it can be evaluated without needing to define a top-level class; under the hood, Groovy wraps the script in a Script object (so that it obeys the JVM rules).

For your second question, you can add logic inside the inputStream.eachLine closure to do any transformations. So instead of just writing out a[1] a[2] after you've tokenized the input stream, you can do additional things such as:

if (a[2].startsWith('3300')) a[2] = a[2].replaceFirst('33', '22')
if (a[2].startsWith('0')) a[2] = a[2].replaceFirst('0', '00212')

Then you can output the space-separated columns.
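Putting those pieces together, here is a fuller sketch of what the ExecuteScript (Groovy) body could look like; the comma delimiter and the a[1]/a[2] column positions are assumptions carried over from the thread, and session, REL_SUCCESS, and StreamCallback are the standard ExecuteScript bindings/classes:

```groovy
import org.apache.nifi.processor.io.StreamCallback

def flowFile = session.get()
if (flowFile != null) {
    flowFile = session.write(flowFile, { inputStream, outputStream ->
        inputStream.eachLine { line ->
            def a = line.tokenize(',')   // assumed comma-delimited input
            if (a.size() > 2) {
                // The transformations discussed above
                if (a[2].startsWith('3300')) a[2] = a[2].replaceFirst('33', '22')
                if (a[2].startsWith('0'))    a[2] = a[2].replaceFirst('0', '00212')
                // Write out the space-separated columns
                outputStream.write("${a[1]} ${a[2]}\n".toString().getBytes('UTF-8'))
            }
        }
    } as StreamCallback)
    session.transfer(flowFile, REL_SUCCESS)
}
```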
11-22-2016
06:18 PM
Yes, PutSQL accepts any SQL statement that does not return a result set (except callable statements such as stored procedures), so DDL/DML commands like LOAD INTO, CREATE TABLE, etc. are supported.
11-21-2016
07:15 PM
2 Kudos
This is a bug; I have filed it as NIFI-3076. Workarounds for the time being might include changing the domain_id column to be signed (which is probably not desired), or using a cast() function to convert it to a data type (e.g., long) that is handled better. If you use cast() or another function, you may want a column alias to ensure the column/field name is the one you want.