Member since: 02-16-2016
Posts: 176
Kudos Received: 197
Solutions: 17

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3073 | 11-18-2016 08:48 PM |
| | 5472 | 08-23-2016 04:13 PM |
| | 1501 | 03-26-2016 12:01 PM |
| | 1432 | 03-15-2016 12:12 AM |
| | 15353 | 03-14-2016 10:54 PM |
11-18-2016
08:48 PM
I finally figured it out. The NiFi node was unable to talk to the cluster because of an entry in the hosts file: it was resolving to the partial hostname instead of the fully qualified domain name.
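For anyone hitting the same issue, the fix is to make sure the node's address resolves to the FQDN. A sketch of a working hosts entry (IP and hostnames are hypothetical examples):

```
# /etc/hosts -- list the FQDN before the short alias so lookups
# return the fully qualified name (values below are examples)
10.0.0.11   nifi-node1.example.com   nifi-node1
```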
11-17-2016
08:44 PM
I have an HDF 2.0 node that was running in standalone mode. When I convert it to cluster mode by changing the nifi.properties file and restarting HDF 2.0, I get the following error message: "Cluster is still in the process of voting on the appropriate Data Flow." I have removed the existing flow.xml.gz from the conf directory and am working with an empty flow file.
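For context, the standalone-to-cluster conversion touches the clustering section of nifi.properties. A minimal sketch of the relevant settings on an HDF 2.0 (NiFi 1.x) node, with hypothetical hosts and ports:

```
# nifi.properties -- clustering settings (host/port values are examples)
nifi.cluster.is.node=true
nifi.cluster.node.address=nifi-node1.example.com
nifi.cluster.node.protocol.port=9088
nifi.zookeeper.connect.string=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
# Flow election: how long / how many nodes to wait for before
# electing a flow -- relevant to the "voting" message above
nifi.cluster.flow.election.max.wait.time=5 mins
nifi.cluster.flow.election.max.candidates=1
```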
Labels:
- Apache NiFi
- Cloudera DataFlow (CDF)
08-23-2016
07:00 PM
1 Kudo
@Adi Jabkowsky You are trying to convert Avro data directly to text. You need to first convert the Avro to a text format and extract the values before using the ReplaceText processor. You can use a processor pipeline like this:
1. ConvertAvroToJSON to convert the Avro output to JSON.
2. SplitJson to handle multiple records in the output.
3. EvaluateJsonPath to pull individual columns from the output into flowfile attributes.
4. ReplaceText to generate an INSERT statement, followed by the PutHiveQL processor.
The other option is to generate the output in CSV format and then use a regular expression to read the column values.
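As a concrete sketch of steps 3 and 4, EvaluateJsonPath writes attributes that ReplaceText can reference via NiFi Expression Language. Table and column names here are hypothetical:

```
# EvaluateJsonPath (Destination = flowfile-attribute), example dynamic properties:
#   col1 = $.col1
#   col2 = $.col2
#
# ReplaceText (Replacement Strategy = Always Replace), Replacement Value:
#   INSERT INTO my_table VALUES ('${col1}', '${col2}')
```

The resulting flowfile content is a complete HiveQL statement that PutHiveQL can execute as-is.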
08-23-2016
04:13 PM
1 Kudo
@Jay See I could get it working without the SSLContext Service. Please see attached. Can you post a screenshot of your InvokeHTTP processor configuration?
07-19-2016
02:32 PM
1 Kudo
@BigDataRocks @mclark You will need to plan your NiFi production cluster based on your volume requirements.
- If you are just looking to transfer a large volume of data from a source to a sink, ensure you have enough space available for the content repository. Also ensure that the content repository is set up on a separate disk from the flowfile and provenance repositories.
- From a productionizing perspective, it is important to build error handling into your flows, so your teams are notified of errors and the errors are written to the logs.
- It is probably better to run multiple instances of NiFi for data from multiple sources, because NiFi does not currently offer per-flow security. In the current security model, a flow administrator has access to all flows running on an instance; running multiple instances lets you control security for each flow separately.
- If you decide to use one instance of NiFi, you can use Process Groups to organize your dataflows.
- You should also set up monitoring reporting tasks for disk usage and memory that warn you at appropriate thresholds.
- For dataflows with significant processing requirements, you will need a cluster to distribute load across nodes. You can also increase the number of concurrent tasks for any processor that requires more processing power.
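The separate-disk recommendation above maps directly to the repository path properties in nifi.properties. A sketch with hypothetical mount points:

```
# nifi.properties -- put each repository on its own disk (paths are examples)
nifi.flowfile.repository.directory=/disk1/nifi/flowfile_repository
nifi.content.repository.directory.default=/disk2/nifi/content_repository
nifi.provenance.repository.directory.default=/disk3/nifi/provenance_repository
```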
07-19-2016
01:51 PM
@Sreekanth Munigati , As @Bryan Bende mentioned, there is no direct way of manipulating the Avro data, but in your case you can try modifying the SQL executed by the ExecuteSQL processor to add the additional column in the SQL itself.
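For example, the query configured in ExecuteSQL's SQL select query property could project the extra column directly, so it shows up as a field in the generated Avro records (table and column names below are hypothetical):

```
-- Add a constant or derived column in the query itself
SELECT t.*, 'web_orders' AS source_system
FROM orders t
```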
06-22-2016
02:34 AM
Thank you @Bryan Bende for creating the JIRA for this issue. Are there any workarounds until it gets resolved in a new release? The only thing I can think of is creating a Hive table with the appropriate column types, then writing a SELECT query to cast to the correct data types and insert into the new table. But that requires a post-process step and breaks real-time ingestion.
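A sketch of that workaround in HiveQL, with hypothetical table and column names:

```
-- Staging table holds the string-typed Avro output; the typed table
-- is populated by casting in a separate post-process step
CREATE TABLE orders_typed (id BIGINT, amount DECIMAL(10,2));

INSERT INTO TABLE orders_typed
SELECT CAST(id AS BIGINT), CAST(amount AS DECIMAL(10,2))
FROM orders_staging;
```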
06-17-2016
04:03 PM
@Bryan Bende Thanks Bryan. I double-checked all the details. My Oracle table has NUMBER columns. Here are the settings I am using to establish the DBCPConnectionPool, but in my Avro data all the NUMBER columns come through as strings. Is there any way to see the details of how NiFi generates the Avro field formats? oracleconnection.jpg
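For reference (the attached screenshot is not reproduced here), the DBCPConnectionPool settings in question were of roughly this shape; the host, SID, and jar path below are hypothetical:

```
Database Connection URL:    jdbc:oracle:thin:@db1.example.com:1521:ORCL
Database Driver Class Name: oracle.jdbc.OracleDriver
# plus the path to the Oracle JDBC driver jar, e.g. /opt/oracle/ojdbc7.jar
```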
06-16-2016
07:09 PM
The ExecuteSQL processor currently generates data in Avro format. When fetching data from Oracle, the generated Avro converts every data type to a string. Is there a way to ensure that it retains the data type of the original column, or an equivalent type?
Labels:
- Apache NiFi
04-25-2016
08:33 PM
15 Kudos
Easily convert any XML document to JSON format using the TransformXML processor. Save the following stylesheet in a file and configure a TransformXML processor to use it; it will convert any XML document to JSON.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="/">{
<xsl:apply-templates select="*"/>}
</xsl:template>
<!-- Object or Element Property-->
<xsl:template match="*">
"<xsl:value-of select="name()"/>" : <xsl:call-template name="Properties"/>
</xsl:template>
<!-- Array Element -->
<xsl:template match="*" mode="ArrayElement">
<xsl:call-template name="Properties"/>
</xsl:template>
<!-- Object Properties -->
<xsl:template name="Properties">
<xsl:variable name="childName" select="name(*[1])"/>
<xsl:choose>
<xsl:when test="not(*|@*)">"<xsl:value-of select="."/>"</xsl:when>
<xsl:when test="count(*[name()=$childName]) > 1">{ "<xsl:value-of select="$childName"/>" :[<xsl:apply-templates select="*"
mode="ArrayElement"/>] }</xsl:when>
<xsl:otherwise>{
<xsl:apply-templates select="@*"/>
<xsl:apply-templates select="*"/>
}</xsl:otherwise>
</xsl:choose>
<xsl:if test="following-sibling::*">,</xsl:if>
</xsl:template>
<!-- Attribute Property -->
<xsl:template match="@*">"<xsl:value-of select="name()"/>" : "<xsl:value-of select="."/>",
</xsl:template>
</xsl:stylesheet>

That's it!
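As a usage sketch, feeding a small hypothetical document through the stylesheet:

```
<!-- Example input flowfile content -->
<person>
  <name>John</name>
  <age>30</age>
</person>
```

produces, whitespace aside, `{ "person" : { "name" : "John", "age" : "30" } }`. Note that the templates quote every leaf value, so numbers come through as JSON strings.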