Member since: 07-19-2018
Posts: 613
Kudos Received: 101
Solutions: 117

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 4901 | 01-11-2021 05:54 AM
 | 3337 | 01-11-2021 05:52 AM
 | 8642 | 01-08-2021 05:23 AM
 | 8157 | 01-04-2021 04:08 AM
 | 36034 | 12-18-2020 05:42 AM
07-16-2021
07:12 AM
Hi @stevenmatison, I don't see any data in the Cassandra database. I don't see any errors either. Here are the screenshots.
04-14-2021
10:35 AM
We had the hanging concurrent tasks problem running NiFi 1.11.4. Upgrading to 1.13.2 resolved it for us.
04-08-2021
04:25 AM
Hi, we need Ambari 2.7. Is it still publicly accessible? Please mention where to get the repository. Thanks, Narendra
01-19-2021
04:30 AM
@singyik Yes, I believe that is the last free public repo. Who knows how long it will remain available. If you are using it, I would recommend making a full copy and working from the copy.
01-11-2021
05:54 AM
You most likely have the reader configured incorrectly for your CSV schema.
01-11-2021
05:52 AM
2 Kudos
@Lallagreta You should be able to define the filename, or change the filename to whatever you want. That said, the filename doesn't dictate the type, so you can have Parquet saved as .txt.

One recommendation I have is to use the parquet command-line tools while testing your use case. This is the best way to validate that the files look right, have the right schema, and contain the right results: https://pypi.org/project/parquet-tools/ I apologize that I do not have any exact samples, but from what I recall from a year ago, you should be able to find a simple command to check the schema of a file and another command to show the data (a rough sketch is at the end of this reply). You may have to copy your HDFS files to the local file system to inspect them from the command line.

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

Thanks,
Steven
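A rough sketch of those two checks, using pyarrow instead of the parquet-tools CLI (this assumes pyarrow is installed locally, and the file path is only an example of a file copied down from HDFS):

```python
# Minimal sketch: inspect a Parquet file's schema and peek at the data with pyarrow.
# Assumes `pip install pyarrow`; the path is a hypothetical local copy of an HDFS file.
import pyarrow.parquet as pq

path = "/tmp/sample_copied_from_hdfs.parquet"  # example path only

# Print the schema (column names and types) the writer actually produced.
print(pq.read_schema(path))

# Read the table and show the first few rows to confirm the data looks right.
table = pq.read_table(path)
print(table.slice(0, 5).to_pydict())
```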
01-08-2021
05:23 AM
2 Kudos
@murali2425 The solution you are looking for is QueryRecord configured with a CSV record reader and record writer. You also have UpdateRecord and ConvertRecord, which can use the same readers/writers. This method is preferred over splitting the file and adds some nice functionality: it allows you to provide a schema for both the inbound CSV (reader) and the downstream CSV (writer).

Using QueryRecord you should be able to split the file and set the filename attribute to the value of column1. At the end of the flow you can leverage that filename attribute to resave the new file (a rough sketch of the idea is at the end of this reply). You can find specific examples and configuration screenshots here: https://community.cloudera.com/t5/Community-Articles/Running-SQL-on-FlowFiles-using-QueryRecord-Processor-Apache/ta-p/246671

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

Thanks,
Steven
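To illustrate the split-by-column1 idea outside NiFi, here is a rough Python sketch; the column name and file paths are only examples, and in the actual flow the QueryRecord processor and the CSV reader/writer do this work:

```python
# Rough illustration (outside NiFi) of what the record-based split achieves:
# group CSV rows by the value in column1 and write one output file per value.
# Column name and paths are hypothetical.
import csv
from collections import defaultdict

groups = defaultdict(list)
with open("input.csv", newline="") as f:
    reader = csv.DictReader(f)      # the header row plays the role of the reader schema
    fieldnames = reader.fieldnames
    for row in reader:
        groups[row["column1"]].append(row)

for key, rows in groups.items():
    # The group key becomes the output filename, mirroring the
    # "filename = column1" attribute idea in the NiFi flow.
    with open(f"{key}.csv", "w", newline="") as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```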
01-05-2021
01:34 PM
ARRAYs are a bit tricky, but the JSON reader and writer may handle them better.
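For example (field names made up), an array value that is awkward to represent in a flat CSV column round-trips cleanly through JSON, which is why the JSON reader/writer pair tends to cope with it better:

```python
# Tiny illustration of why arrays map more naturally to JSON than to flat CSV.
# The record layout here is hypothetical.
import json

record = {"id": 1, "name": "sensor-a", "readings": [21.5, 22.0, 22.4]}

# JSON keeps the array as a real array; a flat CSV row would need it
# flattened into extra columns or serialized into a single string field.
text = json.dumps(record)
parsed = json.loads(text)
print(parsed["readings"][0])  # 21.5
```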
01-05-2021
10:49 AM
@kiranps11 Did you add and start a "DistributedMapCacheServer" controller service running on port 4557? The "DistributedMapCacheClientService" controller service only creates a client that connects to a server you must also create.

Keep in mind that the DistributedMapCacheServer does not offer High Availability (HA). Enabling this controller service will start a DistributedMapCacheServer on each node in your NiFi cluster, but those servers do not talk to each other. This is important to understand since you have configured your DMC client to use localhost, which means each node in your cluster would be using its own DMC server rather than a single shared DMC server. For an HA solution you should use an external map cache via one of the other client offerings, such as "HBase_2_ClientMapCacheService" or "RedisDistributedMapCacheClientService", but this requires you to set up that external HBase or Redis server with HA yourself.

Hope this helps, Matt
01-04-2021
05:07 AM
@schnell Glad you were able to find the remnant that blocked the re-install. Here is my Stack Overflow reply, which gives some details about how to completely remove HDP and its components from a node's filesystem.

With Ambari, any service that is deleted through the UI will still exist on the original node(s) the service was installed on. You need to manually remove it from those node(s). This process is hard to find documentation on, but it basically goes as follows:

- Delete the application from filesystem locations such as /etc/, /var/, /opt/, etc.
- Remove the associated user accounts/groups.

You can find more details in the blog post below, which goes into completely removing HDP; just follow the steps for a single service (a rough cleanup sketch is also at the end of this reply).

https://henning.kropponline.de/2016/04/24/completely-uninstall-remove-hdp-nodes/
https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.3.2/bk_installing_manually_book/content/ch_uninstalling_hdp_chapter.html
https://gist.github.com/hourback/085500397bb2588964c5

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

Thanks,
Steven
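If it helps to script that cleanup, here is a rough sketch; every path and account name below is only an example, so verify what actually exists on your nodes before deleting anything:

```python
# Rough sketch of the manual cleanup described above: remove leftover service
# directories and accounts from a node after deleting the service in Ambari.
# All paths and user names are examples only -- verify against your own
# installation before running anything destructive.
import shutil
import subprocess
from pathlib import Path

leftover_dirs = [
    "/etc/hadoop",           # example config directory
    "/var/log/hadoop",       # example log directory
    "/var/lib/hadoop-hdfs",  # example data/state directory
]
leftover_users = ["hdfs", "yarn"]  # example service accounts

for d in leftover_dirs:
    p = Path(d)
    if p.exists():
        print(f"removing {p}")
        shutil.rmtree(p)

for user in leftover_users:
    # userdel -r also removes the account's home directory.
    subprocess.run(["userdel", "-r", user], check=False)
```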