Member since: 07-19-2018
Posts: 613
Kudos Received: 99
Solutions: 116
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 1575 | 01-11-2021 05:54 AM |
|  | 1007 | 01-11-2021 05:52 AM |
|  | 1971 | 01-08-2021 05:23 AM |
|  | 2277 | 01-04-2021 04:08 AM |
|  | 9742 | 12-18-2020 05:42 AM |
06-15-2022
01:23 AM
I have used InvokeHTTP in NiFi 1.12 without an SSL certificate and it works fine; however, the new 1.16 version doesn't. Is there any setting available to ignore the certificate? For example, I connect to SQL Server over JDBC with the property trustServerCertificate=true, after which it works without certificates.
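For reference, the JDBC setting I'm comparing against looks something like this (host, port, and database name are placeholders):

# Hypothetical SQL Server JDBC URL; trustServerCertificate=true tells the driver
# to accept the server's certificate without validating the trust chain
jdbc:sqlserver://dbhost:1433;databaseName=mydb;encrypt=true;trustServerCertificate=true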
... View more
05-31-2022
05:29 AM
Came here via Google; just noting this for other people. NiFi has supported multipart form data with InvokeHTTP for a few releases now: https://palindromicity.blogspot.com/2020/04/sending-multipart-form-data-with.html
... View more
03-02-2022
12:37 PM
Apparently not. The old CDH model seems to be gone with the introduction of CDP, which appears to use a purely subscription-based model (i.e. without a freely available distribution co-existing alongside it, as was the case with CDH). Of course, most components in CDP are still open source. The question concerns CDP as a whole (like CDH before it), not individual components.
... View more
02-15-2022
01:00 AM
Hi Andrew, thanks for the tutorial. It was very useful. It's strange that the official nifi-registry documentation doesn't cover any of this.

Anyway, I feel I should add a missing step to the instructions. Chances are, "CN=localhost,OU=nifi" is not going to be a valid identity in real-world scenarios. To figure out what the user name should actually be, check nifi-app.log, where you will see a message like this:

org.apache.nifi.registry.client.NiFiRegistryException: Error retrieving flow snapshot: Unknown user with identity 'CN=host1, OU=ACME, O=ACME, L=Canberra, ST=Australian Capital Territory, C=AU'. Contact the system administrator.

Simply use that identity instead of "CN=localhost,OU=nifi".
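If the log is long, a quick way to pull out the identity NiFi Registry is complaining about (the log path is an assumption; adjust it to your installation):

# Hypothetical log location; adjust to your NiFi installation directory
grep "Unknown user with identity" /opt/nifi/logs/nifi-app.log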
... View more
01-28-2022
12:54 AM
Hello, I'm a learner and I would like to use the method you mentioned here to collect logs on a remote server and send them to NiFi. Please, can you walk me through it? I have been struggling with how to build an MSI before the real implementation. Thank you so much.
... View more
01-04-2022
09:13 AM
Hi to all, I'm having the same issue, even after following the procedure. What I did:

1) Get the SSL certificate from AWS using openssl:
openssl s_client -showcerts -connect <source>:443 </dev/null | openssl x509 -text | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
2) Copy the result to a *.crt file.
3) Convert the file to DER:
openssl x509 -in aws_cert.crt -inform PEM -out aws_cert.der -outform DER
4) Create the JKS file using keytool:
keytool -import -trustcacerts -alias aws3buckets -file aws_cert.der -keystore truststore-amazon.jks
5) Change the permissions so the file is accessible from NiFi.
6) Add the file in the StandardSSLContextService and set the password.

I still receive the same SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Do you have any clue how to solve this? Can I use the .crt file directly with keytool? Is there a particular version of keytool we need to use? Thanks
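Two quick checks that may help anyone debugging the same thing (the file names follow the steps above; the rest is an assumption on my part): first confirm what actually landed in the truststore, then confirm the dump captured the whole chain, since importing only the leaf certificate is a common cause of this exact PKIX error.

# List the entries in the truststore (alias, issuer, validity)
keytool -list -v -keystore truststore-amazon.jks
# Count how many certificates were captured; S3 endpoints present a chain,
# and the intermediate/root certificates need to be trusted too
grep -c 'BEGIN CERTIFICATE' aws_cert.crt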
... View more
11-22-2021
08:54 AM
@Yemre If you have a support contract with Cloudera, I'd recommend opening a support case to assist with your issue here. Possible causes:

1. An unsuccessful mutual TLS handshake with NiFi-Registry from the NiFi hosts, resulting in the NiFi node connection being only a one-way TLS connection and the node being treated as an "anonymous" user. Anonymous users cannot proxy user requests and cannot see anything except public buckets.
- Caused by a missing complete trust chain on one or both sides of the connection. The truststore in NiFi-Registry must contain the complete trust chain for the NiFi hosts' keystore PrivateKeyEntry.
- Caused by the PrivateKeyEntry not meeting minimum requirements (missing SAN with the NiFi hostname, missing EKU of clientAuth, and/or using wildcards are the most common).

2. NiFi-Registry is configured with an identity mapping pattern in the nifi-registry.properties file that matches the DN from the NiFi client certificate presented in the mutual TLS handshake. The identity mapping value and transform are then applied, which alters the actual client string that must then be authorized for the Proxy and bucket policies (a rough sketch follows at the end of this post).

Hope this helps you,

Matt
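To illustrate the second cause, identity mapping in nifi-registry.properties looks roughly like this (the pattern and value below are examples, not your actual configuration); when a pattern matches the DN from the client certificate, the transformed value is the string that must be authorized for the Proxy and bucket policies:

# Example identity mapping (hypothetical values) in nifi-registry.properties
nifi.registry.security.identity.mapping.pattern.dn=^CN=(.*?), OU=(.*?)$
nifi.registry.security.identity.mapping.value.dn=$1
nifi.registry.security.identity.mapping.transform.dn=NONE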
... View more
09-30-2021
10:56 PM
1 Kudo
Yes Sumita, it worked.
... View more
09-17-2021
03:19 AM
I tried to install elasticsearch-6.4.2 on my cluster (HDP 3.1, Ambari 2.7.3). The installation completed successfully, but it could not start, and this error was encountered: Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/3.1/services/ELASTICSEARCH/package/scripts/es_master.py", line 168, in <module>
ESMaster().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
method(env)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 1011, in restart
self.start(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/3.1/services/ELASTICSEARCH/package/scripts/es_master.py", line 153, in start
self.configure(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/3.1/services/ELASTICSEARCH/package/scripts/es_master.py", line 86, in configure
group=params.es_group
File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
self.env.run()
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 123, in action_create
content = self._get_content()
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 160, in _get_content
return content()
File "/usr/lib/ambari-agent/lib/resource_management/core/source.py", line 52, in __call__
return self.get_content()
File "/usr/lib/ambari-agent/lib/resource_management/core/source.py", line 144, in get_content
rendered = self.template.render(self.context)
File "/usr/lib/ambari-agent/lib/ambari_jinja2/environment.py", line 891, in render
return self.environment.handle_exception(exc_info, True)
File "/var/lib/ambari-agent/cache/stacks/HDP/3.1/services/ELASTICSEARCH/package/templates/elasticsearch.master.yml.j2", line 93, in top-level template code
action.destructive_requires_name: {{action_destructive_requires_name}}
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/config_dictionary.py", line 73, in __getattr__
raise Fail("Configuration parameter '" + self.name + "' was not found in configurations dictionary!")
resource_management.core.exceptions.Fail: Configuration parameter 'hostname' was not found in configurations dictionary!

I modified the discovery.zen.ping.unicast.hosts property in elastic-config.xml and the hostname property in elasticsearch-env.xml; however, it still could not start and the same error was encountered. Do you have any idea?
... View more
07-21-2021
11:27 PM
@Kasun, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
... View more
07-16-2021
07:12 AM
Hi @stevenmatison, I don't see any data in the Cassandra database. I don't see any errors either. Here are the screenshots.
... View more
07-16-2021
06:03 AM
For only "name", it's working fine; I get the following FlowFile content: I guess the issue is within the Controller Services. I used JsonTreeReader, but with the default configuration:
... View more
04-14-2021
10:35 AM
We had the hanging concurrent tasks problem running NiFi 1.11.4. Upgrading to 1.13.2 resolved it for us.
... View more
04-08-2021
04:25 AM
Hi, we need Ambari 2.7. Is it still publicly accessible? Please mention where to get the repository. Thanks, Narendra
... View more
01-26-2021
06:06 AM
I had the same issue. Did you ever manage to resolve this?
... View more
01-19-2021
04:30 AM
@singyik Yes, I believe that is the last free public repo. Who knows how long it will remain available. If you are using it, I would recommend fully copying it and working from the copy.
... View more
01-11-2021
05:54 AM
You must have the reader incorrectly configured for your CSV schema.
... View more
01-11-2021
05:52 AM
2 Kudos
@Lallagreta You should be able to define the filename, or change the filename to what you want. That said, the filename doesn't dictate the type, so you can have Parquet saved as .txt.

One recommendation I have is to use the parquet command line tools while testing your use case. This is the best way to validate that files look right and have the right schema and results: https://pypi.org/project/parquet-tools/ I apologize that I do not have any exact samples, but from my recollection of a year ago, you should be able to get a simple command to check the schema of a file, and another command to show the data results. You may have to copy your HDFS file to the local file system to inspect it from the command line; a rough sketch is at the end of this post.

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create a separate topic and feel free to tag me in your post.

Thanks,

Steven
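The kind of command-line check referred to above, roughly (the file path is a placeholder, and the exact subcommands depend on which parquet-tools build and version you install):

# Copy a file out of HDFS to inspect it locally (path is an example)
hdfs dfs -get /warehouse/mytable/part-00000.parquet .
# Print the file's schema and metadata
parquet-tools inspect part-00000.parquet
# Print the actual row data
parquet-tools show part-00000.parquet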
... View more
01-08-2021
05:23 AM
2 Kudos
@murali2425 The solution you are looking for is QueryRecord configured with a CSV Record Reader and Record Writer. You also have UpdateRecord and ConvertRecord, which can use the same Readers/Writers. This method is preferred over splitting the file and adds some nice functionality.

This method allows you to provide a schema for both the inbound CSV (reader) and the downstream CSV (writer). Using QueryRecord you should be able to split the file and set a filename attribute based on column1; a rough sketch is at the end of this post. At the end of the flow you should be able to leverage that filename attribute to re-save the new file.

You can find some specific examples and configuration screenshots here: https://community.cloudera.com/t5/Community-Articles/Running-SQL-on-FlowFiles-using-QueryRecord-Processor-Apache/ta-p/246671

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create a separate topic and feel free to tag me in your post.

Thanks,

Steven
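A minimal sketch of a QueryRecord dynamic property for this, assuming the first column is literally named column1 (the property name, column name, and value are placeholders for your real schema); each dynamic property you add becomes its own outbound relationship carrying the matching rows:

-- QueryRecord dynamic property, e.g. named "orders" (hypothetical)
SELECT * FROM FLOWFILE WHERE column1 = 'orders'

Downstream, an UpdateAttribute processor on each route can then set the filename attribute you later use when re-saving the file.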
... View more
01-05-2021
01:34 PM
ARRAYs are a bit tricky, but the JSON Reader and Writer may work better.
... View more
01-05-2021
10:49 AM
@kiranps11 Did you add and start a "DistributedMapCacheServer" controller service running on port 4557? The "DistributedMapCacheClientService" controller service only creates a client that is used to connect to a server you must also create.

Keep in mind that the DistributedMapCacheServer does not offer High Availability (HA). Enabling this controller service will start a DistributedMapCacheServer on each node in your NiFi cluster, but those servers do not talk to each other. This is important to understand since you have configured your DMC client to use localhost, which means each node in your cluster would be using its own DMC server rather than a single shared DMC server.

For an HA solution you should use an external map cache via one of the other client offerings, such as "HBase_2_ClientMapCacheService" or "RedisDistributedMapCacheClientService", but this would require you to set up that external HBase or Redis server with HA yourself.

Hope this helps,

Matt
... View more
01-04-2021
05:07 AM
@schnell Glad you were able to find the remnant that blocked re-installation. Here is my SO reply, which gives some details about how to completely remove HDP and its components from a node's filesystem.

With Ambari, any service that is deleted via the UI will still exist on the original node(s) the service was installed on. You would need to manually remove it from the node(s). This process is hard to find documentation on, but basically goes as follows:

1. Delete the application from file system locations such as /etc, /var, /opt, etc. (see the sketch at the end of this post).
2. Remove user accounts/groups.

You can find some more details in the blog post below, which goes into detail on completely removing HDP. Just follow the steps for a single service.

https://henning.kropponline.de/2016/04/24/completely-uninstall-remove-hdp-nodes/
https://docs.cloudera.com/HDPDocuments/HDP2/HDP-2.3.2/bk_installing_manually_book/content/ch_uninstalling_hdp_chapter.html
https://gist.github.com/hourback/085500397bb2588964c5

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create a separate topic and feel free to tag me in your post.

Thanks,

Steven
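The sketch referenced in step 1, for illustration only (the service, paths, and user below are hypothetical examples, and these commands are destructive, so verify every path against your actual install first):

# Example cleanup for a single service's leftovers, e.g. ZooKeeper (paths are examples)
rm -rf /etc/zookeeper /var/log/zookeeper /var/run/zookeeper /usr/hdp/current/zookeeper-*
# Remove the service's local user and group if nothing else uses them
userdel zookeeper
groupdel zookeeper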
... View more
12-22-2020
11:42 AM
Amazing work here sir!
... View more
12-21-2020
09:01 AM
We have some background on schema evolution in Parquet in the docs: https://docs.cloudera.com/runtime/7.2.2/impala-reference/topics/impala-parquet.html. See "Schema Evolution for Parquet Tables". Some of the details are specific to Impala, but the concepts are the same across engines that use Parquet tables, including Hive and Spark.

At a high level, you can think of the data files as being immutable while the table schema evolves. If you add a new column at the end of the table, for example, that updates the table schema but leaves the Parquet files unchanged. When the table is queried, the table schema and the Parquet file schema are reconciled, and the new column's values will all be NULL. If you want to modify the existing rows and include new non-NULL values, that would require rewriting the data, e.g. with an INSERT OVERWRITE statement for a partition or a CREATE TABLE ... AS SELECT to create an entirely new table.

Keep in mind that traditional Parquet tables are not optimized for workloads with updates; Apache Kudu in particular, and also transactional tables in Hive 3+, have support for row-level updates that is more convenient and efficient. We definitely don't require rewriting the whole table every time you want to add a column; that would be impractical for large tables!
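To make the add-a-column case concrete, a minimal sketch in Impala SQL (the table, columns, partition, and the sales_staging source holding the corrected values are all made up for illustration):

-- Adding a column is a metadata-only change; existing Parquet files are untouched
ALTER TABLE sales ADD COLUMNS (discount DECIMAL(5,2));
-- Existing rows now read the new column as NULL. Backfilling real values means
-- rewriting data, e.g. one partition at a time from a staging source:
INSERT OVERWRITE sales PARTITION (year=2020)
SELECT id, amount, discount FROM sales_staging WHERE year=2020;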
... View more
12-18-2020
05:42 AM
1 Kudo
Excellent news. Can you accept the first 2 responses to close this solution?
... View more
12-18-2020
05:40 AM
1 Kudo
@hakansan The error is stating that your hard drive is full:

could not write to file "pg_logical/replorigin_checkpoint.tmp": No space left on device

"No space left on device" means the solution is to investigate cleaning out some files to free up space, expanding the disk, etc.

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create a separate topic and feel free to tag me in your post.

Thanks,

Steven
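To track down what is filling the disk, something along these lines usually helps (the /var path is just an example starting point):

# Show how full each filesystem is
df -h
# Find the largest directories under the affected mount (example path)
du -xsh /var/* 2>/dev/null | sort -rh | head -20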
... View more
12-15-2020
05:06 AM
@jainN Great looking flow. The modification you need is simply to remove the json route that is combined with csv, and connect the json route from Notify to FetchFile. You may need to adjust the Wait/Notify so that the csv is released when you want. Wait/Notify is often tricky, so I would recommend experimenting with it until you understand its behavior. Here is a good article: https://community.cloudera.com/t5/Community-Articles/Trigger-based-Serial-Data-processing-in-NiFi-using-Wait-and/ta-p/248308 You may find other articles/posts here if you do some deeper research on Wait/Notify.

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create a separate topic and feel free to tag me in your post.

Thanks,

Steven
... View more
12-14-2020
01:12 PM
@jainN If you are looking to route flowfiles whose names end in json versus those that do not, check out RouteOnAttribute with a property similar to json => ${filename:endsWith('.json')}. You would use this after your method of choice for listing/fetching the files, which provides a filename attribute for every flowfile.

With this json property added to RouteOnAttribute, you can drag the json route to a triggering flow and send everything else (not json: unmatched) to a holding flow. NiFi Wait/Notify should be able to provide the trigger logic, but there are many other ways to do it without Wait/Notify by using another datastore, a map cache, etc. For example, your non-json flow could simply write to a new location and finish; then your json flow can process that new location some known amount of time later. The logic there is your use case, of course; the point is to use RouteOnAttribute to split your flow.

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create a separate topic and feel free to tag me in your post.

Thanks,

Steven
... View more
12-14-2020
07:05 AM
@toutou From your HDFS cluster you need hdfs-site.xml and the correct configuration for PutHDFS. You may also need to create a user with permissions on the HDFS location. Please share your PutHDFS processor configuration and the error information to allow community members to respond with the specific feedback required to solve your issue.

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic please comment here or feel free to private message me. If you have new questions related to your Use Case please create a separate topic and feel free to tag me in your post.

Thanks,

Steven
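For the permissions piece, a minimal sketch of what I mean (the target path and user are placeholders for whatever your flow and cluster actually use):

# Create the target directory and grant the NiFi service user access (example path/user)
hdfs dfs -mkdir -p /data/landing
hdfs dfs -chown nifi:nifi /data/landing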
... View more
12-13-2020
10:05 PM
Yes, that's the correct answer and it works. But do we have any other workaround, since we have disabled exec for security reasons? How can we achieve this?
... View more