Member since: 07-07-2020
Posts: 15
Kudos Received: 3
Solutions: 4
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3434 | 12-14-2020 06:42 AM
 | 1190 | 11-06-2020 04:57 AM
 | 11107 | 08-31-2020 05:14 AM
 | 3185 | 07-20-2020 11:27 PM
02-12-2021
10:09 AM
Hi @kc251997 Cloudera recently changed the download policy. Yes, you must have a valid subscription in order to download the software. Please see the announcement here: https://www.cloudera.com/downloads/paywall-expansion.html where the sentence most pertinent to your question ("I only want CDH, Hive, Impala Free versions for testing") is as follows:

"Effective January 31, 2021, all Cloudera software will require a valid subscription and only be accessible via the paywall. This includes all legacy versions for Cloudera Distribution including Apache Hadoop (CDH), Hortonworks Data Platform (HDP), Data Flow (HDF/CDF), and Cloudera Data Science Workbench (CDSW)."

Toward the end of that announcement you will see a section titled Frequently Asked Questions, which explains the credentials and how to properly obtain them if you don't have them already.
02-11-2021
11:22 PM
Druid is not part of the CDH 6.2.1 packaging. You have to install it manually, or through Cloudera Manager by using Cloudera Manager Extensions (creating your own parcel and CSD).
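For reference, a custom parcel is just an archive with a conventional layout; the sketch below uses made-up names and a made-up version, not an actual Druid parcel:

```
# DRUID-0.12.0-el7.parcel (a tar.gz archive, names here are illustrative)
DRUID-0.12.0/              # top-level directory: <PRODUCT>-<version>
├── meta/
│   └── parcel.json        # parcel metadata read by Cloudera Manager
└── lib/druid/             # the actual Druid binaries and scripts
```

The CSD is a separate jar containing a service descriptor (service.sdl) that tells Cloudera Manager how to start, stop, and configure the service.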
02-11-2021
11:10 PM
Since you are using Ambari, you can try the Rebalance HDFS action, or run the Hadoop Balancer tool directly.
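If you go the manual route, a minimal example of the balancer from any node with an HDFS client (the 10% threshold is just an illustrative value, tune it for your cluster):

```
# Run as the hdfs superuser; moves blocks around until every DataNode's
# utilization is within 10 percentage points of the cluster average
sudo -u hdfs hdfs balancer -threshold 10
```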
02-11-2021
10:54 PM
Yes, you can set up a local mirror for offline installation. It is possible with the CDP versions as well as the older CDH 5 and 6 versions. But even these legacy versions are behind the paywall now.
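As a rough sketch of the idea (host, port, and paths below are illustrative, and it assumes you have already downloaded the parcels and their manifest.json with valid paywall credentials):

```
# Put the downloaded parcels, .sha files, and manifest.json in one directory
mkdir -p /var/www/parcels
cp CDH-*.parcel* manifest.json /var/www/parcels/

# Serve it over HTTP; any web server will do, e.g. Python's built-in one
cd /var/www/parcels && python3 -m http.server 8900
```

Then add http://your-mirror-host:8900/ to the Remote Parcel Repository URLs setting in Cloudera Manager.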
12-15-2020
09:55 AM
Hi @Kezia, thank you so much for the solution. It works on the CDH 6.3.1 version as well. Kudos.
12-14-2020
07:05 AM
@toutou From your HDFS cluster you need hdfs-site.xml and a correct configuration for PutHDFS. You may also need to create a user with permissions on the HDFS location (see the sketch at the end of this post). Please share your PutHDFS processor configuration and the error information to allow community members to respond with the specific feedback required to solve your issue. If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post. Thanks, Steven
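P.S. On the permissions point, here is a minimal sketch, assuming the NiFi service runs as a user named nifi and the target path is /data/landing (both are made-up examples, adjust to your environment):

```
# Create the target directory in HDFS and hand ownership to the NiFi user
# ('nifi' and '/data/landing' are hypothetical values)
sudo -u hdfs hdfs dfs -mkdir -p /data/landing
sudo -u hdfs hdfs dfs -chown nifi:nifi /data/landing
```

In PutHDFS, the Hadoop Configuration Resources property should then point at local copies of core-site.xml and hdfs-site.xml taken from that cluster.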
11-18-2020
02:44 AM
Thank you for the reply @Kezia. I was able to filter the duplicates using the DetectDuplicate processor. This is the error I'm getting when the GetSFTP processor is scheduled on the primary node:

GetFTP[id=xxxx] Unable to fetch listing from remote server due to java.net.ConnectException: Connection timed out (Connection timed out): Connection timed out (Connection timed out)
11-06-2020
04:57 AM
Hello, I'm not sure I understand your problem, but I'll try to answer "Files ingested into HDFS but I don't see files in HDFS".

When a processor handles a FlowFile, it routes the FlowFile to a relationship according to what it managed to do with it. The PutHDFS processor has two relationships: success and failure. If the processor managed to put the FlowFile into HDFS, the FlowFile is routed to the success relationship and you can continue to process it, unless you automatically terminate the success relationship, in which case NiFi won't do anything more with the FlowFile. The same goes for the failure relationship.

Here you automatically terminate both the success and failure relationships. For the success relationship this is fine, because your file was put into HDFS. But for the failure relationship, the file was not put into HDFS and you don't handle the failure. Just route the failure relationship to any other processor; you don't even need to use it at this point. You just want to see whether FlowFiles are routed to the failure relationship, and thus whether the queue corresponding to this relationship fills up. You can also spot a failure when a red error box appears in the top right corner of the processor, but the previous method helps to understand the concept of relationship routing.

If FlowFiles do end up in the failure relationship, they are not being put into HDFS, probably because of a bad configuration of the PutHDFS processor. In that case, I'll be happy to help you with this configuration (Properties tab).
09-17-2020
11:54 PM
1 Kudo
Hello, one way to do it is to use the ReplaceText processor to add a line break after each object, so that each object in your FlowFile ends up on its own line. To do so, just replace "}{" with "}" followed by a line break and "{". Note that you have to escape the braces in your Search Value: \}\{ and that you have to use Shift+Enter in your Replacement Value to add the line break:

}
{

Then just use the SplitText processor with Line Split Count set to 1 to split your input FlowFile into one FlowFile per line. Hope it helps 🙂
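If you want to sanity-check the idea outside NiFi first, the same transformation can be sketched on the command line (sample input is made up; \n in the replacement requires GNU sed):

```
# Insert a line break between adjacent JSON objects, giving one object per line
echo '{"a":1}{"b":2}{"c":3}' | sed 's/}{/}\n{/g'
```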
09-15-2020
09:51 AM
1 Kudo
Actually, both replies can be considered valid. I accepted the one that better fits my use case.