Member since: 07-29-2019
Posts: 640
Kudos Received: 114
Solutions: 48
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 14415 | 12-01-2022 05:40 PM |
| | 3290 | 11-24-2022 08:44 AM |
| | 4949 | 11-12-2022 12:38 PM |
| | 1788 | 10-10-2022 06:58 AM |
| | 2576 | 09-11-2022 05:43 PM |
06-17-2022 05:15 AM
@Uday_Singh2022 There won't be any documentation available from Cloudera regarding Flume because that lack of availability is part of what "not a supported component" means. If you must use Flume, then the documentation on the Apache site that @mszurap pointed you to above is your only option.
06-15-2022 01:31 AM
@Techie123
I strongly recommend that you read over this thread:
Problem login for first time in Nifi
…and then, if you have tried all the troubleshooting steps described there without success, post again in this thread and provide a link to the installation instructions you followed and the specific version of NiFi you installed.
05-24-2022 10:19 AM
Previously asked and answered in this thread:
log4j2 vulnerability (CVE-2021-44228)
05-17-2022 02:43 PM
@joshtheflame
I just wanted to provide a bit more context. The partial screenshot you've included above appears to show Cloudera Manager running against a CDH 6.1.0 cluster. CDH 6.1.0 was released in December of 2018, which, as you no doubt are aware, was quite a while ago, especially in "internet time". CDH 6.1.x has since reached its End of Support (EoS): Cloudera Enterprise 6.x reached EoS in 2021, and you can find the most recent official reminder of that announcement here:
March 2021 Customer Advisory - 2: End of Support for Cloudera Products (CDH/CM 6.x & HDP 3.x).
Cloudera's lifecycle support policies are documented here:
https://www.cloudera.com/legal/policies/support-lifecycle-policy.html
My understanding is that organizations with a valid Cloudera subscription for legacy products such as CDH would have been sent this announcement directly.
If that screenshot represents what your bank is running in production, I would recommend that you reach out to your Cloudera Account team and discuss your upgrade options as soon as possible. You are going to have to upgrade in order to take advantage of any of the offerings mentioned in @steven-matison 's reply earlier.
05-06-2022 12:55 AM
Hi @sparkdeveloper
You are encountering the well-known phenomenon of Java class shadowing, and it really doesn't have all that much to do with Spark or the spark-submit utility. One direct solution is a technique called "shading" the conflicting dependency, which usually winds up being a lot of work, and I personally avoid doing it. If you want to go that route, you can read all about the technique on any number of websites that cover Java software development, or in books on the same topic.
Obviously, I don't know what the code that uses this library does, but if you're looking for a quick resolution, I think you are better off rewriting the code that depends on version 3.2.0 of the library to use the APIs in the version that comes pre-packaged with your Spark installation, and simply not adding your .jar file to the CLASSPATH. You'll probably need to download the HikariCP-2.5.1.jar file and install it where you do your development work. I realize that is a lazy kind of answer, but you should consider it unless you have a truly compelling need for the later version.
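For completeness, "shading" in practice means relocating the conflicting packages inside your jar at build time. A minimal sketch, assuming a Maven build (the `myapp.shaded` prefix is an arbitrary choice for illustration, not anything from this thread):

```xml
<!-- pom.xml fragment: relocate HikariCP's packages so your 3.2.0 copy
     cannot be shadowed by the 2.5.1 jar bundled with the cluster's Spark -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.4.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.zaxxer.hikari</pattern>
            <shadedPattern>myapp.shaded.com.zaxxer.hikari</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After shading, the bytecode in your jar references `myapp.shaded.com.zaxxer.hikari.*` classes, so the cluster's older HikariCP on the classpath never gets picked up by your code.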
05-06-2022 12:22 AM
@aval
I don't personally have the required expertise (yet) to answer your question, but I did want to attempt to clarify the question for other community members who do.
In paragraph 2, you write:
We want to setup the data experiences so we can move some work between the cdp public cloud and the cdp private cloud experiences since the control planes are similar.
I think, based on the rest of your question, that you intended to write "…so we can move some work between cdp private cloud base and the cdp private cloud experiences since the control planes are similar." It's important to be clear about this because Cloudera does have a product called CDP Public Cloud, but that "form factor" of CDP only runs on infrastructure provided by the so-called hyperscalers, or CSPs, such as AWS, Azure, and GCP. The on-premises offering is called CDP Private Cloud. Please update this thread with a new, clarifying post if you really do want to know how to move workloads between CDP Public Cloud and CDP Private Cloud Experiences clusters.
04-28-2022 12:38 PM · 1 Kudo
Hi @azg ,
Do you know what version of NiFi is included in the docker image you're running?
And when you say "the connection works well on localhost", can you expand on what exactly you did to confirm that the connection "works"?
04-26-2022 01:59 PM
Hi @EmanuelArano
Keeping abreast of the fast-moving, ever-updating requirements for supported operating systems, database management systems, Java Development Kits, and processor architectures compatible with Cloudera Data Platform (CDP) is quite a challenge, and I doubt any member of the Cloudera Community keeps track of all of that in their head. Luckily, you don't have to.
I recommend you refer to the section subheaded CDP Private Cloud Base Supported Operating Systems
…in the documentation for the specific release you're interested in (you didn't say which version of CDP Private Cloud Base you want to install, and as of this writing there are eight different "point releases" of CDP Private Cloud Base 7.1). As a point of reference, versions 7.1.2 and 7.1.3 were released in the Fall of 2020, so hopefully you are not attempting to install 7.1.0 at this point.
In particular, that section features a hyperlink to the very handy Cloudera Support Matrix, where you can click on a product (in your case, Cloudera Manager and CDP Private Cloud Base) to see all the product versions it supports. To narrow your search for supported combinations, click again on the supported product versions highlighted in green, then scroll down to see the supported operating systems (along with databases and JDKs).
Should you find, after consulting that documentation, that "redhat 8.5" is not supported, and you have full backups available, the best approach is to restore your system to the state it was in prior to the yum update and proceed with your installation on 8.4. If for some reason you don't have full backups, you might explore using the yum history command to roll back that last OS update.
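For the rollback route, the general shape of a yum history session is sketched below; the transaction ID shown (42) is a placeholder, and you should review the `yum history info` output carefully before undoing anything on a production host:

```shell
# List recent transactions to find the ID of the update that moved you to 8.5
yum history list

# Inspect exactly which packages that transaction touched (42 is a placeholder ID)
yum history info 42

# Roll that single transaction back
yum history undo 42
```

Note that undoing a large OS update this way is best-effort: if any of the older package versions are no longer in your configured repositories, the undo will fail, which is why a full backup remains the safer option.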
04-21-2022 09:53 AM
@san_re What documentation are you following for what you're attempting here? You're much better off following a specific set of instructions from the site where you downloaded MySQL and/or NiFi.
For NiFi, the canonical instructions can be found here:
NiFi System Administrator's Guide
04-20-2022 03:55 PM
Hi @Data1701
According to the API documentation, one can get a java.net.URISyntaxException when a passed string could not be parsed as a URI reference.
The file you are attempting to read might very well be available on your local area network from a shared server drive, but it isn't addressable by a valid URI; or at the very least, the URI you are referencing in your Spark code isn't a valid, accessible URI.
What your problem boils down to is that the file isn't available via a web server, so the server running your Spark code can't retrieve it when your code executes. That should also shed light on why you previously had to upload your .csv files into CDSW: that was the way to ensure they could be found at runtime, in a well-known, accessible location.
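To make the "valid URI" point concrete, here is a small sketch outside of Spark, using only Python's standard library; the paths and hostname are made up for illustration. It shows how a plain local path becomes a syntactically valid URI, and how a fully qualified URL decomposes into the parts a URI parser expects:

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# A raw shared-drive path like "\\fileserver\share\data file.csv" is not a URI:
# it has no scheme, and characters such as backslashes and spaces must be
# percent-encoded. That is exactly the kind of string java.net.URI rejects.

# Building a valid file:// URI from an absolute local path (note the %20):
uri = PurePosixPath("/tmp/data folder/file.csv").as_uri()
print(uri)  # file:///tmp/data%20folder/file.csv

# A fully qualified URL parses cleanly into scheme, host, and path components:
parts = urlparse("https://web.dept.yourcompany.com/Data1701/project/data_folder/file.csv")
print(parts.scheme, parts.netloc, parts.path)
```

If the string you pass to Spark can't survive this kind of decomposition, the URISyntaxException at runtime is expected.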
There are several valid approaches to addressing this, but the easiest solution, if you want to continue to use the code snippet you've written and shared here, is to place the file on some server that is accessible over the web (preferably via HTTPS) and refer to it using a fully-qualified URL. In order to do that, a functioning and secured web server will have to be available to you (you could set this up on your local workstation).
Let's assume you place the file on a web-accessible server somewhere local to your corporate network and the web-accessible directory path you place the file in turns out to be something like Data1701/project/data_folder/. Then you can change the assignment statement in your Spark code to this:
df = spark.read.format('csv').load('https://web.dept.yourcompany.com/Data1701/project/data_folder/file.csv', header=True)
…and the rest of your code should work, unchanged.