Member since 07-29-2019
640 Posts
113 Kudos Received
48 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 7536 | 12-01-2022 05:40 PM |
|  | 2031 | 11-24-2022 08:44 AM |
|  | 2943 | 11-12-2022 12:38 PM |
|  | 995 | 10-10-2022 06:58 AM |
|  | 1437 | 09-11-2022 05:43 PM |
05-08-2022
07:05 PM
Thanks @ask_bill_brooks. I much appreciate the reply. I did solve the problem using an uber/fat jar, but I was hoping for a more elegant solution. I looked into shading with Gradle, but it was too confusing and too difficult to explain and maintain. sbt didn't seem to have shading (or an easier approach to shading). At the moment, we are stuck with a fat jar. I hope there is an easier deployment method in the future via the dependencies.scala file.
05-06-2022
04:39 PM
Thanks for that! That's helpful.
05-05-2022
08:13 PM
Where can I get a copy? I am following this documentation: http://vmwareinsight.com/Articles/2020/6/5803025/How-to-install-Cloudera-On-VirtualBox-In-Windows
05-03-2022
09:27 AM
No, actually I didn't get what I needed. I just understood that I need to develop it myself.
05-03-2022
08:25 AM
Hi @Freschone
May I ask why you need to download Quickstart VM based on CDH 5.10? Is this a classroom assignment?
As a general matter, Cloudera is no longer updating or making the Cloudera Quickstart VM available for download (and hasn't since March of 2020) because it was outdated and obsolete as the last version was based on CDH 5.13, which went out of support in the Fall of 2020.
The credentials to access the private repository where Cloudera is now distributing previous versions of CDH are not generally the same ones used to access Cloudera's website or the Cloudera community. Employees of organizations with a valid Cloudera subscription can generate repository credentials from a CDH license key, and there is a full description of how to do this in the Cloudera Enterprise 6.x Release Notes here: Version, Packaging, and Download Information.
04-29-2022
02:22 AM
I am running the latest version of NiFi (1.16.0). When I say the connection works fine on localhost, I mean that my NiFi service is launched via docker-compose on my computer. When I access NiFi via https://localhost:8443/nifi/ and use a ListenFTP processor on port 2221, the connection via FileZilla works. I can transfer files and process them in NiFi.
[Images: localhost; FileZilla connection]
04-21-2022
09:53 AM
@san_re What documentation are you following for what you are attempting to do here? You're much better off following a specific set of instructions from the site where you are downloading MySQL and/or NiFi.
For NiFi, the canonical instructions can be found here:
NiFi System Administrator's Guide
04-20-2022
03:55 PM
Hi @Data1701
According to the API documentation, one can get a java.net.URISyntaxException when a passed string could not be parsed as a URI reference.
The file you are attempting to read in might very well be available on your local area network from a shared server drive, but it isn't available via a valid URI, or at the very least, the URI you are referencing in your Spark code isn't a valid and accessible URI.
What your problem boils down to is that the file isn't available via a web server, and the server that is running your Spark code can't retrieve it at the time your code executes. And that should shed light on why you had to previously upload your .csv files into CDSW, because that was the way to ensure that they could be found at runtime, since they were in a well-known/accessible location.
There are several valid approaches to addressing this, but the easiest solution, if you want to continue to use the code snippet you've written and shared here, is to place the file on some server that is accessible over the web (preferably via HTTPS) and refer to it using a fully-qualified URL. In order to do that, a functioning and secured web server will have to be available to you (you could set this up on your local workstation).
Let's assume you place the file on a web-accessible server somewhere local to your corporate network and the web-accessible directory path you place the file in turns out to be something like Data1701/project/data_folder/. Then you can change the assignment statement in your Spark code to this:
df = spark.read.format('csv').load('https://web.dept.yourcompany.com/Data1701/project/data_folder/file.csv', header=True)
…and the rest of your code should work, unchanged.
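Depending on the Spark and Hadoop versions in use, the CSV reader may not accept an https:// path directly. In that case, one possible workaround is to download the file first and point Spark at the local copy. The following is a minimal sketch under stated assumptions, not a definitive implementation: `download_csv` and `load_csv` are hypothetical helper names, and it assumes Spark runs in local mode or that the download location is visible to every executor.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_csv(url, dest_dir=None):
    """Copy the file at `url` into a local directory and return its path."""
    # Hypothetical helper; defaults to a fresh temporary directory.
    dest_dir = Path(dest_dir) if dest_dir else Path(tempfile.mkdtemp())
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Reuse the last path segment of the URL as the local file name.
    file_name = Path(urlparse(url).path).name or "download.csv"
    local_path = dest_dir / file_name
    with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return local_path


def load_csv(spark, url):
    """Download the CSV, then read it with an existing SparkSession."""
    local_path = download_csv(url).resolve()
    # Assumption: this local path is readable by the Spark executors
    # (true in local mode; on a cluster, use shared storage instead).
    return spark.read.csv(local_path.as_uri(), header=True)
```

With an existing `spark` session, the assignment would become `df = load_csv(spark, 'https://web.dept.yourcompany.com/Data1701/project/data_folder/file.csv')`.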
04-13-2022
12:13 PM
@ask_bill_brooks thanks for the information. Everything you said made sense. I will wait and see if anybody else has had better luck than I, but I think you are correct.