Member since
01-07-2019
220
Posts
23
Kudos Received
30
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5044 | 08-19-2021 05:45 AM |
 | 1811 | 08-04-2021 05:59 AM |
 | 879 | 07-22-2021 08:09 AM |
 | 3691 | 07-22-2021 08:01 AM |
 | 3429 | 07-22-2021 07:32 AM |
06-04-2019
01:29 AM
I found this resolution elsewhere: The problem was resolved after I copied the .class file from /tmp/sqoop-hduser/compile/ to HDFS /home/hduser/ and also to the current working directory from which I am running Sqoop. In case that does not work, this should help you get moving: use --bindir to specify where the compiled code and .jar file should be located. Without this argument, Sqoop places the generated Java source file in your current working directory and the compiled .class and .jar files in /tmp/sqoop-<username>/compile. For example: sqoop import --bindir ./ --connect jdbc:mysql://localhost/hadoopguide --table widgets
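If you launch the import from a script, the same --bindir invocation can be assembled programmatically. This is only a minimal sketch: the helper name build_sqoop_import is my own (hypothetical, not part of Sqoop), and the connection URL and table are simply the ones from the example command above.

```python
import shlex


def build_sqoop_import(connect_url, table, bindir="./"):
    """Return the argument list for a Sqoop import.

    Hypothetical helper for illustration; it only builds the command and
    does not run it. --bindir tells Sqoop where to put the compiled
    .class and .jar files.
    """
    return [
        "sqoop", "import",
        "--bindir", bindir,
        "--connect", connect_url,
        "--table", table,
    ]


cmd = build_sqoop_import("jdbc:mysql://localhost/hadoopguide", "widgets")

# Show the command as it would be typed in a shell.
print(shlex.join(cmd))

# To actually run it on a node with Sqoop installed, something like
# subprocess.run(cmd, check=True) should do.
```

Passing the arguments as a list (rather than one long string) avoids quoting problems if the JDBC URL or the directory ever contains special characters.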
05-20-2019
07:24 AM
Hi Dennis, As mentioned in the (edited) post, the solution suggested above finally worked for me. Thanks again for the help! Regards, Michal
05-06-2019
07:33 AM
Hello guys, Yeah, that was a long time ago. I managed to get the job done by using the following pipeline: Logstash -> Kafka -> Spark
05-06-2019
05:34 AM
Since the question was asked, the situation has changed. As soon as Hortonworks and Cloudera merged, NiFi became supported by Cloudera. Shortly afterwards, the integration with CDH was also completed, so NiFi is now a fully supported and integrated component. Please check the documentation for the latest information, but in general Cloudera Manager is now able to install NiFi.
05-06-2019
04:53 AM
1 Kudo
Since the question was asked, the situation has changed. As soon as Hortonworks and Cloudera merged, NiFi became supported by Cloudera. Shortly afterwards, the integration with CDH was also completed, so NiFi is now a fully supported and integrated component. Hence the question already contains the answer: NiFi is Cloudera's answer for these use cases.
04-10-2019
04:15 AM
1 Kudo
Assuming you want to access the data via Spark, the main question is how it should be stored. Drill is not supported for this, but Hive tables and Kudu are supported by Cloudera. It then boils down to whether you want to store the data in Hive or in Kudu, as Spark can work with both. If you want to insert your data record by record, or want to run interactive queries in Impala, then Kudu is likely the best choice. If you want to insert and process your data in bulk, then Hive tables are usually a good fit.
04-10-2019
02:34 AM
Though I don't know how it works exactly under the hood, I can confirm that it will work on the source DB side. (As it will definitely NOT simply pull everything from the DB, and then chop it up before writing to Hadoop.) If you are looking for the optimum, you are likely going to need some trial and error. However, as a starting point I understand that the default value is 1000, and that you may want to try 10000 as a first step towards better performance.
04-09-2019
07:41 AM
1 Kudo
There is something very unusual happening here. Based on your outputs, values are not only ending up in the wrong columns, but you are even getting different values! In the 'correct' record you have 5686.76, and in the 'wrong' record you have -5686.76. My first guess was that there is a mistake in how you send data to the appropriate columns, but I don't see how that can explain a minus sign appearing.

To troubleshoot something like this, it is really important to dig into the details. I would therefore recommend that you bring your question down to a 'minimal reproducible example', eliminating any complexity that is not causing the unexpected results. For example:

- You show a load command to get data into Spark. Consider replacing it with an actual string (and make sure to check whether the string allows you to reproduce the problem).
- You also show 2 writes, but if we have the exact input and code to reproduce the problem, the write that gives the correct result is probably not relevant.
- You use some code to list columns. Consider hardcoding the columns first.

As mentioned, really try to take out all complexity until we land on a minimal amount that still reproduces the problem. Hopefully you will already see the answer once you have eliminated all the distractions, and if not, you will have a fully trimmed-down version that you can use to update your question here!
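To illustrate what such a trimmed-down version could look like, here is a sketch in plain Python rather than Spark, so it runs anywhere. Everything in it is made up for illustration: the literal record, the column names and the parse helper are assumptions, not taken from your code. In Spark you would build the DataFrame from the same kind of literal instead of the load command.

```python
import csv
import io

# Hypothetical stand-in for the real input: one literal record
# instead of a load command.
RAW = "id,amount,description\n1,5686.76,refund\n"

# Hard-coded column list instead of code that discovers the columns.
COLUMNS = ["id", "amount", "description"]


def parse(raw):
    """Parse the literal input and keep only the hard-coded columns."""
    reader = csv.DictReader(io.StringIO(raw))
    return [{col: row[col] for col in COLUMNS} for row in reader]


rows = parse(RAW)
print(rows)

# The check that matters for this problem: the value must come through
# in the right column and with its sign unchanged.
assert rows[0]["amount"] == "5686.76"
```

If the check passes with the literal input, the problem is most likely in one of the parts you removed, and you can add them back one at a time until it fails.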
08-08-2018
01:33 PM
1 Kudo
As indicated in an existing answer by @hduraiswamy, there are some things you can do:

1. Issue multiple insert commands in parallel; they will automatically be executed sequentially.
2. Write multiple files to a directory and then create a Hive table on top of the folder (see the aforementioned answer).

If this does not work for you, you can of course also work with a non-external Hive table.
01-15-2019
11:24 AM
First of all, double-check all configurations (incl. the password), just to avoid moving in the wrong direction. Secondly, confirm that you do not need TLS enabled. If these don't help, the following might help with troubleshooting:

1. Become the nifi user on the node where NiFi is running
2. Send the message via Python
3. Share the Python command here

Note: Please explicitly specify everything that you configure in NiFi when executing the Python command (even if some settings are not needed, for instance because of good defaults).