Member since: 01-07-2019
Posts: 220
Kudos Received: 23
Solutions: 30
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4913 | 08-19-2021 05:45 AM |
| | 1788 | 08-04-2021 05:59 AM |
| | 865 | 07-22-2021 08:09 AM |
| | 3621 | 07-22-2021 08:01 AM |
| | 3322 | 07-22-2021 07:32 AM |
08-29-2019
01:02 AM
If you are putting the data in HDFS first, I assume the Python script that follows is batch rather than streaming. In that case, consider running it via a scheduler such as Oozie. Also, if you run into scalability issues with the script, consider using something like PySpark instead.
08-28-2019
11:15 AM
As explained elsewhere by Andy: You can accomplish this with a ConvertRecord processor. Register an Avro schema describing the expected format in a Schema Registry (controller service), and create a CSVReader implementation to convert this incoming data to the generic Apache NiFi internal record format. Similarly, use a CSVRecordSetWriter with your output schema to write the data back to CSV in whatever columnar order you like. For more information on the record processing philosophy and some examples, see Record-oriented data with NiFi and Apache NiFi Records and Schema Registries.
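To make the idea concrete, here is a minimal plain-Python sketch of what the reader/writer pair does conceptually. It is not NiFi code: it reads CSV rows by header name and writes them back in a chosen column order. The function name `reorder_csv` and the sample columns are assumptions made for illustration only.

```python
import csv
import io
from typing import Sequence


def reorder_csv(text: str, output_columns: Sequence[str]) -> str:
    """Rewrite CSV text so its columns appear in output_columns order."""
    # Read each row as a dict keyed by the header names (the "reader" side).
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    # Write the same rows using the desired column order (the "writer" side).
    # extrasaction="ignore" drops input columns that are not in the output list.
    writer = csv.DictWriter(
        out,
        fieldnames=list(output_columns),
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data, only for demonstration.
sample = "id,name,city\n1,Ann,Oslo\n2,Bob,Rome\n"
print(reorder_csv(sample, ["city", "id", "name"]))
```

In NiFi itself the same effect comes from the schemas: the reader schema describes the incoming columns and the writer schema lists them in the desired output order.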
08-28-2019
10:37 AM
I just checked with @MattWho and it seems this explanation is still relevant. Of course it does assume you are using Ambari: https://community.cloudera.com/t5/Community-Articles/HDF-2-x-Adding-a-new-NiFi-Node-to-an-existing-secured-NiFi/ta-p/249284
08-28-2019
10:09 AM
The normal way to process Excel files on HDFS is with NiFi processors alone, without Python: ListHDFS > FetchHDFS > ConvertExcelToCSV > PutHDFS. I recommend trying this first. The documentation does not explicitly say whether it works with XLSB, so you may still need the Python script for the conversion. In that case the ExecuteStreamCommand processor would indeed be a logical choice.

Regarding the output of the first processor: in development, I find the most convenient way to see the output is to stop the downstream processor, then right-click the queue to list it and inspect the messages. If stopping the processor is not possible, you can also investigate via the provenance view.
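For the ExecuteStreamCommand route, the script typically receives the FlowFile content on stdin and returns the converted content on stdout. The sketch below shows only that shape and makes no claim about the actual conversion: `parse_tab_separated` is a hypothetical stand-in that treats the input as tab-separated text, and a real XLSB parser would have to replace it. All names here are assumptions.

```python
import csv
import io
from typing import BinaryIO, Callable, Iterable, List

Rows = Iterable[List[str]]


def parse_tab_separated(payload: bytes) -> Rows:
    """Stand-in parser (assumption): treats the input as tab-separated text.

    A real XLSB parser would replace this function.
    """
    for line in payload.decode("utf-8").splitlines():
        yield line.split("\t")


def main(
    stdin: BinaryIO,
    stdout: BinaryIO,
    parse: Callable[[bytes], Rows] = parse_tab_separated,
) -> None:
    """Read the whole input from stdin and write it to stdout as CSV."""
    payload = stdin.read()
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parse(payload))
    stdout.write(buffer.getvalue().encode("utf-8"))


# Demo with in-memory streams instead of the real stdin/stdout.
demo_out = io.BytesIO()
main(io.BytesIO(b"a\tb\n1\t2, two\n"), demo_out)
print(demo_out.getvalue().decode("utf-8"))
```

When run by the processor, the script would instead import `sys` and end with `main(sys.stdin.buffer, sys.stdout.buffer)`.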
06-04-2019
01:29 AM
I found this resolution elsewhere: "The problem is resolved after I copied the .class file from /tmp/sqoop-hduser/compile/ to HDFS /home/hduser/ and also to the current working directory from which I am running Sqoop." In case that does not work, this should help you get moving: specify --bindir to set where the compiled code and .jar file should be located. By default, Sqoop places the generated Java source file in your current working directory and the compiled .class and .jar files in /tmp/sqoop-<username>/compile. For example: sqoop import --bindir ./ --connect jdbc:mysql://localhost/hadoopguide --table widgets
05-20-2019
05:45 AM
Hello, I also replied in the thread, but 5.14 is newer than 5.6 so that should be fine! Kind regards, Dennis
05-20-2019
05:43 AM
@michalr I believe you tried to reach out to me during my vacation with a question about versions. It seems you already found the solution, but for clarity: the current version of the docs, https://docs.hortonworks.com/HDPDocuments/CFM/CFM-1.0.0/installation/content/cfm-system-requirements.html, refers to 5.5.6 and above. If you currently have 5.14.x, that should be fine! If in doubt, it is best to ask questions publicly, as I may not see private messages within a reasonable amount of time.
05-06-2019
05:34 AM
Since the question was asked, the situation has changed. As soon as Hortonworks and Cloudera merged, NiFi became supported by Cloudera. Shortly afterwards the integration with CDH was also completed, so NiFi is now a fully supported and integrated component. Please check the documentation for the latest information, but in general Cloudera Manager is now able to install NiFi.
05-06-2019
04:53 AM
1 Kudo
Since the question was asked, the situation has changed. As soon as Hortonworks and Cloudera merged, NiFi became supported by Cloudera. Shortly afterwards the integration with CDH was also completed, so NiFi is now a fully supported and integrated component. Hence the question already contains the answer: NiFi is Cloudera's answer for these use cases.