HDF on Azure - guidance
- Labels: Cloudera DataFlow (CDF)
Created 09-25-2017 08:40 PM
I am looking for documentation on installing HDF on Azure. I see that there is no marketplace template, so it will be a pure IaaS setup. This is for a PoC. The plan is to set up a 3-node NiFi-only cluster (no Kafka, Storm, etc.), with one management node for security and operations, using Ambari to install NiFi.
I am looking for guidance specifically on these areas:
- OS image to use on Azure
- Any OS level tuning/configuration that needs to be done
- Anything networking related besides Azure vnet
- Recommended foundational software with version – e.g. Java version and anything else
- Minimum config for the operations and security node: VM SKU, disk SKU, and disk partitioning
- Minimum config for the NiFi nodes: VM SKU, disk SKU, and disk partitioning
- Any best practices
- Detailed documentation
Thanks in advance.
Created 09-28-2017 02:30 PM
There is no preferred OS for HDF; use the one you know best. That said, Linux is the most tested OS.
There is documentation that covers OS-specific tuning and best practices. The only required software is Java 8.
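As a quick sanity check for the Java requirement on each VM, something like the following could be run before starting the install. This is only a sketch: the function names are made up for illustration, the sample version strings in the comments are examples rather than output from a real HDF node, and it assumes Python 3.7+ is available on the box.

```python
import re
import subprocess


def java_major_version(version_output):
    """Return the major Java version parsed from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first, second = match.group(1), match.group(2)
    # Java 8 and earlier report themselves as 1.x, e.g. "1.8.0_144".
    if first == "1" and second is not None:
        return int(second)
    # Java 9 and later report the major version first, e.g. "11.0.2".
    return int(first)


def installed_java_major_version():
    """Run `java -version` and parse it; returns None if java is not on the PATH."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return None
    # `java -version` writes its banner to stderr, so check both streams.
    return java_major_version(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major_version()
    if major == 8:
        print("Java 8 found")
    else:
        print("Java 8 not found (detected major version: {})".format(major))
```

Running it on every node (management and NiFi) before kicking off the Ambari install would catch a missing or mismatched JDK early.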
The minimum system resources will be driven by the volume of data, the size of the files, and how much processing is done on the data.
This documentation provides a good starting point and hardware sizing recommendations: Planning your deployment
