Support Questions

coatespt · ‎09-23-2016

My existing EBS volumes are transparently encrypted. I added an extra volume that is not encrypted. Now I want to be able to control where HDFS writes a file. I think it must be possible because heterogeneous storage policies tell HDFS where to write. How can I do this?

VR46 · ‎09-25-2016

Hi @Peter Coates

HDFS does support heterogeneous storage, but you cannot define your own storage type. You have to use one of the predefined types (ARCHIVE, DISK, SSD and RAM_DISK). Storage policies are defined in terms of these types, and the policy set on a file or directory determines where its blocks and their replicas are placed when they are created.

So you can control where HDFS writes a file only if you can tell your encrypted and unencrypted volumes apart by giving them different storage types.
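As a rough sketch of what that could look like, the Python below builds the two pieces involved: a `dfs.datanode.data.dir` value in which each volume carries a storage-type tag, and the `hdfs storagepolicies` command that pins a directory to a policy. The mount points, the HDFS path `/apps/unencrypted`, and the choice of ARCHIVE/COLD for the unencrypted volume are all assumptions for illustration, not something from your cluster. The script only prints the values and does not touch HDFS.

```python
import shlex

# Storage types HDFS predefines (per the Archival Storage page referenced below).
STORAGE_TYPES = {"ARCHIVE", "DISK", "SSD", "RAM_DISK"}


def tagged_data_dirs(volumes):
    """Build a dfs.datanode.data.dir value from (storage_type, local_path) pairs."""
    parts = []
    for storage_type, path in volumes:
        if storage_type not in STORAGE_TYPES:
            raise ValueError(f"not a predefined storage type: {storage_type}")
        # An absolute path after "file://" yields the usual file:///... form.
        parts.append(f"[{storage_type}]file://{path}")
    return ",".join(parts)


def set_policy_command(hdfs_path, policy):
    """Return the argv that assigns a storage policy to an HDFS path."""
    return ["hdfs", "storagepolicies", "-setStoragePolicy",
            "-path", hdfs_path, "-policy", policy]


# Hypothetical layout: two encrypted EBS mounts plus one unencrypted mount.
volumes = [
    ("DISK", "/data/encrypted0"),
    ("DISK", "/data/encrypted1"),
    ("ARCHIVE", "/data/plain0"),
]

data_dir_value = tagged_data_dirs(volumes)
command = set_policy_command("/apps/unencrypted", "COLD")

# Print the hdfs-site.xml value and the command to run, without executing it.
print("dfs.datanode.data.dir =", data_dir_value)
print(" ".join(shlex.quote(arg) for arg in command))
```

With this layout, files under the hypothetical `/apps/unencrypted` directory would be placed on the ARCHIVE-tagged volume, while everything else keeps the default policy and goes to the DISK volumes. Keep in mind that some policies have fallback storage types, so check the policy table on the referenced page before relying on this as a hard guarantee. Blocks written before the policy was set are not moved automatically; the same page describes the `hdfs mover` tool for that.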

Hope this helps.

Reference: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html

coatespt · ‎09-25-2016

I feared as much. Thank you for your suggestion. I think it will work for us, as this is a cloud cluster and we can archive to S3, which removes the need to use heterogeneous storage for its intended purpose. However, I would like to suggest a Jira ticket to add a storage class for this purpose. There are significant use cases where it would be useful to know that a subset of your data is confined to specific drives (a) without the restrictions of the existing policies and (b) without abusing an existing storage class to get there.
