
How to Load Data and Run Applications in HDFS?



Thanks in advance for your help.


Re: How to Load Data and Run Applications in HDFS?


@Sreekanth Naidu

There are many ways to load data into HDFS and many ways to run applications in HDFS. The broad diversity of tools and the variety of apps and solutions give Hadoop great power and flexibility to store, process, and analyze:

  • large volumes of data
  • a wide variety of data (structured, semistructured, unstructured)
  • a high velocity of data (streaming)

all in one centralized data store (HDFS). Doing so allows the business to better execute what it is already doing (e.g. lower-cost data warehousing) and to implement new capabilities that were not previously possible (e.g. applications based on predictive analytics or a 360-degree view of the customer).

As a high-level answer to your question, you can load data and run programs in HDFS from:

  • browser-based UI like Ambari views. This is the easiest way to get started. See tutorials.
  • browser-based UI like Zeppelin
  • command line from a Linux edge node (a Linux box with Hadoop libraries installed and networked to the cluster)
  • 3rd party tools (like AtScale or Syncsort DMX-h) installed on your desktop, an edge node, or dedicated server
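
To make the command-line option more concrete, here is a minimal sketch of loading one file from an edge node. It assumes the `hdfs` client is installed and on the PATH; the file name `sales.csv`, the target directory `/user/demo/sales`, and the helper names `build_load_commands` and `run_commands` are hypothetical and only for illustration. It defaults to a dry run that prints the commands without touching the cluster.

```python
import shlex
import subprocess
from typing import List


def build_load_commands(local_path: str, hdfs_dir: str) -> List[List[str]]:
    """Build the `hdfs dfs` commands that create a directory, upload a file, and list it."""
    return [
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],            # create target directory (and parents)
        ["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir],  # copy local file, overwriting if present
        ["hdfs", "dfs", "-ls", hdfs_dir],                     # list the directory to confirm the upload
    ]


def run_commands(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if the hdfs client reports a failure
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file and directory names; replace with your own.
    run_commands(build_load_commands("sales.csv", "/user/demo/sales"), dry_run=True)
```

Pass `dry_run=False` on an edge node to actually run the commands. Once the file is in HDFS, it can be queried or processed from any of the tools listed above.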

The best way to start understanding the components and capabilities of hadoop is

The best way to get started in becoming a Big Data expert is
