Hadoop MapReduce - SQL query execution

Contributor

I am trying to create a Hadoop MapReduce job in which the map function creates key-value pairs of the matching files and the query to be executed, and the reducer function applies the SQL query to those input text files.

The Map() function should search for files that match a given keyword and send those files to the reducer as the key.

The Reducer() function should execute the query against the files (key: files, value: query).

Map() - Input Key-Value: Keyword-Query

-- How do I search for files in a specific directory?

Reducer() - Input Key-Value: Files-Query

-- How do I execute or apply the SQL query to the files?

Thanks

Sridhar

2 REPLIES

Re: Hadoop MapReduce - SQL query execution

Mentor

Please provide a sample dataset

Re: Hadoop MapReduce - SQL query execution

Contributor

Hi Sridhar,

As @Emil mentioned, depending on your data you could also create an external Hive table. For example:

CREATE EXTERNAL TABLE IF NOT EXISTS <table_name> (
    <field_1>    STRING,
    <field_2>    STRING,
    <field_3>    INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '<hdfs_path_to_folder>';

Example query:

SELECT * FROM <table_name> WHERE <field_1> = '<term_1>' AND <field_2> = '<term_2>';
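
If you would still rather do the filtering inside a MapReduce job than in Hive, the per-line check a mapper would perform can be sketched in plain Java. This is only a rough sketch under assumptions: the class name RowFilter and its methods are hypothetical (they are not part of Hadoop or Hive), it assumes comma-delimited lines with no quoted commas, and it works on in-memory strings rather than on files in HDFS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not a Hadoop or Hive class.
final class RowFilter {

    // True when the first two comma-separated fields equal the given terms,
    // i.e. the same condition as the WHERE clause in the example query.
    static boolean matches(String line, String term1, String term2) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split(",", -1);
        return fields.length >= 2
                && fields[0].equals(term1)
                && fields[1].equals(term2);
    }

    // Returns only the lines that satisfy the condition, in input order.
    static List<String> filter(List<String> lines, String term1, String term2) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (matches(line, term1, term2)) {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample rows in the <field_1>,<field_2>,<field_3> layout.
        List<String> rows = Arrays.asList(
                "alice,london,30",
                "bob,paris,25",
                "alice,paris,41");
        for (String row : filter(rows, "alice", "paris")) {
            System.out.println(row);
        }
    }
}
```

In an actual job, the matches() check would sit inside the map() method of your Mapper, which would write out only the lines that pass; no reducer is needed for a plain filter like this.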

Hope this helps, Chris