Member since
10-25-2018
9
Posts
0
Kudos Received
0
Solutions
12-29-2018
09:22 AM
The number of mappers depends on the configuration of the slave, i.e. the number of cores and the amount of RAM available on it. The right number of maps per node is usually between 10 and 100. Typically, 1 to 1.5 cores should be allotted to each mapper, so a 15-core processor can run about 10 mappers.
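The rule of thumb above can be sketched as a small calculation (the method name and the 1.5 cores-per-mapper figure are taken from the post, not from any Hadoop API):

```java
public class MapperSlots {
    // Rule of thumb from the post: allot 1 to 1.5 cores per mapper.
    static int maxMappers(int cores, double coresPerMapper) {
        return (int) Math.floor(cores / coresPerMapper);
    }

    public static void main(String[] args) {
        // A 15-core slave with 1.5 cores per mapper -> 10 mappers.
        System.out.println(maxMappers(15, 1.5));
    }
}
```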
12-19-2018
09:34 AM
How can I create a single mapper for small files?
Labels:
- Apache Hadoop
- Apache Hive
12-12-2018
02:09 PM
The client can interact with Hive in the following three ways:
- Hive Thrift Client: The Hive server is exposed as a Thrift service, so it is possible to interact with Hive from any programming language that supports Thrift.
- JDBC Driver: Hive provides a pure Type 4 JDBC driver, defined in the org.apache.hadoop.hive.jdbc.HiveDriver class. Pure Java applications can use this driver to connect to the Hive server over a given host and port. The Beeline CLI uses the JDBC driver to connect to the Hive server.
- ODBC Driver: An ODBC driver allows any application that supports ODBC to connect to the Hive server. Apache does not ship an ODBC driver by default, but one is freely available from many vendors.
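As an illustration of the JDBC route, here is a minimal sketch. The host, port, and database are hypothetical (HiveServer2 listens on port 10000 by default); an actual connection additionally requires a running HiveServer2 and the hive-jdbc driver jar on the classpath, so that part is shown commented out:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcExample {
    // Build a HiveServer2 JDBC URL from hypothetical connection details.
    static String hiveUrl(String host, int port, String db) {
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) throws Exception {
        String url = hiveUrl("localhost", 10000, "default");
        System.out.println(url);
        // With a live HiveServer2 and the hive-jdbc jar available:
        // try (Connection conn = DriverManager.getConnection(url, "user", "");
        //      Statement stmt = conn.createStatement();
        //      ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
        //     while (rs.next()) System.out.println(rs.getString(1));
        // }
    }
}
```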
12-05-2018
11:10 AM
Setting the parameter mapred.job.reuse.jvm.num.tasks=-1 causes the TaskTracker to reuse JVMs for tasks of the same job an unlimited number of times. For long-running jobs whose tasks are not reasonably small, however, it is usually better to start a new JVM for each task.
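For reference, this pre-YARN property is set in mapred-site.xml; a value of 1 (the default) starts a fresh JVM per task:

```xml
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <!-- -1 = reuse a JVM for an unlimited number of tasks of the same job;
       1 (default) = one JVM per task; N > 1 = reuse for up to N tasks -->
  <value>-1</value>
</property>
```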
11-28-2018
03:07 PM
Input Split: It is the logical division of records, which means it does not contain any data itself but only a logical reference to the data. It is used only during data processing by MapReduce. The user can control the size of an InputSplit, and each InputSplit is assigned to an individual mapper for processing. It is defined by the InputFormat class.
HDFS Block: It is the physical representation of data and the minimum amount of data that can be read or written. The default size of an HDFS block is 128 MB, which can be configured according to requirements. All the blocks of a file are the same size except the last one, which may be the same size or smaller. Files are divided into 128 MB blocks and then stored in the file system.
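The block arithmetic above can be sketched as follows (a simple ceiling division, assuming the default 128 MB block size; by default MapReduce also creates roughly one InputSplit per block unless the split size is tuned):

```java
public class SplitMath {
    static final long BLOCK_SIZE = 128L * 1024 * 1024; // default 128 MB block

    // Number of blocks needed to store a file of the given size;
    // the last block may be smaller than 128 MB.
    static long blockCount(long fileSizeBytes) {
        return (fileSizeBytes + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    public static void main(String[] args) {
        // A 300 MB file occupies 3 blocks: 128 MB + 128 MB + 44 MB.
        System.out.println(blockCount(300L * 1024 * 1024));
    }
}
```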
11-24-2018
10:07 AM
What is a Backup Node in Apache Hadoop, how does it work, and what are its roles and responsibilities?
Labels:
- Apache Hadoop
- Apache Hive
11-01-2018
11:40 AM
What does the hadoop-metrics.properties file do?
Labels:
- Apache Hadoop
- Apache Hive
10-25-2018
12:28 PM
Does HDFS ensure the data integrity of the blocks it stores? If so, how?
Labels: