For ORC File what determines the number of mappers?

New Contributor

For an ORC file, how does YARN determine the number of mappers? Is it based on the files in HDFS?

1 ACCEPTED SOLUTION

Re: For ORC File what determines the number of mappers?

Rising Star

@Aron,

Initially, the getSplits method split the data based on HDFS blocks. It was later changed so that splitting is based on the stripes of the ORC file.

https://issues.apache.org/jira/browse/HIVE-5102

The link above provides the details and the source code for OrcInputFormat and its getSplits method.
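
To illustrate the difference between block-based and stripe-based splitting, here is a rough Python sketch. It is only a simplified model and not the actual OrcInputFormat code: the names `Stripe`, `block_based_splits`, `stripe_based_splits` and the `max_split_size` parameter are made up for this example, and the real getSplits logic takes more factors into account. In plain MapReduce, each split returned by getSplits is handled by one mapper, so the length of each list is the mapper count in this model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stripe:
    """Hypothetical stand-in for an ORC stripe: where it starts and how long it is."""
    offset: int  # byte offset of the stripe within the file
    length: int  # stripe length in bytes


def block_based_splits(file_length: int, block_size: int) -> List[Tuple[int, int]]:
    """Old behaviour (simplified): one (offset, length) split per HDFS block,
    regardless of where the stripes begin and end."""
    return [
        (start, min(block_size, file_length - start))
        for start in range(0, file_length, block_size)
    ]


def stripe_based_splits(stripes: List[Stripe], max_split_size: int) -> List[Tuple[int, int]]:
    """Newer behaviour (simplified): group consecutive stripes into splits
    without ever cutting a stripe in the middle. max_split_size is an
    assumed knob for this sketch, not a real Hive setting name."""
    splits: List[Tuple[int, int]] = []
    start = None
    length = 0
    for stripe in stripes:
        # Close the current split if adding this stripe would exceed the limit.
        if start is not None and length + stripe.length > max_split_size:
            splits.append((start, length))
            start, length = None, 0
        if start is None:
            start = stripe.offset
        length += stripe.length
    if start is not None:
        splits.append((start, length))
    return splits


# Toy numbers: a 300-byte file made of three 100-byte stripes.
stripes = [Stripe(0, 100), Stripe(100, 100), Stripe(200, 100)]
by_block = block_based_splits(file_length=300, block_size=128)
by_stripe = stripe_based_splits(stripes, max_split_size=250)

print(len(by_block), by_block)
print(len(by_stripe), by_stripe)
```

With the block-based version, the second and third splits start in the middle of a stripe, while the stripe-based version always starts and ends a split on a stripe boundary.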

3 REPLIES

Re: For ORC File what determines the number of mappers?

Mentor

OrcInputFormat is an implementation of the InputFormat interface. Its getSplits method returns the input splits, and the number of splits determines the number of mappers: https://hive.apache.org/javadocs/r0.13.1/api/ql/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.html

Re: For ORC File what determines the number of mappers?

New Contributor