
Hive query fails - Requested replication 3 exceeds maximum 1



I'm setting up a tiny pseudo-distributed cluster for testing. I have only one datanode. It works and I can load and query data in Hive. Great. Then I changed the HDFS replication factor from 3 to 1, and also changed the max replication factor to 1. I restarted all HDFS, YARN, and MR processes (that's all that Ambari indicated needed restarting). Now, when I run a Hive query, I see this in the hadoop-hdfs-namenode-log:

2016-05-19 15:26:03,322 INFO  ipc.Server ( - IPC Server handler 103 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.create from Call#27745 file /tmp/hive/mpetronic/_tez_session_dir/5e16aba9-d9d3-4138-afda-58c1d0027e4d/hive-hcatalog-core.jar on client replication 3 exceeds maximum 1
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.verifyReplication(
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$
    at org.apache.hadoop.ipc.RPC$
    at org.apache.hadoop.ipc.Server$Handler$
    at org.apache.hadoop.ipc.Server$Handler$
    at Method)
    at
    at
    at org.apache.hadoop.ipc.Server$

And my query fails with this...

0: jdbc:hive2://mpws:10000/default> select count(device_id) from vsat_modc_batch where device_id like 'DSS%' limit 10;
INFO  : Tez session hasn't been created yet. Opening session
ERROR : Failed to execute tez graph. Previous writer likely failed to write hdfs://mpws:8020/tmp/hive/mpetronic/_tez_session_dir/9216a3de-e9fc-4940-a4b3-21e49bcca4b5/hive-hcatalog-core.jar. Failing because I am unlikely to write too.
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.refreshLocalResourcesFromConf(
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(
    at org.apache.hadoop.hive.ql.Driver.launchTask(
    at org.apache.hadoop.hive.ql.Driver.execute(
    at org.apache.hadoop.hive.ql.Driver.runInternal(
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(
    at org.apache.hive.service.cli.operation.SQLOperation.access$100(
    at org.apache.hive.service.cli.operation.SQLOperation$1$
    at Method)
    at org.apache.hive.service.cli.operation.SQLOperation$
    at java.util.concurrent.Executors$
    at java.util.concurrent.ThreadPoolExecutor.runWorker(
    at java.util.concurrent.ThreadPoolExecutor$
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask (state=08S01,code=1)

So, it seems that something did not get the memo about the reduced replication factor. Did I miss some additional configuration needed to run at replication factor = 1? I reconfigured back to 3 and everything works again. I could not find anything indicating that I need to configure something replication-related in the Hive or Tez clients.
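
For anyone trying to reason about this error, here is a minimal Python sketch of the comparison the NameNode appears to make, judging by the `verifyReplication` frame in the log: the replication requested by the client is checked against the NameNode's maximum. The property names `dfs.replication` and `dfs.replication.max` are the standard HDFS ones (defaults 3 and 512). Everything else is an assumption for illustration: the helper names `read_props` and `check_replication` are hypothetical, the XML snippets are made up, and the idea that some client is still running with an older `dfs.replication` value is only one possible explanation, not a confirmed diagnosis.

```python
import xml.etree.ElementTree as ET

# Illustrative configs only (assumed, not taken from the real cluster).
# The client side still carries the old replication factor of 3.
STALE_CLIENT_XML = """
<configuration>
  <property><name>dfs.replication</name><value>3</value></property>
</configuration>
"""

# The NameNode side has the new maximum of 1.
NAMENODE_XML = """
<configuration>
  <property><name>dfs.replication</name><value>1</value></property>
  <property><name>dfs.replication.max</name><value>1</value></property>
</configuration>
"""


def read_props(xml_text):
    """Return {name: value} from a Hadoop-style configuration XML string."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def check_replication(client_props, namenode_props):
    """Return an error message if the client's requested replication
    exceeds the NameNode's maximum, otherwise None.

    This only mimics the comparison; it is not the HDFS implementation.
    """
    requested = int(client_props.get("dfs.replication", 3))        # HDFS default: 3
    maximum = int(namenode_props.get("dfs.replication.max", 512))  # HDFS default: 512
    if requested > maximum:
        return f"Requested replication {requested} exceeds maximum {maximum}"
    return None


if __name__ == "__main__":
    print(check_replication(read_props(STALE_CLIENT_XML), read_props(NAMENODE_XML)))
```

With the sample values above, the sketch reports the same message as the thread title. Running the same check against the configuration files actually used by each client process would show which one, if any, still requests a replication of 3.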


Re: Hive query fails - Requested replication 3 exceeds maximum 1

The issue might be related to missing blocks. Check the block report, and for any files with missing blocks, either delete them or upload them again.