
Pig/MapReduce not connected while writing into Hive

Cloudera Employee

pig -useHCatalog

grunt> A = load '/user/horton/text.txt' using PigStorage(',');

grunt> store A into 'default.exchangedata' using org.apache.hive.hcatalog.pig.HCatStorer();

2018-07-19 05:16:04,535 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.fetch.partition.stats does not exist

2018-07-19 05:16:04,536 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.heapsize does not exist

2018-07-19 05:16:04,536 [main] INFO org.apache.hive.hcatalog.common.HCatUtil - mapreduce.lib.hcatoutput.hive.conf not set. Generating configuration differences.

2018-07-19 05:16:04,537 [main] INFO org.apache.hive.hcatalog.common.HCatUtil - Configuration differences={hive.txn.max.open.batch=1000, credentialStoreClassPath=/var/lib/ambari-agent/cred/lib/*, hive.auto.convert.sortmerge.join.to.mapjoin=true, hive.server2.thrift.sasl.qop=auth, hive.auto.convert.join.noconditionaltask.size=2004318071, hive.cluster.delegation.token.store.zookeeper.connectString=hiost:2181, hive.compactor.worker.threads=2, hive.cluster.delegation.token.store.zookeeper.znode=/hive/cluster/delegation, hive.tez.dynamic.partition.pruning=true, hive.hook.proto.base-directory=/warehouse/tablespace/external/hive/sys.db/query_data/, hive.stats.dbclass=fs, hive.server2.logging.operation.enabled=true, hive.prewarm.enabled=false, hive.vectorized.groupby.maxentries=100000, hive.compactor.abortedtxn.threshold=1000, hive.merge.smallfiles.avgsize=16000000, hive.metastore.transactional.event.listeners=org.apache.hive.hcatalog.listener.DbNotificationListener, hive.server2.logging.operation.log.location=/tmp/hive/operation_logs, hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory, hive.security.metastore.authorization.auth.reads=true, hive.server2.tez.initialize.default.sessions=false, hive.server2.webui.cors.allowed.headers=X-Requested-With,Content-Type,Accept,Origin,X-Requested-By,x-requested-by, hive.optimize.sort.dynamic.partition=false, hive.optimize.bucketmapjoin=true, ambari.hive.db.schema.name=hive, hive.tez.min.partition.factor=0.25, hive.tez.input.generate.consistent.splits=true, hive.metastore.authorization.storage.checks=false, hive.zookeeper.namespace=hive_zookeeper_namespace, hive.merge.rcfile.block.level=true, hive.metastore.uris=thrift://host:9083, hive.compactor.worker.timeout=86400, hive.exec.compress.intermediate=false, datanucleus.cache.level2.type=none, hive.server2.thrift.http.path=cliservice, hive.limit.optimize.enable=true, hive.server2.thrift.http.port=10001, hive.zookeeper.client.port=2181, javax.jdo.option.ConnectionUserName=hive, hive.tez.exec.print.summary=true, hive.server2.thrift.port=10000, hive.merge.mapfiles=true, hive.heapsize=1024, hive.enforce.sortmergebucketmapjoin=true, hive.security.metastore.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider, hive.metastore.warehouse.external.dir=/warehouse/tablespace/external/hive, hive.auto.convert.sortmerge.join=true, hive.server2.thrift.max.worker.threads=500, hive.metastore.failure.retries=24, hive.map.aggr.hash.min.reduction=0.5, hive.stats.fetch.partition.stats=true, hive.orc.splits.include.file.footer=false, hive.execution.engine=tez, hive.exec.compress.output=false, hive.metastore.kerberos.keytab.file=/etc/security/keytabs/hive.service.keytab, hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook, hive.optimize.metadataonly=true, hive.compactor.delta.pct.threshold=0.1f, hive.server2.tez.sessions.per.default.queue=1, hive.exec.parallel.thread.number=8, hive.tez.dynamic.partition.pruning.max.data.size=104857600, hive.auto.convert.join.noconditionaltask=true, hive.exec.max.created.files=100000, hive.compactor.delta.num.threshold=10, hive.materializedview.rewriting.incremental=false, hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook, hive.smbjoin.cache.rows=10000, hive.server2.authentication=NONE, hive.optimize.index.filter=true, hive.driver.parallel.compilation=true, hive.optimize.null.scan=true, hive.tez.bucket.pruning=true,
hive.server2.tez.default.queues=default, hive.server2.allow.user.substitution=true, hive.metastore.client.connect.retry.delay=5s, hive.vectorized.execution.enabled=true, hive.metastore.warehouse.dir=/warehouse/tablespace/managed/hive, hive.auto.convert.join=true, hive.mapjoin.bucket.cache.size=10000, hive.conf.restricted.list=hive.security.authenticator.manager,hive.security.authorization.manager,hive.users.in.admin.role, hive.mapjoin.hybridgrace.hashtable=false, hive.server2.webui.port=10002, hive.tez.smb.number.waves=0.5, hive.server2.enable.doAs=true, atlas.hook.hive.minThreads=1, hive.txn.timeout=300, hive.tez.cartesian-product.enabled=true, hive.metastore.pre.event.listeners=org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener, hive.vectorized.execution.reduce.enabled=true, hadoop.security.credential.provider.path=jceks://file/usr/hdp/current/hive-server2/conf/hive-site.jceks, javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver, hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator, hive.vectorized.groupby.checkinterval=4096, hive.compute.query.using.stats=true, hive.default.fileformat=TextFile, hive.exec.scratchdir=/tmp/hive, hive.strict.managed.tables=true, hive.exec.max.dynamic.partitions.pernode=2000, hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true, hive.exec.dynamic.partition=true, hive.cluster.delegation.token.store.class=org.apache.hadoop.hive.thrift.ZooKeeperTokenStore, hive.metastore.client.socket.timeout=1800s, hive.map.aggr.hash.force.flush.memory.threshold=0.9, hive.vectorized.execution.mapjoin.minmax.enabled=true, hive.security.metastore.authenticator.manager=org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator, hive.server2.idle.operation.timeout=6h, hive.tez.cpu.vcores=-1, hive.vectorized.execution.mapjoin.native.enabled=true, hive.server2.webui.enable.cors=true, hive.tez.log.level=INFO, hive.cli.print.header=false, hive.fetch.task.aggr=false, hive.orc.compute.splits.num.threads=10, hive.limit.pushdown.memory.usage=0.04, hive.tez.container.size=7168, hive.server2.webui.use.ssl=false, hive.metastore.server.max.threads=100000, hive.server2.zookeeper.namespace=hiveserver2, hive.server2.authentication.spnego.principal=/etc/security/keytabs/spnego.service.keytab, javax.jdo.option.ConnectionURL=jdbc:mysql://caldomain/hive?createDatabaseIfNotExist=true, hive.metastore.kerberos.principal=hive/_HOST@EXAMPLE.COM, hive.exec.parallel=false, hive.merge.size.per.task=256000000, hive.execution.mode=container, hive.exec.reducers.max=1009, hive.fetch.task.conversion.threshold=1073741824, hive.metastore.execute.setugi=true, hive.load.data.owner=hive, hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager, hive.metastore.cache.pinobjtypes=Table,Database,Type,FieldSchema,Order, hive.tez.dynamic.partition.pruning.max.event.size=1048576, hive.metastore.dml.events=true, hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat, hive.merge.mapredfiles=false, hive.server2.max.start.attempts=5, hive.metastore.sasl.enabled=false, hive.metastore.connect.retries=24, hive.zookeeper.quorum=cocaldomain:2181, hive.optimize.constant.propagation=true, hive.optimize.reducededuplication=true, hive.fetch.task.conversion=more, atlas.hook.hive.maxThreads=1, hive.map.aggr.hash.percentmemory=0.5, hive.mapjoin.optimized.hashtable=true, hive.stats.fetch.column.stats=true, hive.tez.auto.reducer.parallelism=true, hive.exec.max.dynamic.partitions=5000, 
hive.server2.table.type.mapping=CLASSIC, hive.vectorized.groupby.flush.percent=0.1, hive.exec.submitviachild=false, hive.merge.orcfile.stripe.level=true, hive.exec.dynamic.partition.mode=nonstrict, hive.metastore.db.type=MYSQL, hive.merge.tezfiles=false, hive.compactor.initiator.on=true, datanucleus.autoCreateSchema=false, hive.server2.support.dynamic.service.discovery=true, hive.convert.join.bucket.mapjoin.tez=false, hive.prewarm.numcontainers=3, hive.stats.autogather=true, hive.map.aggr=true, hive.exec.failure.hooks=org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook, hive.service.metrics.codahale.reporter.classes=org.apache.hadoop.hive.common.metrics.metrics2.JsonFileMetricsReporter,org.apache.hadoop.hive.common.metrics.metrics2.JmxMetricsReporter,org.apache.hadoop.hive.common.metrics.metrics2.Metrics2Reporter, hive.tez.max.partition.factor=2.0, hive.server2.transport.mode=binary, hive.optimize.reducededuplication.min.reducer=4, hive.server2.authentication.spnego.keytab=HTTP/_HOST@EXAMPLE.COM, hive.optimize.bucketmapjoin.sortedmerge=false, hive.support.concurrency=true, hive.mapred.reduce.tasks.speculative.execution=false, hive.default.fileformat.managed=ORC, hive.server2.use.SSL=false, metastore.create.as.acid=true, hive.exec.orc.split.strategy=HYBRID, hive.cbo.enable=true, datanucleus.fixedDatastore=true, hive.exec.reducers.bytes.per.reducer=67108864, hive.compactor.check.interval=300, hive.create.as.insert.only=true, hive.exec.submit.local.task.via.child=true, hive.server2.idle.session.timeout=1d, hive.txn.strict.locking.mode=false, hive.optimize.dynamic.partition.hashjoin=true, hive.security.authorization.enabled=false, hive.tez.java.opts=-server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps, hive.user.install.directory=/user/}

2018-07-19 05:16:04,540 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStoreClient - Trying to connect to metastore with URI thrift://localdomain:9083

2018-07-19 05:16:04,540 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStoreClient - Opened a connection to metastore, current connections: 1

2018-07-19 05:16:04,541 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStoreClient - Connected to metastore.

2018-07-19 05:16:04,541 [main] INFO org.apache.hadoop.hive.metastore.RetryingMetaStoreClient - RetryingMetaStoreClient proxy=class org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient ugi=horton (auth:SIMPLE) retries=24 delay=5 lifetime=0

2018-07-19 05:16:04,557 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1115: org.apache.hive.hcatalog.common.HCatException : 2016 : Error operation not supported : Store into a transactional table from Pig/Mapreduce is not supported

Details at logfile: /tmp/hsperfdata_horton/pig_1531977029670.log

grunt>

ACCEPTED SOLUTION


@rganeshbabu

Inserting data into an ACID (transactional) table or into a bucketed table from Pig is not supported, hence the error.

Workaround:

1. Load the data into a non-transactional table.

2. From the Hive client, insert the data from the non-transactional table into the transactional table:

insert into acid_table select * from non_acid_table;
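For illustration, a minimal end-to-end sketch of this workaround. The staging table name and column schema here are hypothetical (modeled on the exchangedata table discussed below), and the staging table must already exist in the metastore, since HCatStorer only writes to existing Hive tables:

-- Hive: create a non-transactional staging table
CREATE EXTERNAL TABLE `staging_exchangedata`(
  `day` string, `euro` float, `yen` float, `dollar` float)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
TBLPROPERTIES ('transactional'='false');

-- Pig (started with: pig -useHCatalog): load the file and store through HCatalog
A = load '/user/horton/text.txt' using PigStorage(',')
    as (day:chararray, euro:float, yen:float, dollar:float);
store A into 'default.staging_exchangedata' using org.apache.hive.hcatalog.pig.HCatStorer();

-- Hive: copy from the staging table into the ACID table
insert into acid_table select * from staging_exchangedata;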


4 REPLIES


Cloudera Employee

I have created an external table with the table property 'transactional'='false' and tried inserting through Pig. It worked.

Note: I am not able to create a Hive managed table with the transactional property set to 'false'.

Error:

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table default.exchangedata failed strict managed table checks due to the following reason: Table is marked as a managed table but is not transactional.) (state=08S01,code=1)
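This check comes from hive.strict.managed.tables=true (visible in the configuration dump above): a managed table must be transactional, so a non-transactional table has to be declared EXTERNAL. A sketch of the two variants that should pass the check (table names are hypothetical):

-- External, non-transactional: can be written from Pig via HCatStorer
CREATE EXTERNAL TABLE `exchangedata_ext`(
  `day` string, `euro` float, `yen` float, `dollar` float)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
TBLPROPERTIES ('transactional'='false');

-- Managed, full ACID: requires ORC and 'transactional'='true',
-- but cannot be written from Pig/MapReduce (the original error)
CREATE TABLE `exchangedata_acid`(
  `day` string, `euro` float, `yen` float, `dollar` float)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');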


@rganeshbabu

Can you share the DDL of the transactional table?

Cloudera Employee

CREATE TABLE `exchangedata`(
  `day` string,
  `euro` float,
  `yen` float,
  `dollar` float)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim'=',',
  'serialization.format'=',')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
TBLPROPERTIES (
  'transactional'='false');