Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

Pig/MapReduce not connecting while writing into Hive

Cloudera Employee

pig -useHCatalog

grunt> A = load '/user/horton/text.txt' using PigStorage(',');

grunt> store A into 'default.exchangedata' using org.apache.hive.hcatalog.pig.HCatStorer();

2018-07-19 05:16:04,535 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.stats.fetch.partition.stats does not exist

2018-07-19 05:16:04,536 [main] WARN org.apache.hadoop.hive.conf.HiveConf - HiveConf of name hive.heapsize does not exist

2018-07-19 05:16:04,536 [main] INFO org.apache.hive.hcatalog.common.HCatUtil - mapreduce.lib.hcatoutput.hive.conf not set. Generating configuration differences.

2018-07-19 05:16:04,537 [main] INFOorg.apache.hive.hcatalog.common.HCatUtil - Configuration differences={hive.txn.max.open.batch=1000, credentialStoreClassPath=/var/lib/ambari-agent/cred/lib/*, hive.auto.convert.sortmerge.join.to.mapjoin=true, hive.server2.thrift.sasl.qop=auth, hive.auto.convert.join.noconditionaltask.size=2004318071, hive.cluster.delegation.token.store.zookeeper.connectString=hiost:2181, hive.compactor.worker.threads=2, hive.cluster.delegation.token.store.zookeeper.znode=/hive/cluster/delegation, hive.tez.dynamic.partition.pruning=true, hive.hook.proto.base-directory=/warehouse/tablespace/external/hive/sys.db/query_data/, hive.stats.dbclass=fs, hive.server2.logging.operation.enabled=true, hive.prewarm.enabled=false, hive.vectorized.groupby.maxentries=100000, hive.compactor.abortedtxn.threshold=1000, hive.merge.smallfiles.avgsize=16000000, hive.metastore.transactional.event.listeners=org.apache.hive.hcatalog.listener.DbNotificationListener, hive.server2.logging.operation.log.location=/tmp/hive/operation_logs, hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory, hive.security.metastore.authorization.auth.reads=true, hive.server2.tez.initialize.default.sessions=false, hive.server2.webui.cors.allowed.headers=X-Requested-With,Content-Type,Accept,Origin,X-Requested-By,x-requested-by, hive.optimize.sort.dynamic.partition=false, hive.optimize.bucketmapjoin=true, ambari.hive.db.schema.name=hive, hive.tez.min.partition.factor=0.25, hive.tez.input.generate.consistent.splits=true, hive.metastore.authorization.storage.checks=false, hive.zookeeper.namespace=hive_zookeeper_namespace, hive.merge.rcfile.block.level=true, hive.metastore.uris=thrift://host:9083, hive.compactor.worker.timeout=86400, hive.exec.compress.intermediate=false, datanucleus.cache.level2.type=none, hive.server2.thrift.http.path=cliservice, hive.limit.optimize.enable=true, hive.server2.thrift.http.port=10001, 
hive.zookeeper.client.port=2181, javax.jdo.option.ConnectionUserName=hive, hive.tez.exec.print.summary=true, hive.server2.thrift.port=10000, hive.merge.mapfiles=true, hive.heapsize=1024, hive.enforce.sortmergebucketmapjoin=true, hive.security.metastore.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider, hive.metastore.warehouse.external.dir=/warehouse/tablespace/external/hive, hive.auto.convert.sortmerge.join=true, hive.server2.thrift.max.worker.threads=500, hive.metastore.failure.retries=24, hive.map.aggr.hash.min.reduction=0.5, hive.stats.fetch.partition.stats=true, hive.orc.splits.include.file.footer=false, hive.execution.engine=tez, hive.exec.compress.output=false, hive.metastore.kerberos.keytab.file=/etc/security/keytabs/hive.service.keytab, hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook, hive.optimize.metadataonly=true, hive.compactor.delta.pct.threshold=0.1f, hive.server2.tez.sessions.per.default.queue=1, hive.exec.parallel.thread.number=8, hive.tez.dynamic.partition.pruning.max.data.size=104857600, hive.auto.convert.join.noconditionaltask=true, hive.exec.max.created.files=100000, hive.compactor.delta.num.threshold=10, hive.materializedview.rewriting.incremental=false, hive.exec.pre.hooks=org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook, hive.smbjoin.cache.rows=10000, hive.server2.authentication=NONE, hive.optimize.index.filter=true, hive.driver.parallel.compilation=true, hive.optimize.null.scan=true, hive.tez.bucket.pruning=true, hive.server2.tez.default.queues=default, hive.server2.allow.user.substitution=true, hive.metastore.client.connect.retry.delay=5s, hive.vectorized.execution.enabled=true, hive.metastore.warehouse.dir=/warehouse/tablespace/managed/hive, hive.auto.convert.join=true, hive.mapjoin.bucket.cache.size=10000, hive.conf.restricted.list=hive.security.authenticator.manager,hive.security.authorization.manager,hive.users.in.admin.role, 
hive.mapjoin.hybridgrace.hashtable=false, hive.server2.webui.port=10002, hive.tez.smb.number.waves=0.5, hive.server2.enable.doAs=true, atlas.hook.hive.minThreads=1, hive.txn.timeout=300, hive.tez.cartesian-product.enabled=true, hive.metastore.pre.event.listeners=org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener, hive.vectorized.execution.reduce.enabled=true, hadoop.security.credential.provider.path=jceks://file/usr/hdp/current/hive-server2/conf/hive-site.jceks, javax.jdo.option.ConnectionDriverName=com.mysql.jdbc.Driver, hive.security.authenticator.manager=org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator, hive.vectorized.groupby.checkinterval=4096, hive.compute.query.using.stats=true, hive.default.fileformat=TextFile, hive.exec.scratchdir=/tmp/hive, hive.strict.managed.tables=true, hive.exec.max.dynamic.partitions.pernode=2000, hive.vectorized.execution.mapjoin.native.fast.hashtable.enabled=true, hive.exec.dynamic.partition=true, hive.cluster.delegation.token.store.class=org.apache.hadoop.hive.thrift.ZooKeeperTokenStore, hive.metastore.client.socket.timeout=1800s, hive.map.aggr.hash.force.flush.memory.threshold=0.9, hive.vectorized.execution.mapjoin.minmax.enabled=true, hive.security.metastore.authenticator.manager=org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator, hive.server2.idle.operation.timeout=6h, hive.tez.cpu.vcores=-1, hive.vectorized.execution.mapjoin.native.enabled=true, hive.server2.webui.enable.cors=true, hive.tez.log.level=INFO, hive.cli.print.header=false, hive.fetch.task.aggr=false, hive.orc.compute.splits.num.threads=10, hive.limit.pushdown.memory.usage=0.04, hive.tez.container.size=7168, hive.server2.webui.use.ssl=false, hive.metastore.server.max.threads=100000, hive.server2.zookeeper.namespace=hiveserver2, hive.server2.authentication.spnego.principal=/etc/security/keytabs/spnego.service.keytab, javax.jdo.option.ConnectionURL=jdbc:mysql://caldomain/hive?createDatabaseIfNotExist=true, 
hive.metastore.kerberos.principal=hive/_HOST@EXAMPLE.COM, hive.exec.parallel=false, hive.merge.size.per.task=256000000, hive.execution.mode=container, hive.exec.reducers.max=1009, hive.fetch.task.conversion.threshold=1073741824, hive.metastore.execute.setugi=true, hive.load.data.owner=hive, hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager, hive.metastore.cache.pinobjtypes=Table,Database,Type,FieldSchema,Order, hive.tez.dynamic.partition.pruning.max.event.size=1048576, hive.metastore.dml.events=true, hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat, hive.merge.mapredfiles=false, hive.server2.max.start.attempts=5, hive.metastore.sasl.enabled=false, hive.metastore.connect.retries=24, hive.zookeeper.quorum=cocaldomain:2181, hive.optimize.constant.propagation=true, hive.optimize.reducededuplication=true, hive.fetch.task.conversion=more, atlas.hook.hive.maxThreads=1, hive.map.aggr.hash.percentmemory=0.5, hive.mapjoin.optimized.hashtable=true, hive.stats.fetch.column.stats=true, hive.tez.auto.reducer.parallelism=true, hive.exec.max.dynamic.partitions=5000, hive.server2.table.type.mapping=CLASSIC, hive.vectorized.groupby.flush.percent=0.1, hive.exec.submitviachild=false, hive.merge.orcfile.stripe.level=true, hive.exec.dynamic.partition.mode=nonstrict, hive.metastore.db.type=MYSQL, hive.merge.tezfiles=false, hive.compactor.initiator.on=true, datanucleus.autoCreateSchema=false, hive.server2.support.dynamic.service.discovery=true, hive.convert.join.bucket.mapjoin.tez=false, hive.prewarm.numcontainers=3, hive.stats.autogather=true, hive.map.aggr=true, hive.exec.failure.hooks=org.apache.hadoop.hive.ql.hooks.HiveProtoLoggingHook, hive.service.metrics.codahale.reporter.classes=org.apache.hadoop.hive.common.metrics.metrics2.JsonFileMetricsReporter,org.apache.hadoop.hive.common.metrics.metrics2.JmxMetricsReporter,org.apache.hadoop.hive.common.metrics.metrics2.Metrics2Reporter, hive.tez.max.partition.factor=2.0, hive.server2.transport.mode=binary, 
hive.optimize.reducededuplication.min.reducer=4, hive.server2.authentication.spnego.keytab=HTTP/_HOST@EXAMPLE.COM, hive.optimize.bucketmapjoin.sortedmerge=false, hive.support.concurrency=true, hive.mapred.reduce.tasks.speculative.execution=false, hive.default.fileformat.managed=ORC, hive.server2.use.SSL=false, metastore.create.as.acid=true, hive.exec.orc.split.strategy=HYBRID, hive.cbo.enable=true, datanucleus.fixedDatastore=true, hive.exec.reducers.bytes.per.reducer=67108864, hive.compactor.check.interval=300, hive.create.as.insert.only=true, hive.exec.submit.local.task.via.child=true, hive.server2.idle.session.timeout=1d, hive.txn.strict.locking.mode=false, hive.optimize.dynamic.partition.hashjoin=true, hive.security.authorization.enabled=false, hive.tez.java.opts=-server -Djava.net.preferIPv4Stack=true -XX:NewRatio=8 -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+PrintGCDetails -verbose:gc -XX:+PrintGCTimeStamps, hive.user.install.directory=/user/}

2018-07-19 05:16:04,540 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStoreClient - Trying to connect to metastore with URI thrift://localdomain:9083

2018-07-19 05:16:04,540 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStoreClient - Opened a connection to metastore, current connections: 1

2018-07-19 05:16:04,541 [main] INFO org.apache.hadoop.hive.metastore.HiveMetaStoreClient - Connected to metastore.

2018-07-19 05:16:04,541 [main] INFO org.apache.hadoop.hive.metastore.RetryingMetaStoreClient - RetryingMetaStoreClient proxy=class org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient ugi=horton (auth:SIMPLE) retries=24 delay=5 lifetime=0

2018-07-19 05:16:04,557 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1115: org.apache.hive.hcatalog.common.HCatException : 2016 : Error operation not supported : Store into a transactional table from Pig/Mapreduce is not supported

Details at logfile: /tmp/hsperfdata_horton/pig_1531977029670.log

grunt>

1 ACCEPTED SOLUTION


@rganeshbabu

Inserting data into an ACID table or a bucketed table from Pig is not supported; hence the error is seen.

Workaround:

1. Load the data into a non-transactional table.

2. From the Hive client, load the data from the non-transactional table into the transactional table.

insert into acid_table select * from non_acid_table;
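End to end, the two steps above might look like this (a sketch only; the staging and target table names are illustrative, and the column list is taken from the DDL later in this thread):

```sql
-- 1. Staging table without ACID properties, so Pig's HCatStorer can write to it
CREATE EXTERNAL TABLE staging_exchangedata (
  `day` string,
  `euro` float,
  `yen` float,
  `dollar` float)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
TBLPROPERTIES ('transactional'='false');

-- 2. After Pig stores into the staging table, copy into the ACID table from Hive
INSERT INTO acid_table SELECT * FROM staging_exchangedata;
```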


4 REPLIES


Cloudera Employee

I created an external table with the table property 'transactional'='false' and tried inserting through Pig. It worked.

Note: I am not able to create a Hive managed table with the transactional property set to false.

Error:

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Table default.exchangedata failed strict managed table checks due to the following reason: Table is marked as a managed table but is not transactional.) (state=08S01,code=1)
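This error is consistent with hive.strict.managed.tables=true, which is visible in the configuration dump earlier in this thread: under strict managed tables, every managed table must be transactional, so a non-transactional table has to be created as EXTERNAL instead. A sketch of the two valid forms (table names are illustrative, not from the thread):

```sql
-- With hive.strict.managed.tables=true, a managed table must be transactional:
CREATE TABLE managed_ok (`day` string)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- A non-transactional table must be created as EXTERNAL instead:
CREATE EXTERNAL TABLE external_ok (`day` string)
TBLPROPERTIES ('transactional'='false');
```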


@rganeshbabu

Can you share the DDL of the transactional table?

Cloudera Employee

CREATE TABLE `exchangedata`(
  `day` string,
  `euro` float,
  `yen` float,
  `dollar` float)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'field.delim'=',',
  'serialization.format'=',')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
TBLPROPERTIES (
  'transactional'='false');
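This DDL is the non-transactional table that Pig can write to. Following the accepted answer, a matching managed, transactional target table could then be created and loaded from the Hive client (a sketch; the target table name is illustrative, and ORC matches the cluster's hive.default.fileformat.managed shown in the log above):

```sql
CREATE TABLE exchangedata_acid (
  `day` string,
  `euro` float,
  `yen` float,
  `dollar` float)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

INSERT INTO exchangedata_acid SELECT * FROM exchangedata;
```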