Member since: 02-04-2022
Posts: 14
Kudos Received: 0
Solutions: 3
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 867 | 02-24-2022 09:22 AM |
|  | 683 | 02-24-2022 09:17 AM |
|  | 2022 | 02-24-2022 06:14 AM |
06-07-2022
02:56 AM
I am getting the following error when I run a Tez query (as the hdfs user):

INFO : Cleaning up the staging area file:/tmp/hadoop/mapred/staging/hdfs1254373830/.staging/job_local1254373830_0002
ERROR : Job Submission failed with exception 'org.apache.hadoop.util.DiskChecker$DiskErrorException(No space available in any of the local directories.)'
org.apache.hadoop.util.DiskChecker$DiskErrorException: No space available in any of the local directories.
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:416)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:130)
at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:123)
at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:172)
at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:794)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:251)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:423)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:330)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. No space available in any of the local directories.

/tmp is a symlink:

bash-4.2$ ls -lh /tmp
lrwxrwxrwx 1 root root 8 Jun 2 13:20 /tmp -> /mnt/tmp

/mnt has enough space:

/dev/dev1 195G 3.7G 192G 2% /mnt

I get a "Permission denied" error when creating a directory under /tmp:

bash-4.2$ mkdir /tmp/hadoop/mapred/staging/1
mkdir: cannot create directory '/tmp/hadoop/mapred/staging/1': Permission denied
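For reference, a minimal sketch of the usual cause: when /tmp is relocated onto another mount, the target directory often loses the mode 1777 (world-writable plus sticky bit) that /tmp requires, which produces exactly this "Permission denied" for non-root users. The scratch directory below stands in for /mnt/tmp, which is an assumption about this particular system:

```shell
# Demonstration on a scratch directory standing in for /mnt/tmp (assumed path).
target=$(mktemp -d)

chmod 755 "$target"     # simulates a relocated /tmp that lost its mode:
                        # only the owner can create entries, so other
                        # users get "Permission denied"
chmod 1777 "$target"    # the usual fix: world-writable + sticky bit,
                        # the mode a normal /tmp has
stat -c '%a' "$target"  # prints 1777

rm -rf "$target"
```

If `ls -ld /mnt/tmp` on the box shows anything other than `drwxrwxrwt`, then `chmod 1777 /mnt/tmp` (as root) should let the hdfs user create its staging directory again.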
Labels:
- Apache Hadoop
- Apache Hive
- HDFS
05-07-2022
02:47 AM
I have multiple MR/Spark jobs that use regular expressions to filter the input files. Is there any way to get the list of files actually processed by an MR/Spark application after the job has completed?
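One possible approach for the MapReduce side, sketched under the assumption that YARN log aggregation is enabled: map tasks log each input split at INFO level as `Processing split: <path>:<start>+<length>`, so the processed files can be recovered from the aggregated logs after the job finishes. The application id below is a placeholder; Spark jobs log differently and would need a different pattern:

```shell
# Placeholder application id; substitute the finished job's id.
APP_ID=application_1234567890123_0001

# Extract unique input paths from the map tasks' "Processing split" lines.
yarn logs -applicationId "$APP_ID" \
  | grep 'Processing split:' \
  | awk '{print $NF}' \
  | sed 's/:[0-9+]*$//' \
  | sort -u
```

The `sed` step strips the trailing `:<start>+<length>` split offsets so the same file read in several splits collapses to one path.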
Labels:
- Apache Hadoop
- Apache Spark
- Apache YARN
- HDFS
03-17-2022
08:55 AM
I am getting the following error while running a TPC-H query. I saw the following JIRA related to the error (https://issues.apache.org/jira/browse/HIVE-11427) but am not sure how to apply it. I am using Hive 3.1.3000.7.1.6.0-297.

Error: Error while compiling statement: FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://testmach:8020/apps/hive/warehouse/tpch_flat_orc_2.db. Error encountered near token 'TOK_TMP_FILE' (state=42000,code=40000)

0: jdbc:hive2://testmach:10000> use tpch_flat_orc_120;
INFO : Compiling command(queryId=hive_20220317154641_5560d6bf-4866-48b0-a990-9c5ea44f6dbd): use tpch_flat_orc_120
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20220317154641_5560d6bf-4866-48b0-a990-9c5ea44f6dbd); Time taken: 0.017 seconds
INFO : Executing command(queryId=hive_20220317154641_5560d6bf-4866-48b0-a990-9c5ea44f6dbd): use tpch_flat_orc_120
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220317154641_5560d6bf-4866-48b0-a990-9c5ea44f6dbd); Time taken: 0.015 seconds
INFO : OK
No rows affected (0.049 seconds)
0: jdbc:hive2://testmach:10000>
0: jdbc:hive2://testmach:10000>
0: jdbc:hive2://testmach:10000> Beeline version 3.1.3000.7.1.6.0-297 by Apache Hive
0: jdbc:hive2://testmach:10000> use tpch_flat_orc_2;
INFO : Compiling command(queryId=hive_20220317154646_e6389298-4543-45cf-bfbd-7cac999f299c): use tpch_flat_orc_2
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20220317154646_e6389298-4543-45cf-bfbd-7cac999f299c); Time taken: 0.008 seconds
INFO : Executing command(queryId=hive_20220317154646_e6389298-4543-45cf-bfbd-7cac999f299c): use tpch_flat_orc_2
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220317154646_e6389298-4543-45cf-bfbd-7cac999f299c); Time taken: 0.01 seconds
INFO : OK
No rows affected (0.034 seconds)
0: jdbc:hive2://testmach:10000> source tpch_query18.sql;
22/03/17 15:47:09 [main]: WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
INFO : Compiling command(queryId=hive_20220317154709_026bc1c2-6bf0-4bd3-8c63-a48d50ea3dba): drop view q18_tmp_cached
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20220317154709_026bc1c2-6bf0-4bd3-8c63-a48d50ea3dba); Time taken: 0.035 seconds
INFO : Executing command(queryId=hive_20220317154709_026bc1c2-6bf0-4bd3-8c63-a48d50ea3dba): drop view q18_tmp_cached
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220317154709_026bc1c2-6bf0-4bd3-8c63-a48d50ea3dba); Time taken: 0.07 seconds
INFO : OK
No rows affected (0.127 seconds)
INFO : Compiling command(queryId=hive_20220317154709_9128d000-bb7e-43c8-876b-074f753c980a): drop table q18_large_volume_customer_cached
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20220317154709_9128d000-bb7e-43c8-876b-074f753c980a); Time taken: 0.018 seconds
INFO : Executing command(queryId=hive_20220317154709_9128d000-bb7e-43c8-876b-074f753c980a): drop table q18_large_volume_customer_cached
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220317154709_9128d000-bb7e-43c8-876b-074f753c980a); Time taken: 0.012 seconds
INFO : OK
No rows affected (0.049 seconds)
INFO : Compiling command(queryId=hive_20220317154709_a234bdff-3fb7-4cb2-a808-17d30baf1a20): create view q18_tmp_cached as
select
l_orderkey,
sum(l_quantity) as t_sum_quantity
from
lineitem
where
l_orderkey is not null
group by
l_orderkey
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:l_orderkey, type:bigint, comment:null), FieldSchema(name:t_sum_quantity, type:double, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20220317154709_a234bdff-3fb7-4cb2-a808-17d30baf1a20); Time taken: 0.08 seconds
INFO : Executing command(queryId=hive_20220317154709_a234bdff-3fb7-4cb2-a808-17d30baf1a20): create view q18_tmp_cached as
select
l_orderkey,
sum(l_quantity) as t_sum_quantity
from
lineitem
where
l_orderkey is not null
group by
l_orderkey
INFO : Starting task [Stage-1:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220317154709_a234bdff-3fb7-4cb2-a808-17d30baf1a20); Time taken: 0.033 seconds
INFO : OK
No rows affected (0.153 seconds)
Error: Error while compiling statement: FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://testmach:8020/apps/hive/warehouse/tpch_flat_orc_2.db. Error encountered near token 'TOK_TMP_FILE' (state=42000,code=40000)
0: jdbc:hive2://testmach:10000>
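For what it's worth, this SemanticException typically means HiveServer2 could not create its `.hive-staging` scratch directory under the database location. A sketch of the checks, using the path from the error message above; the `hive:hive` ownership is an assumption about this cluster:

```shell
# Database location taken from the error message above.
DB_DIR=/apps/hive/warehouse/tpch_flat_orc_2.db

# Who owns the database directory, and what are its permissions?
hdfs dfs -ls -d "$DB_DIR"

# If the service user cannot write there, an HDFS admin can fix the
# ownership (hive:hive is an assumption, check your cluster's setup):
# hdfs dfs -chown -R hive:hive "$DB_DIR"
```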
Labels:
- Apache Hadoop
- Apache Hive
- Apache Tez
- HDFS
02-24-2022
10:42 AM
The Hive CLI has a --database option to select a database:

--database <databasename>    Specify the database to use

What is the Beeline equivalent?
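For reference, a sketch of the Beeline side: there is no --database flag, but the database can be given as the path component of the JDBC URL, or selected with a USE statement after connecting (host, port, and database names below are placeholders):

```shell
# Database as the path component of the JDBC URL:
beeline -u jdbc:hive2://testmach:10000/tpch_flat_orc_2 -e 'show tables;'

# Or connect to the default database and switch explicitly:
beeline -u jdbc:hive2://testmach:10000/default -e 'use tpch_flat_orc_2; show tables;'
```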
Labels:
- Apache Hadoop
- Apache Hive
02-24-2022
09:22 AM
The -d option is not supported in Beeline. The issue was resolved using --hivevar instead.
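A minimal sketch of the substitution (the connection URL and variable names are placeholders): variables set with --hivevar are referenced as ${var} in the SQL, which covers what the Hive CLI's -d/--define provided:

```shell
# DB and LOCATION are defined on the command line and expanded wherever
# ${DB} / ${LOCATION} appear in the statements or script:
beeline -u jdbc:hive2://testmach:10000/default \
  --hivevar DB=tpch_text_2 \
  --hivevar LOCATION=/tmp/tpch-generate/2 \
  -e 'use ${DB};'
```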
02-24-2022
08:50 AM
What is the Beeline equivalent of the Hive CLI -d option?

-d,--define <key=value>    Variable substitution to apply to Hive commands
Labels:
- Apache Hadoop
- Apache Hive
02-24-2022
07:50 AM
I am following the instructions to run the Hive TPC-H benchmark from https://github.com/hortonworks/hive-testbench.git on Cloudera Enterprise Trial 7.3.1. I solved the initial error (https://community.cloudera.com/t5/Support-Questions/Error-while-running-hive-tpch-Exception-in-thread-quot-main/m-p/336879/thread-id/232418/highlight/false#M232465) and tried to rerun tpch-setup.sh, but I am now running into the following error:

+ echo 'TPC-H text data generation complete.'
TPC-H text data generation complete.
+ echo 'Loading text data into external tables.'
Loading text data into external tables.
+ runcommand 'beeline -u jdbc:hive2://techmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/alltables.sql -d DB=tpch_text_2 -d LOCATION=/tmp/tpch-generate/2'
+ '[' X '!=' X ']'
+ beeline -u jdbc:hive2://techmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/alltables.sql -d DB=tpch_text_2 -d LOCATION=/tmp/tpch-generate/2
+ i=1
+ total=8
+ test 2 -le 1000
+ SCHEMA_TYPE=flat
+ DATABASE=tpch_flat_orc_2
+ MAX_REDUCERS=2600
++ test 2 -gt 2600
++ echo 2
+ REDUCERS=2
+ for t in '${TABLES}'
+ echo 'Optimizing table part (1/8).'
Optimizing table part (1/8).
+ COMMAND='beeline -u jdbc:hive2://techmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/part.sql -d DB=tpch_flat_orc_2 -d SOURCE=tpch_text_2 -d BUCKETS=13 -d SCALE=2 -d REDUCERS=2 -d FILE=orc'
+ runcommand 'beeline -u jdbc:hive2://techmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/part.sql -d DB=tpch_flat_orc_2 -d SOURCE=tpch_text_2 -d BUCKETS=13 -d SCALE=2 -d REDUCERS=2 -d FILE=orc'
+ '[' X '!=' X ']'
+ beeline -u jdbc:hive2://techmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/part.sql -d DB=tpch_flat_orc_2 -d SOURCE=tpch_text_2 -d BUCKETS=13 -d SCALE=2 -d REDUCERS=2 -d FILE=orc
+ '[' 1 -ne 0 ']'
+ echo 'Command failed, try '\''export DEBUG_SCRIPT=ON'\'' and re-running'
Command failed, try 'export DEBUG_SCRIPT=ON' and re-running
+ exit 1

Below is the output of the failing command when run from the console:

beeline -u jdbc:hive2://testmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/part.sql -d DB=tpch_flat_orc_2 -d SOURCE=tpch_text_2 -d BUCKETS=13 -d SCALE=2 -d REDUCERS=2 -d FILE=orc
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/jars/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/jars/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://testmach.com:10000/default
DB=tpch_flat_orc_2
Running init script settings/load-flat.sql
0: jdbc:hive2://testmach.com:10000/d (closed)> --set hive.enforce.bucketing=true;
0: jdbc:hive2://testmach.com:10000/d (closed)> --set hive.enforce.sorting=true;
0: jdbc:hive2://testmach.com:10000/d (closed)> set hive.exec.dynamic.partition.mode=nonstrict;
DB=tpch_flat_orc_2
No current connection
init script execution failed.

load-flat.sql: https://github.com/hortonworks/hive-testbench/blob/hdp3/settings/load-flat.sql
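A note on reading the session above: the `(closed)` in the Beeline prompt means the JDBC connection was never established, so every statement in the init script fails with "No current connection". Checking connectivity on its own can isolate that (the hostname is the one from the session above):

```shell
# If this also fails, the problem is the HiveServer2 connection itself
# (host, port, or authentication), not the init script:
beeline -u jdbc:hive2://testmach.com:10000/default -e 'select 1;'
```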
Labels:
- Apache Hadoop
- Apache Hive
- Apache YARN
02-24-2022
06:14 AM
In hive-testbench/tpch-gen/pom.xml, I changed the Hadoop version and the issue was resolved:

<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.4.0</version>
    <scope>compile</scope>
  </dependency>
</dependencies>
02-23-2022
02:31 AM
I am following the instructions to run the Hive TPC-H benchmark from https://github.com/hortonworks/hive-testbench.git on Cloudera Enterprise Trial 7.3.1. I am running into the following error:

[root@test_mach hive-testbench]# ./tpch-setup.sh 3
+ '[' X3 = X ']'
+ '[' X = X ']'
+ DIR=/tmp/tpch-generate
+ '[' 3 -eq 1 ']'
+ hdfs dfs -mkdir -p /tmp/tpch-generate
+ hdfs dfs -ls /tmp/tpch-generate/3/lineitem
ls: `/tmp/tpch-generate/3/lineitem': No such file or directory
+ '[' 1 -ne 0 ']'
+ echo 'Generating data at scale factor 3.'
Generating data at scale factor 3.
+ cd tpch-gen
+ hadoop jar target/tpch-gen-1.0-SNAPSHOT.jar -d /tmp/tpch-generate/3/ -s 3
WARNING: Use "yarn jar" to launch YARN applications.
Exception in thread "main" java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem cannot access its superinterface org.apache.hadoop.hdfs.web.TokenAspect$TokenManagementDelegator
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:3337)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3382)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3422)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:158)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3485)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3453)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:518)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:266)
at org.notmysock.tpch.GenTable.genInput(GenTable.java:171)
at org.notmysock.tpch.GenTable.run(GenTable.java:98)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.notmysock.tpch.GenTable.main(GenTable.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
+ hdfs dfs -ls /tmp/tpch-generate/3/lineitem
ls: `/tmp/tpch-generate/3/lineitem': No such file or directory
+ '[' 1 -ne 0 ']'
+ echo 'Data generation failed, exiting.'
Data generation failed, exiting.
+ exit 1