Member since: 02-04-2022
Posts: 14
Kudos Received: 0
Solutions: 3
My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 867 | 02-24-2022 09:22 AM |
|  | 683 | 02-24-2022 09:17 AM |
|  | 2022 | 02-24-2022 06:14 AM |
06-07-2022
02:56 AM
I am getting the following error when I run a Tez query (as the hdfs user):

INFO : Cleaning up the staging area file:/tmp/hadoop/mapred/staging/hdfs1254373830/.staging/job_local1254373830_0002
ERROR : Job Submission failed with exception 'org.apache.hadoop.util.DiskChecker$DiskErrorException(No space available in any of the local directories.)'
org.apache.hadoop.util.DiskChecker$DiskErrorException: No space available in any of the local directories.
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:416)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:165)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:130)
at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:123)
at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:172)
at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:794)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:251)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:576)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:571)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:571)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:562)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:423)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:149)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224)
at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:330)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. No space available in any of the local directories.

/tmp is a symlink:

bash-4.2$ ls -lh /tmp
lrwxrwxrwx 1 root root 8 Jun 2 13:20 /tmp -> /mnt/tmp

/mnt has enough space:

/dev/dev1 195G 3.7G 192G 2% /mnt

I get a "Permission denied" error when creating a directory under /tmp:

bash-4.2$ mkdir /tmp/hadoop/mapred/staging/1
mkdir: cannot create directory '/tmp/hadoop/mapred/staging/1': Permission denied
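For reference, a minimal sketch of the usual cause: when /tmp is relocated onto another mount, the target directory often loses the mode 1777 (world-writable plus sticky bit) that /tmp requires, which produces exactly this "Permission denied" for non-root users. The scratch directory below stands in for /mnt/tmp, which is an assumption about this particular system:

```shell
# Demonstration on a scratch directory standing in for /mnt/tmp (assumed path).
target=$(mktemp -d)

chmod 755 "$target"     # simulates a relocated /tmp that lost its mode:
                        # only the owner can create entries, so other
                        # users get "Permission denied"
chmod 1777 "$target"    # the usual fix: world-writable + sticky bit,
                        # the mode a normal /tmp has
stat -c '%a' "$target"  # prints 1777

rm -rf "$target"
```

If `ls -ld /mnt/tmp` on the box shows anything other than `drwxrwxrwt`, then `chmod 1777 /mnt/tmp` (as root) should let the hdfs user create its staging directory again.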
Labels:
- Apache Hadoop
- Apache Hive
- HDFS
05-07-2022
02:47 AM
I have multiple MR/Spark jobs that use regular expressions to filter the input files. Is there any way to get the list of files actually processed by an MR/Spark application after the job has completed?
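One possible approach for the MapReduce side, sketched under the assumption that YARN log aggregation is enabled: map tasks log each input split at INFO level as `Processing split: <path>:<start>+<length>`, so the processed files can be recovered from the aggregated logs after the job finishes. The application id below is a placeholder; Spark jobs log differently and would need a different pattern:

```shell
# Placeholder application id; substitute the finished job's id.
APP_ID=application_1234567890123_0001

# Extract unique input paths from the map tasks' "Processing split" lines.
yarn logs -applicationId "$APP_ID" \
  | grep 'Processing split:' \
  | awk '{print $NF}' \
  | sed 's/:[0-9+]*$//' \
  | sort -u
```

The `sed` step strips the trailing `:<start>+<length>` split offsets so the same file read in several splits collapses to one path.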
Labels:
- Apache Hadoop
- Apache Spark
- Apache YARN
- HDFS
03-17-2022
08:55 AM
I am getting the following error while running a TPC-H query. I saw the following JIRA related to the error (https://issues.apache.org/jira/browse/HIVE-11427) but am not sure how to apply it. I am using Hive 3.1.3000.7.1.6.0-297.

Error: Error while compiling statement: FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://testmach:8020/apps/hive/warehouse/tpch_flat_orc_2.db. Error encountered near token 'TOK_TMP_FILE' (state=42000,code=40000)

0: jdbc:hive2://testmach:10000> use tpch_flat_orc_120;
INFO : Compiling command(queryId=hive_20220317154641_5560d6bf-4866-48b0-a990-9c5ea44f6dbd): use tpch_flat_orc_120
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20220317154641_5560d6bf-4866-48b0-a990-9c5ea44f6dbd); Time taken: 0.017 seconds
INFO : Executing command(queryId=hive_20220317154641_5560d6bf-4866-48b0-a990-9c5ea44f6dbd): use tpch_flat_orc_120
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220317154641_5560d6bf-4866-48b0-a990-9c5ea44f6dbd); Time taken: 0.015 seconds
INFO : OK
No rows affected (0.049 seconds)
0: jdbc:hive2://testmach:10000>
0: jdbc:hive2://testmach:10000>
0: jdbc:hive2://testmach:10000> Beeline version 3.1.3000.7.1.6.0-297 by Apache Hive
0: jdbc:hive2://testmach:10000> use tpch_flat_orc_2;
INFO : Compiling command(queryId=hive_20220317154646_e6389298-4543-45cf-bfbd-7cac999f299c): use tpch_flat_orc_2
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20220317154646_e6389298-4543-45cf-bfbd-7cac999f299c); Time taken: 0.008 seconds
INFO : Executing command(queryId=hive_20220317154646_e6389298-4543-45cf-bfbd-7cac999f299c): use tpch_flat_orc_2
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220317154646_e6389298-4543-45cf-bfbd-7cac999f299c); Time taken: 0.01 seconds
INFO : OK
No rows affected (0.034 seconds)
0: jdbc:hive2://testmach:10000> source tpch_query18.sql;
22/03/17 15:47:09 [main]: WARN conf.HiveConf: HiveConf of name hive.masking.algo does not exist
INFO : Compiling command(queryId=hive_20220317154709_026bc1c2-6bf0-4bd3-8c63-a48d50ea3dba): drop view q18_tmp_cached
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20220317154709_026bc1c2-6bf0-4bd3-8c63-a48d50ea3dba); Time taken: 0.035 seconds
INFO : Executing command(queryId=hive_20220317154709_026bc1c2-6bf0-4bd3-8c63-a48d50ea3dba): drop view q18_tmp_cached
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220317154709_026bc1c2-6bf0-4bd3-8c63-a48d50ea3dba); Time taken: 0.07 seconds
INFO : OK
No rows affected (0.127 seconds)
INFO : Compiling command(queryId=hive_20220317154709_9128d000-bb7e-43c8-876b-074f753c980a): drop table q18_large_volume_customer_cached
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20220317154709_9128d000-bb7e-43c8-876b-074f753c980a); Time taken: 0.018 seconds
INFO : Executing command(queryId=hive_20220317154709_9128d000-bb7e-43c8-876b-074f753c980a): drop table q18_large_volume_customer_cached
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220317154709_9128d000-bb7e-43c8-876b-074f753c980a); Time taken: 0.012 seconds
INFO : OK
No rows affected (0.049 seconds)
INFO : Compiling command(queryId=hive_20220317154709_a234bdff-3fb7-4cb2-a808-17d30baf1a20): create view q18_tmp_cached as
select
l_orderkey,
sum(l_quantity) as t_sum_quantity
from
lineitem
where
l_orderkey is not null
group by
l_orderkey
INFO : Semantic Analysis Completed (retrial = false)
INFO : Created Hive schema: Schema(fieldSchemas:[FieldSchema(name:l_orderkey, type:bigint, comment:null), FieldSchema(name:t_sum_quantity, type:double, comment:null)], properties:null)
INFO : Completed compiling command(queryId=hive_20220317154709_a234bdff-3fb7-4cb2-a808-17d30baf1a20); Time taken: 0.08 seconds
INFO : Executing command(queryId=hive_20220317154709_a234bdff-3fb7-4cb2-a808-17d30baf1a20): create view q18_tmp_cached as
select
l_orderkey,
sum(l_quantity) as t_sum_quantity
from
lineitem
where
l_orderkey is not null
group by
l_orderkey
INFO : Starting task [Stage-1:DDL] in serial mode
INFO : Completed executing command(queryId=hive_20220317154709_a234bdff-3fb7-4cb2-a808-17d30baf1a20); Time taken: 0.033 seconds
INFO : OK
No rows affected (0.153 seconds)
Error: Error while compiling statement: FAILED: SemanticException 0:0 Error creating temporary folder on: hdfs://testmach:8020/apps/hive/warehouse/tpch_flat_orc_2.db. Error encountered near token 'TOK_TMP_FILE' (state=42000,code=40000)
0: jdbc:hive2://testmach:10000>
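For what it's worth, this SemanticException typically means HiveServer2 could not create its `.hive-staging` scratch directory under the database location. A sketch of the checks, using the path from the error message above; the `hive:hive` ownership is an assumption about this cluster:

```shell
# Database location taken from the error message above.
DB_DIR=/apps/hive/warehouse/tpch_flat_orc_2.db

# Who owns the database directory, and what are its permissions?
hdfs dfs -ls -d "$DB_DIR"

# If the service user cannot write there, an HDFS admin can fix the
# ownership (hive:hive is an assumption, check your cluster's setup):
# hdfs dfs -chown -R hive:hive "$DB_DIR"
```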
Labels:
- Apache Hadoop
- Apache Hive
- Apache Tez
- HDFS
02-24-2022
10:42 AM
The Hive CLI has a --database option to select a database:

--database <databasename>    Specify the database to use

What is the Beeline equivalent?
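For reference, a sketch of the Beeline side: there is no --database flag, but the database can be given as the path component of the JDBC URL, or selected with a USE statement after connecting (host, port, and database names below are placeholders):

```shell
# Database as the path component of the JDBC URL:
beeline -u jdbc:hive2://testmach:10000/tpch_flat_orc_2 -e 'show tables;'

# Or connect to the default database and switch explicitly:
beeline -u jdbc:hive2://testmach:10000/default -e 'use tpch_flat_orc_2; show tables;'
```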
Labels:
- Apache Hadoop
- Apache Hive
02-24-2022
09:22 AM
The -d option is not supported in Beeline. The issue was resolved using --hivevar instead.
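A minimal sketch of the substitution (the connection URL and variable names are placeholders): variables set with --hivevar are referenced as ${var} in the SQL, which covers what the Hive CLI's -d/--define provided:

```shell
# DB and LOCATION are defined on the command line and expanded wherever
# ${DB} / ${LOCATION} appear in the statements or script:
beeline -u jdbc:hive2://testmach:10000/default \
  --hivevar DB=tpch_text_2 \
  --hivevar LOCATION=/tmp/tpch-generate/2 \
  -e 'use ${DB};'
```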
02-24-2022
08:50 AM
What is the Beeline equivalent of the Hive CLI -d option?

-d,--define <key=value>    Variable substitution to apply to Hive commands
Labels:
- Apache Hadoop
- Apache Hive
02-24-2022
07:50 AM
I am following the instructions to run the Hive TPC-H benchmark from https://github.com/hortonworks/hive-testbench.git on Cloudera Enterprise Trial 7.3.1. I solved the initial error (https://community.cloudera.com/t5/Support-Questions/Error-while-running-hive-tpch-Exception-in-thread-quot-main/m-p/336879/thread-id/232418/highlight/false#M232465) and tried to rerun tpch-setup.sh, but I am now running into the following error:

+ echo 'TPC-H text data generation complete.'
TPC-H text data generation complete.
+ echo 'Loading text data into external tables.'
Loading text data into external tables.
+ runcommand 'beeline -u jdbc:hive2://techmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/alltables.sql -d DB=tpch_text_2 -d LOCATION=/tmp/tpch-generate/2'
+ '[' X '!=' X ']'
+ beeline -u jdbc:hive2://techmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/alltables.sql -d DB=tpch_text_2 -d LOCATION=/tmp/tpch-generate/2
+ i=1
+ total=8
+ test 2 -le 1000
+ SCHEMA_TYPE=flat
+ DATABASE=tpch_flat_orc_2
+ MAX_REDUCERS=2600
++ test 2 -gt 2600
++ echo 2
+ REDUCERS=2
+ for t in '${TABLES}'
+ echo 'Optimizing table part (1/8).'
Optimizing table part (1/8).
+ COMMAND='beeline -u jdbc:hive2://techmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/part.sql -d DB=tpch_flat_orc_2 -d SOURCE=tpch_text_2 -d BUCKETS=13 -d SCALE=2 -d REDUCERS=2 -d FILE=orc'
+ runcommand 'beeline -u jdbc:hive2://techmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/part.sql -d DB=tpch_flat_orc_2 -d SOURCE=tpch_text_2 -d BUCKETS=13 -d SCALE=2 -d REDUCERS=2 -d FILE=orc'
+ '[' X '!=' X ']'
+ beeline -u jdbc:hive2://techmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/part.sql -d DB=tpch_flat_orc_2 -d SOURCE=tpch_text_2 -d BUCKETS=13 -d SCALE=2 -d REDUCERS=2 -d FILE=orc
+ '[' 1 -ne 0 ']'
+ echo 'Command failed, try '\''export DEBUG_SCRIPT=ON'\'' and re-running'
Command failed, try 'export DEBUG_SCRIPT=ON' and re-running
+ exit 1

Below is the output of the failing command when run from the console:

beeline -u jdbc:hive2://testmach.com:10000/default -i settings/load-flat.sql -f ddl-tpch/bin_flat/part.sql -d DB=tpch_flat_orc_2 -d SOURCE=tpch_text_2 -d BUCKETS=13 -d SCALE=2 -d REDUCERS=2 -d FILE=orc
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/jars/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/jars/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-7.1.6-1.cdh7.1.6.p0.10506313/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Connecting to jdbc:hive2://testmach.com:10000/default
DB=tpch_flat_orc_2
Running init script settings/load-flat.sql
0: jdbc:hive2://testmach.com:10000/d (closed)> --set hive.enforce.bucketing=true;
0: jdbc:hive2://testmach.com:10000/d (closed)> --set hive.enforce.sorting=true;
0: jdbc:hive2://testmach.com:10000/d (closed)> set hive.exec.dynamic.partition.mode=nonstrict;
DB=tpch_flat_orc_2
No current connection
init script execution failed.

load-flat.sql: https://github.com/hortonworks/hive-testbench/blob/hdp3/settings/load-flat.sql
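A note on reading the session above: the `(closed)` in the Beeline prompt means the JDBC connection was never established, so every statement in the init script fails with "No current connection". Checking connectivity on its own can isolate that (the hostname is the one from the session above):

```shell
# If this also fails, the problem is the HiveServer2 connection itself
# (host, port, or authentication), not the init script:
beeline -u jdbc:hive2://testmach.com:10000/default -e 'select 1;'
```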
Labels:
- Apache Hadoop
- Apache Hive
- Apache YARN
02-24-2022
06:14 AM
In hive-testbench/tpch-gen/pom.xml, I changed the Hadoop version and the issue was resolved:

<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.4.0</version>
    <scope>compile</scope>
  </dependency>
</dependencies>
02-23-2022
02:31 AM
I am following the instructions to run the Hive TPC-H benchmark from https://github.com/hortonworks/hive-testbench.git on Cloudera Enterprise Trial 7.3.1. I am running into the following error:

[root@test_mach hive-testbench]# ./tpch-setup.sh 3
+ '[' X3 = X ']'
+ '[' X = X ']'
+ DIR=/tmp/tpch-generate
+ '[' 3 -eq 1 ']'
+ hdfs dfs -mkdir -p /tmp/tpch-generate
+ hdfs dfs -ls /tmp/tpch-generate/3/lineitem
ls: `/tmp/tpch-generate/3/lineitem': No such file or directory
+ '[' 1 -ne 0 ']'
+ echo 'Generating data at scale factor 3.'
Generating data at scale factor 3.
+ cd tpch-gen
+ hadoop jar target/tpch-gen-1.0-SNAPSHOT.jar -d /tmp/tpch-generate/3/ -s 3
WARNING: Use "yarn jar" to launch YARN applications.
Exception in thread "main" java.lang.IllegalAccessError: class org.apache.hadoop.hdfs.web.HftpFileSystem cannot access its superinterface org.apache.hadoop.hdfs.web.TokenAspect$TokenManagementDelegator
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:3337)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3382)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3422)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:158)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3485)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3453)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:518)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:266)
at org.notmysock.tpch.GenTable.genInput(GenTable.java:171)
at org.notmysock.tpch.GenTable.run(GenTable.java:98)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.notmysock.tpch.GenTable.main(GenTable.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
+ hdfs dfs -ls /tmp/tpch-generate/3/lineitem
ls: `/tmp/tpch-generate/3/lineitem': No such file or directory
+ '[' 1 -ne 0 ']'
+ echo 'Data generation failed, exiting.'
Data generation failed, exiting.
+ exit 1