Member since 05-25-2017 · 6 Posts · 0 Kudos Received · 0 Solutions
06-07-2017
04:28 PM
I'm getting the same results as before. I used an empty directory for ALTER TABLE. I can run INSERT and SELECT, but not SELECT with ORDER BY.
06-07-2017
11:42 AM
Hi,
I renamed the HDFS directory before running the ALTER TABLE command, so a directory with the new name exists and contains files.
The problem is that, given the same location string (e.g. 'hdfs:/tmp/a1'), CREATE TABLE and ALTER TABLE behave differently.
- CREATE TABLE expands the location to a full URI (e.g. 'hdfs://sandbox.hortonworks.com:8020/tmp/a1'), and everything works fine.
- ALTER TABLE does not expand the location. With the short location, some operations fail: I can run INSERT and SELECT, but SELECT with ORDER BY fails with the error above. Also, if I open an altered table in Hue and click "View File Location", it fails. So the short location works for some operations and not for others (see the sketch below).
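For illustration, a minimal sketch of the asymmetry, reusing the table and paths from my original post (the sandbox host name comes from that example):
hive> create table a1 (i int) location 'hdfs:/tmp/a1';
hive> describe formatted a1;    -- Location: hdfs://sandbox.hortonworks.com:8020/tmp/a1 (expanded)
hive> alter table a1 set location 'hdfs:/tmp/a2';
hive> describe formatted a1;    -- Location: hdfs:/tmp/a2 (stored exactly as typed)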
Thanks
06-05-2017
02:28 PM
The directory already exists when I run the ALTER TABLE. BTW, I can run INSERT and it creates a new file in the directory. Also, SELECT without ORDER BY works.
06-05-2017
12:31 PM
Hi,
When creating a Hive table with a LOCATION clause, the location is expanded to the full URI:
hive> create table a1 (i int) location 'hdfs:/tmp/a1';
Describing the table, the location is: hdfs://sandbox.hortonworks.com:8020/tmp/a1
But when changing the location with the SET LOCATION command:
hive> alter table a1 set location 'hdfs:/tmp/a2';
the location remains exactly as typed in the command, i.e. hdfs:/tmp/a2
Now, running SELECT with ORDER BY fails:
hive> select * from a1 order by i;
Query ID = root_20170604162926_0ef4e586-0eae-4db6-9b4f-0bc35a993187
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
java.io.IOException: cannot find dir = hdfs://sandbox.hortonworks.com:8020/tmp/a2/000000_0_copy_1 in pathToPartitionInfo: [hdfs:/tmp/a2]
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:352)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:314)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.<init>(CombineHiveInputFormat.java:161)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:490)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:571)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:328)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:320)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:403)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:140)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1748)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1494)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1291)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1158)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1148)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Job Submission failed with exception 'java.io.IOException(cannot find dir = hdfs://sandbox.hortonworks.com:8020/tmp/a2/000000_0_copy_1 in pathToPartitionInfo: [hdfs:/tmp/a2])'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. cannot find dir = hdfs://sandbox.hortonworks.com:8020/tmp/a2/000000_0_copy_1 in pathToPartitionInfo: [hdfs:/tmp/a2]
What is the recommended way to specify the location in ALTER TABLE? The full location is not always available, e.g. when using Knox or HDFS high availability (see the sketch below).
Thanks
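To illustrate what I mean by the full location not always being available: with HDFS high availability, the location would normally use the HA nameservice rather than an explicit host:port, roughly like this (the nameservice name 'mycluster' is made up for illustration):
hive> alter table a1 set location 'hdfs://mycluster/tmp/a2';   -- 'mycluster' is a hypothetical HA nameservice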
Labels:
- Apache Hadoop
- Apache Hive
05-28-2017
06:33 AM
Hi @akapratwar,
This is only one example; other parameter names were changed as well. My application may work with different Hadoop versions and different ODBC drivers, depending on the environment. What I'm trying to figure out is which Hadoop version introduced the parameter-name change, and why the older ODBC driver does not require any whitelist configuration on Hive.
Thanks
05-25-2017
04:21 PM
Hi,
I'm using the HDP 2.6 sandbox, connecting to Hive via ODBC and running this command:
set hive.enforce.bucketing=true
I noticed some strange behavior:
- ODBC driver version 2.1.2.1002 – works fine, without any additional Hive configuration.
- ODBC driver version 2.1.5.1006 – doesn't work; it requires adding the parameter to the whitelist append list:
[HY000][80][Hortonworks][Hardy] (80) Syntax or semantic analysis error thrown in server while executing query. Error message from server: Error while processing statement: Cannot modify enforce.bucketing at runtime. It is not in list of params that are allowed to be modified at runtime
I added hive.enforce.bucketing to the whitelist and it still didn't work. Then I noticed that the error refers to enforce.bucketing, without the "hive." prefix. So I added the shorter name to the whitelist as well, and now both "enforce.bucketing" and "hive.enforce.bucketing" can be set via ODBC (the whitelist change is sketched below).
This is just one example; I found other parameters that changed to shorter names as well. In previous HDP versions, the long names were used. Do you know the exact version in which this change took place, and why? And how come the old ODBC driver works fine without adding anything to the whitelist?
Thanks
Labels: