Member since 05-25-2017 · 6 Posts · 0 Kudos Received · 0 Solutions
06-07-2017
04:28 PM
I'm getting the same results as before. I used an empty directory for ALTER TABLE. I can run INSERT and SELECT, but not SELECT with ORDER BY.
06-07-2017
11:42 AM
Hi,
I renamed the HDFS directory before running the ALTER TABLE command, so a directory with the new name exists and contains files.
The problem is that, given the same location string (e.g. 'hdfs:/tmp/a1'), CREATE TABLE and ALTER TABLE behave differently.
- CREATE TABLE expands the location to a full URI (e.g. 'hdfs://sandbox.hortonworks.com:8020/tmp/a1'), and everything works fine.
- ALTER TABLE does not expand the location. With the short location, some operations fail: I can run INSERT and SELECT, but SELECT with ORDER BY fails with the error above. Also, if I open an altered table in Hue and click "View File Location", it fails. So the short location works for some operations and not for others (see the sketch below).
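For illustration, a minimal sketch of the asymmetry, reusing the table and paths from my original post (the sandbox host name comes from that example):
hive> create table a1 (i int) location 'hdfs:/tmp/a1';
hive> describe formatted a1;    -- Location: hdfs://sandbox.hortonworks.com:8020/tmp/a1 (expanded)
hive> alter table a1 set location 'hdfs:/tmp/a2';
hive> describe formatted a1;    -- Location: hdfs:/tmp/a2 (stored exactly as typed)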
Thanks
06-05-2017
02:28 PM
The directory already exists when I run the ALTER TABLE. BTW, I can run INSERT and it creates a new file in the directory. Also, SELECT without ORDER BY works.
06-05-2017
12:31 PM
Hi,
When creating a Hive table with a LOCATION clause, the location is expanded to the full URI:
hive> create table a1 (i int) location 'hdfs:/tmp/a1';
Describing the table, the location is: hdfs://sandbox.hortonworks.com:8020/tmp/a1
But when changing the location with the SET LOCATION command:
hive> alter table a1 set location 'hdfs:/tmp/a2';
the location remains exactly as typed in the command, i.e. hdfs:/tmp/a2
Now, running SELECT with ORDER BY fails:
hive> select * from a1 order by i;
Query ID = root_20170604162926_0ef4e586-0eae-4db6-9b4f-0bc35a993187
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
java.io.IOException: cannot find dir = hdfs://sandbox.hortonworks.com:8020/tmp/a2/000000_0_copy_1 in pathToPartitionInfo: [hdfs:/tmp/a2]
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:352)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:314)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.<init>(CombineHiveInputFormat.java:161)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:490)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:571)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:328)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:320)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:403)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:140)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1748)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1494)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1291)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1158)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1148)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Job Submission failed with exception 'java.io.IOException(cannot find dir = hdfs://sandbox.hortonworks.com:8020/tmp/a2/000000_0_copy_1 in pathToPartitionInfo: [hdfs:/tmp/a2])'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. cannot find dir = hdfs://sandbox.hortonworks.com:8020/tmp/a2/000000_0_copy_1 in pathToPartitionInfo: [hdfs:/tmp/a2]
What is the recommended way to specify the location in ALTER TABLE? The full location is not always available, e.g. when using Knox or HDFS high availability (see the sketch below).
Thanks
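To illustrate what I mean by the full location not always being available: with HDFS high availability, the location would normally use the HA nameservice rather than an explicit host:port, roughly like this (the nameservice name 'mycluster' is made up for illustration):
hive> alter table a1 set location 'hdfs://mycluster/tmp/a2';   -- 'mycluster' is a hypothetical HA nameservice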
Labels:
- Apache Hadoop
- Apache Hive
05-28-2017
06:33 AM
Hi @akapratwar,
This is only one example; other parameter names were changed as well. My application may work with different Hadoop versions and different ODBC drivers, depending on the environment. What I'm trying to figure out is which Hadoop version introduced the parameter-name change, and why the older ODBC driver does not require any whitelist configuration on Hive.
Thanks
05-25-2017
04:21 PM
Hi,
I'm using the HDP 2.6 sandbox, connecting to Hive via ODBC and running this command:
set hive.enforce.bucketing=true
I noticed some strange behavior:
- ODBC driver version 2.1.2.1002 – works fine, without any additional Hive configuration.
- ODBC driver version 2.1.5.1006 – doesn't work; it requires adding the parameter to the whitelist append list:
[HY000][80][Hortonworks][Hardy] (80) Syntax or semantic analysis error thrown in server while executing query. Error message from server: Error while processing statement: Cannot modify enforce.bucketing at runtime. It is not in list of params that are allowed to be modified at runtime
I added hive.enforce.bucketing to the whitelist and it still didn't work. Then I noticed that the error refers to enforce.bucketing, without the "hive." prefix. So I added the shorter name to the whitelist as well, and now both "enforce.bucketing" and "hive.enforce.bucketing" can be set via ODBC (the whitelist change is sketched below).
This is just one example; I found other parameters that changed to shorter names as well. In previous HDP versions, the long names were used. Do you know the exact version in which this change took place, and why? And how come the old ODBC driver works fine without adding anything to the whitelist?
Thanks
Labels: