Member since: 03-31-2016
Posts: 33
Kudos Received: 3
Solutions: 1

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 750 | 07-19-2016 11:58 AM |
03-26-2019
08:34 AM
I have an hql file with 15 Hive queries in it, which I run through Oozie. Whenever a query fails, Oozie's retry mechanism re-runs the entire hql file. I do not want to re-run the whole file; is there a way to resume from the query that failed?
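A hedged workaround sketch (not from the thread): split the hql file into one file per query so each query can back its own workflow action, letting a retry resume at the failed action instead of re-running all 15. Assumes GNU csplit, that every query ends with a ';' at the end of a line, and the hypothetical file name queries.hql:

#!/usr/bin/env bash
# Split queries.hql into query_000.hql, query_001.hql, ... one query per file,
# assuming each query ends with ';' at the end of a line.
csplit --quiet --prefix=query_ --suffix-format='%03d.hql' queries.hql '/;$/+1' '{*}'
# Run them in order, stopping at the first failure so a retry can resume there.
for f in query_*.hql; do
  hive -f "$f" || { echo "Failed at $f" >&2; exit 1; }
done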
02-22-2017
06:09 AM
Hello, We have a requirement where two of our clusters share the same Hive metastore, with the data stored in S3. Due to a requirement we had to move the S3 buckets. After the move we repaired the tables from one cluster and everything there runs fine. On the other cluster we did not repair the tables, assuming the repair from the first cluster would suffice. But on the second cluster the queries fail with "S3 path does not exist". So, is it not sufficient to repair the tables from one cluster, which updates the shared Hive metastore? Is it required to repair the same tables from the other cluster as well?
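For reference, a hedged sketch of the kind of repair involved (table and bucket names are hypothetical): partition locations live in the shared metastore, so pointing the table at the new bucket and re-discovering partitions should in principle take effect for every cluster using that metastore, though each cluster's own caches and configs can still get in the way.

# Point the table at the new bucket, then re-discover partition directories:
hive -e "ALTER TABLE sales SET LOCATION 's3a://new-bucket/warehouse/sales';"
hive -e "MSCK REPAIR TABLE sales;"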
01-12-2017
10:18 AM
@Rahul Pathak All the OS users use a ppk file to log in. Is there a way to provide the ppk file in either beeline or SQuirreL?
01-12-2017
09:38 AM
Hello, Currently I am able to log into my server from SQuirreL by providing just the JDBC connection string and a username, e.g. jdbc:hive2://localhost:10000/default with username hive. This way, anybody can connect to Hive. I want to restrict this and make each user provide their own username along with a password. Is there a way to achieve this? I have not set up Kerberos yet; is there any other way to achieve this without setting up Kerberos?
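A hedged sketch of one non-Kerberos option: HiveServer2 also supports LDAP, PAM, and CUSTOM password authentication via the hive.server2.authentication property in hive-site.xml. Once one of those is enabled, every client connection must carry valid credentials (the username and password below are placeholders):

# With hive.server2.authentication=LDAP (or PAM/CUSTOM) in hive-site.xml,
# an anonymous connection is rejected and each user must authenticate:
beeline -u "jdbc:hive2://localhost:10000/default" -n alice -p 'alice-password'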
11-03-2016
05:24 AM
Hello, Earlier we used to point Hive external tables' location to S3. We now have a requirement to point them to a local filesystem path such as /tmp rather than HDFS. Can this be achieved in Hive?
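A hedged sketch of how that would look, with hypothetical table and path names; note that a file:// path must be visible on the node where the query actually executes, which is the usual catch on a multi-node cluster:

# External table backed by a local directory instead of HDFS or S3:
hive -e "CREATE EXTERNAL TABLE tmp_events (id INT, payload STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'file:///tmp/events/';"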
09-09-2016
10:26 AM
Hello, We are using HDP 2.2.9, which has Hive 0.14. We use Hive for ETL. We have a query that, when run on AWS EMR, takes half the time it takes on our HDP cluster. We compared all the Hive properties between the two clusters and matched them. The capacity of the HDP cluster is greater than the EMR cluster's. EMR uses MR and HDP uses Tez, so the processing itself is quite fast on HDP: the reducer phase finishes quite early, but creating the partitions takes a huge amount of time (almost double) when run with HDP's Hive. We are using ORC format and zlib compression. Are there any properties that affect the creation and write performance of partitions? How can we bring down the total time? We have followed all the recommendations from many posts here, including https://community.hortonworks.com/articles/22419/hive-on-tez-performance-tuning-determining-reducer.html, but we are still not able to improve the performance.
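A hedged sketch of settings commonly tried for slow dynamic-partition ORC writes (table names are hypothetical; hive.optimize.sort.dynamic.partition was added in Hive 0.14, so it should be available here):

hive -e "
SET hive.optimize.sort.dynamic.partition=true;  -- one open ORC writer per partition instead of one per reducer
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.max.dynamic.partitions=10000;
INSERT OVERWRITE TABLE target PARTITION (dt) SELECT * FROM staging;"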
08-22-2016
05:55 AM
@Sunile Manjee, it uses map-reduce. I have gone through the post earlier. Is there a similar configuration for map-reduce?
08-22-2016
05:44 AM
Hello, I am launching 7 Sqoop jobs in parallel, which launches 14 containers. Even after the jobs are Finished with Final Status Succeeded in the RM, I see 7 more containers still running and holding up resources. They stop only after 10 minutes. Is there a configuration affecting this? I see the part files and _SUCCESS markers generated well before that, and I cannot figure out what extra work is being done in those 10 minutes.
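A hedged diagnostic sketch for this kind of situation: list which applications still hold containers after the Sqoop jobs report SUCCEEDED, then inspect one of them (the attempt id below is a placeholder modeled on the job id seen later in this thread):

# Which applications are still RUNNING and holding resources?
yarn application -list -appStates RUNNING
# Which containers does a suspect application attempt still hold?
yarn container -list appattempt_1470385625721_0038_000001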
08-18-2016
08:58 AM
Hi @Kuldeep Kulkarni, It took 5-10 seconds to change the status from ACCEPTED to RUNNING for all the 7 jobs.
08-18-2016
06:58 AM
1 Kudo
Hi @shiremath If I clone, then I am concerned about the IP addresses in the HDP configuration files. I am badly stuck at this point.
08-18-2016
06:43 AM
Hello @shiremath My motive for cloning is the configurations, user accounts, and OS changes I have made in the existing cluster, which I want in the new cluster as well. I am not sure whether a Cloudbreak template would serve that purpose.
08-18-2016
06:09 AM
Hi @Kuldeep Kulkarni There is only a single queue, configured as below. All I am running is just 7 jobs, with this much capacity available. Do you still feel anything is missing in the configuration below, or should I look at the MapReduce side?
Absolute Capacity: 100.0%
Absolute Max Capacity: 100.0%
Max Applications: 10000
Max Applications Per User: 10000
Max Application Master Resources: <memory:614400, vCores:276>
Max Application Master Resources Per User: <memory:614400, vCores:276>
Configured Capacity: 100.0%
Configured Max Capacity: 100.0%
Configured Minimum User Limit Percent: 100%
Configured User Limit Factor: 1.0
Accessible Node Labels: *
Preemption: disabled
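A hedged sketch for cross-checking the live queue state rather than the static config; the ResourceManager REST API exposes current scheduler usage (hostname is a placeholder):

# Dump the live scheduler/queue state, including used vs. configured capacity:
curl -s http://resourcemanager:8088/ws/v1/cluster/scheduler | python -m json.tool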
08-18-2016
05:51 AM
1 Kudo
Hello, My question might sound naive, but is there a way to clone an existing HDP cluster in AWS? Using Cloud Controller, we can create a new cluster and then clone it. But is there a way to register an existing cluster in Cloud Controller and then clone it, or to clone an HDP cluster directly?
08-18-2016
04:49 AM
Hello @Kuldeep Kulkarni Before implementing the changes, I did a small test. I created a normal workflow (without any memory configurations) to Sqoop 7 tables in parallel using Oozie's fork-and-join method. This ran for 11 minutes. I then created a shell script to launch the 7 jobs in parallel. This too ran for 11 minutes. One major difference I observed is that with the shell script, some jobs finish in 4 minutes, some in 8, and some take 11. In Oozie, because we run them as a batch using fork and join, jobs that finish early still wait until the other jobs in the batch finish. Now here is the catch: if I run any of these 7 jobs individually, it takes no more than 4 minutes to finish. Since the shell script spawns the jobs in parallel, the whole run should finish in about 4 minutes, but instead 2 of the 7 jobs take 11 minutes. Below are my settings from yarn-site.xml and mapred-site.xml. I am really not sure what I am missing; can you help me see this through?
Mapred-site.xml
<property>
<name>mapreduce.map.memory.mb</name>
<value>3072</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx2455m</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>8192</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx6553m</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx4096m</value>
</property>
Yarn-site.xml
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>122880</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>122880</value>
</property>
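For comparison, a hedged sketch of the shell-script launch described above, timing each job individually (the job names are hypothetical saved Sqoop jobs):

# Launch 7 saved Sqoop jobs in parallel and report each one's wall time:
for t in table1 table2 table3 table4 table5 table6 table7; do
  ( start=$(date +%s)
    sqoop job --exec "import_$t"
    echo "import_$t finished in $(( $(date +%s) - start ))s" ) &
done
wait   # block until all background jobs complete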
08-17-2016
05:30 AM
Hello @Ameet Paranjape Earlier I mentioned that I had drilled down to the root cause. My mistake, I was wrong; the issue is still there. There is no problem with available resources, as the cluster has 600 GB of RAM and we are running only 10 jobs in parallel. Configuring the fair scheduler didn't help. Does it have anything to do with the Oozie configuration?
08-12-2016
08:23 AM
Hello @Ameet Paranjape, The fair scheduler, too, was taking the same amount of time. I have drilled down to the root cause and created another question to address the issue here.
08-06-2016
06:43 AM
Hello @Ameet Paranjape I had set up two queues in the Capacity Scheduler so that the Oozie launchers and Oozie actions were separated, but it still took the same time. So is a fair scheduler really required to make this work?
08-05-2016
10:18 AM
Hello, We are running a Sqoop action through Oozie. A single Sqoop job run from the command line finishes in 5 minutes, but when scheduled through Oozie it takes 15 minutes. We plan to run 50 Sqoop jobs in parallel; we tried 10 and they too take around 15 minutes each, whereas the average time to finish a single job is around 3-4 minutes. While inspecting the logs, we found the Heart Beat issue. I have already given it a lot of memory, but the issue is still there. Below are the relevant parts of workflow.xml, yarn-site.xml, and mapred-site.xml.
Logs
4974 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
5007 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
5016 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
5036 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Using default fetchSize of 1000
5036 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
5445 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.OracleManager - Time zone has been set to GMT
5513 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT * FROM RMS.RMS_DXH_EVENT_LOG WHERE INSERT_DATETIME >= to_timestamp('2016-05-28 11', 'YYYY-MM-DD HH24')AND INSERT_DATETIME < to_timestamp('2016-05-28 12', 'YYYY-MM-DD HH24') AND (1 = 0)
5524 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT * FROM RMS.RMS_DXH_EVENT_LOG WHERE INSERT_DATETIME >= to_timestamp('2016-05-28 11', 'YYYY-MM-DD HH24')AND INSERT_DATETIME < to_timestamp('2016-05-28 12', 'YYYY-MM-DD HH24') AND (1 = 0)
5583 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /opt/hadoop/hadoop-2.7.2
6869 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-hadoop/compile/cad1dab0e45211fb2421690860d98843/QueryResult.jar
6879 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning query import.
Heart beat
Heart beat
Heart beat
...
690272 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Transferred 3.1365 MB in 683.365 seconds (4.7 KB/sec)
690277 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Retrieved 5057 records.
<<< Invocation of Sqoop command completed <<<
Hadoop Job IDs executed by Sqoop: job_1470385625721_0038
<<< Invocation of Main class completed <<<
Workflow.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<workflow-app xmlns="uri:oozie:workflow:0.4" name="oozie_batch_type_wf">
<start to="RMS_DXH_EVENT_LOG6"/>
<action name="RMS_DXH_EVENT_LOG6">
<sqoop xmlns="uri:oozie:sqoop-action:0.2">
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>oozie.launcher.mapreduce.map.memory.mb</name>
<value>3072</value>
</property>
<property>
<name>oozie.launcher.mapreduce.reduce.memory.mb</name>
<value>6144</value>
</property>
<property>
<name>oozie.launcher.mapreduce.child.java.opts</name>
<value>-Xmx8g</value>
</property>
<property>
<name>oozie.launcher.mapred.job.queue.name</name>
<value>default</value>
</property>
</configuration>
<arg>import</arg>
<arg>--connect</arg>
Yarn-site.xml
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>122880</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>55</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>5</value>
</property>
Mapred-site.xml
<property>
<name>mapreduce.map.memory.mb</name>
<value>3072</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx2455m</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>6144</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx4915m</value>
</property>
<property>
<name>mapreduce.job.maps</name>
<value>10</value>
</property>
<property>
<name>mapreduce.job.reduces</name>
<value>10</value>
</property>
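A hedged sketch of one commonly suggested mitigation for the Heart Beat symptom (not confirmed as the fix here): run the Oozie launcher jobs in their own queue so a burst of launchers cannot starve the actual Sqoop MR jobs of containers. The queue name below is hypothetical and must already exist in the scheduler config:

# Submit the workflow with the launcher pinned to a dedicated queue:
oozie job -oozie http://localhost:11000/oozie -config job.properties \
  -Doozie.launcher.mapred.job.queue.name=launchers -run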
07-19-2016
11:58 AM
Hello, From the logs I found that it was stuck updating the HDFS location, so I updated the HDFS location manually by running the command metatool -updateLocation. After that I was able to bring up the HiveServer2 service. Best Regards, Rinku Singh.
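For reference, a hedged sketch of the command shape used above; the old and new NameNode URIs are placeholders:

# Rewrite stale HDFS root locations stored in the Hive metastore:
hive --service metatool -updateLocation hdfs://new-namenode:8020 hdfs://old-namenode:8020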
07-19-2016
09:51 AM
Hello @Mukesh Kumar I don't think increasing the timeout will help here. There must be something stopping it from starting. From the Ambari UI, I can see the error below:
Connection failed on host ip-172-31-31-251.us-west-2.compute.internal:10000 (Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/alerts/alert_hive_thrift_port.py", line 200, in execute
check_command_timeout=int(check_command_timeout))
File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/hive_check.py", line 74, in check_thrift_port_sasl
timeout=check_command_timeout)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
Fail: Execution of '! beeline -u 'jdbc:hive2://ip-172-31-31-251.us-west-2.compute.internal:10000/;transportMode=binary' -e '' 2>&1| awk '{print}'|grep -i -e 'Connection refused' -e 'Invalid URL'' returned 1. Error: Could not open client transport with JDBC Uri: jdbc:hive2://ip-172-31-31-251.us-west-2.compute.internal:10000/;transportMode=binary: java.net.ConnectException: Connection refused (state=08S01,code=0)
Error: Could not open client transport with JDBC Uri: jdbc:hive2://ip-172-31-31-251.us-west-2.compute.internal:10000/;transportMode=binary: java.net.ConnectException: Connection refused (state=08S01,code=0)
)
07-19-2016
08:44 AM
Hello, I am trying to start HiveServer2, but it times out with the error "Python script has been killed due to timeout after waiting 900 secs". I tried to find something in the log file but there was no error. Attaching the log file for reference. Best Regards, Rinku Singh.
07-15-2016
05:10 AM
Hi @Sindhu I finally had to add the parameter javax.jdo.option.ConnectionPassword to hive-site.xml and re-run it. That fixed all the issues and everything is working now. Thank you so much for your help and pointers. Best Regards, Rinku Singh.
07-14-2016
10:17 AM
Hello @Sindhu I did the following.
a. Created a new MySQL instance and migrated the tables into this new database.
b. Pointed the Hive metastore to the new MySQL instance. As expected, the Hive metastore service didn't come up.
c. Ran the upgrade script command to upgrade the MySQL database schema.
d. Restarted the Hive metastore service and it came up.
e. Now that the schema versions were the same, migrated the tables from the new MySQL database to the Postgres database.
f. I am now trying to start HiveServer2; it does not start, but I am able to log into the hive command prompt.
g. I also ran the following command:
[rsingh01@ip-172-31-31-251 bin]$ ./schematool -info -dbType postgres -userName root -passWord password -verbose
WARNING: Use "yarn jar" to launch YARN applications.
Metastore connection URL: jdbc:postgresql://xxxxxxxxxxxxxxxxxxxxxxxxx/cirrus3
Metastore Connection Driver : org.postgresql.Driver
Metastore connection User: root
Hive distribution version: 1.2.1000
Metastore schema version: 1.2.1000
schemaTool completed
h. I tried querying the tables but now it is showing the following error:
hive> select * from rms_customer;
FAILED: SemanticException Unable to determine if hdfs://xxxxxxxxxxxxxxxxxxxxxxxxxxxx:8020/user/hive/warehouse/rms.db/rms_customer is encrypted: java.lang.IllegalArgumentException: Wrong FS: hdfs://xxxxxxxxxxxxxxxxxxxxxxxxxxxx:8020/user/hive/warehouse/rms.db/rms_customer, expected: hdfs://xxxxxxxxxxxxxxxxxxxxxxxxxxxx:8020
i. I followed the link https://issues.apache.org/jira/browse/HIVE-11116 and tried to list the FSRoot by running the command metatool -listFSRoot, but this time it throws the error below:
===========================
[rsingh01@ip-172-31-31-251 bin]$ ./metatool -listFSRoot
WARNING: Use "yarn jar" to launch YARN applications.
Initializing HiveMetaTool..
16/07/14 10:05:43 INFO metastore.ObjectStore: ObjectStore, initialize called
16/07/14 10:05:43 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/07/14 10:05:43 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/07/14 10:05:44 ERROR Datastore.Schema: Failed initialising database.
Unable to open a test connection to the given database. JDBC url = jdbc:postgresql://xxxxxxxxxxxxxxxxxxxxxxxx/cirrus1, username = root. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
org.postgresql.util.PSQLException: FATAL: password authentication failed for user "root"
===========================
I am totally stuck now. Best Regards, Rinku Singh.
07-14-2016
07:29 AM
Hi @Sindhu I found the reason why it says "relation "compaction_queue" does not exist". In Postgres, this command will not work:
ALTER TABLE COMPACTION_QUEUE ADD COLUMN CQ_HIGHEST_TXN_ID bigint;
But the command below will work:
ALTER TABLE "COMPACTION_QUEUE" ADD COLUMN "CQ_HIGHEST_TXN_ID" bigint;
The table name needs to be in double quotes: Postgres folds unquoted identifiers to lower case, while the Hive schema tables were created with upper-case names. As all the upgrade scripts are written without double quotes, they fail. Trying to figure out a way to upgrade the database schema version with these Postgres restrictions.
Best Regards, Rinku Singh.
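A hedged sketch reproducing the case-folding behaviour described above, outside Hive (the database name is a placeholder):

# Unquoted identifiers fold to lower case, so this looks for "compaction_queue" and fails:
psql -d hivemeta -c 'ALTER TABLE COMPACTION_QUEUE ADD COLUMN CQ_HIGHEST_TXN_ID bigint;'
# Double-quoted identifiers preserve case and match the upper-case Hive schema table:
psql -d hivemeta -c 'ALTER TABLE "COMPACTION_QUEUE" ADD COLUMN "CQ_HIGHEST_TXN_ID" bigint;'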
07-14-2016
05:28 AM
Hi @Sindhu I ran the above command and found that the database schema version is not compatible with the Hive version:
==========================================
[rsingh01]$ ./schematool -info -dbType postgres -userName root -passWord password -verbose
WARNING: Use "yarn jar" to launch YARN applications.
Metastore connection URL: jdbc:postgresql://xxxxxxxxxxxxxxxxxxxxxx/xxxxx
Metastore Connection Driver : org.postgresql.Driver
Metastore connection User: root
Hive distribution version: 1.2.1000
Metastore schema version: 1.2.0
org.apache.hadoop.hive.metastore.HiveMetaException: Metastore schema version is not compatible. Hive Version: 1.2.1000, Database Schema Version: 1.2.0
org.apache.hadoop.hive.metastore.HiveMetaException: Metastore schema version is not compatible. Hive Version: 1.2.1000, Database Schema Version: 1.2.0
at org.apache.hive.beeline.HiveSchemaTool.assertCompatibleVersion(HiveSchemaTool.java:196)
at org.apache.hive.beeline.HiveSchemaTool.showInfo(HiveSchemaTool.java:140)
at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:501)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
*** schemaTool failed ***
============================================
I then tried upgrading the database schema version by running the command below. The dryRun goes through successfully, but I am now stuck on the actual execution of the script.
=============================================
[rsingh01]$ ./schematool -dbType postgres -userName root -passWord password -upgradeSchema -dryRun
WARNING: Use "yarn jar" to launch YARN applications.
Metastore connection URL: jdbc:postgresql://xxxxxxxxxxxxxxxxxxxxxx/xxxxx
Metastore Connection Driver : org.postgresql.Driver
Metastore connection User: root
Starting upgrade metastore schema from version 1.2.0 to 1.2.1000
Upgrade script upgrade-1.2.0-to-1.2.1000.postgres.sql
schemaTool completed
[rsingh01]$ ./schematool -dbType postgres -userName root -passWord password -upgradeSchema
WARNING: Use "yarn jar" to launch YARN applications.
Metastore connection URL: jdbc:postgresql://xxxxxxxxxxxxxxxxxxxxxx/xxxxx
Metastore Connection Driver : org.postgresql.Driver
Metastore connection User: root
Starting upgrade metastore schema from version 1.2.0 to 1.2.1000
Upgrade script upgrade-1.2.0-to-1.2.1000.postgres.sql
Error: ERROR: relation "compaction_queue" does not exist (state=42P01,code=0)
org.apache.hadoop.hive.metastore.HiveMetaException: Upgrade FAILED! Metastore state would be inconsistent !!
*** schemaTool failed ***
07-13-2016
01:00 PM
Hi @Sindhu Let me put this more clearly. There are two databases.
1. I created a database called Test. I then started the Hive service and it came up properly. I then saw that 56 tables had been created which were not present earlier. It's obvious that it ran the schematool -initSchema command.
2. Later I created one more database and copied the 56 tables from MySQL to Postgres (into the new database). I changed the database in Ambari to point to this new database and tried restarting the Hive service. That is when I got this error.
So the Hive metastore is trying to run schematool -initSchema in the second scenario as well. I want to understand what triggers it to run schematool -initSchema and how I can resolve this so that I can successfully migrate the metastore and start the Hive service. Best Regards, Rinku Singh.
07-13-2016
12:18 PM
Hi @Sindhu Just to give more background: I created a new database in Postgres, which in turn creates a new schema called 'public' with no tables in it. As soon as Hive makes its first connection, it internally runs scripts and creates around 56 tables in the public schema. The same applies to the MySQL db. The issue arises when I have already migrated those 56 tables from the MySQL database to the Postgres database and then try to connect; that is when I get the above error. Best Regards, Rinku Singh.
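A hedged sketch of one way to inspect this: schematool reads the metastore's VERSION table to judge schema compatibility, so checking that table directly on the migrated Postgres database can show whether the migration carried it over correctly (the database name is a placeholder):

# What schema version does the migrated database itself report?
psql -d hivemeta -c 'SELECT * FROM "VERSION";'
# And what does schematool conclude from it?
./schematool -info -dbType postgres -userName root -passWord password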
07-13-2016
12:07 PM
Hi @Sindhu But why should we run that command? I am not upgrading the metastore but migrating it from MySQL to Postgres. Does it still need to be run? Best Regards, Rinku Singh
07-13-2016
11:20 AM
Hello, I have an EMR cluster whose Hive metastore is connected to a MySQL RDS instance. I am now moving to Hortonworks (v2.4.2 with Ambari 2.2) and with that I also want to move the Hive metastore to a Postgres RDS instance. But whenever I migrate the data and try connecting the Hive metastore to that schema, it throws the error below. It works fine with a new, empty database.
=================================
stderr:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive_metastore.py", line 245, in <module>
HiveMetastore().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
method(env)
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 530, in restart
self.start(env, upgrade_type=upgrade_type)
File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive_metastore.py", line 58, in start
self.configure(env)
File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive_metastore.py", line 72, in configure
hive(name = 'metastore')
File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
return fn(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive.py", line 296, in hive
user = params.hive_user
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'export HIVE_CONF_DIR=/usr/hdp/current/hive-metastore/conf/conf.server ; /usr/hdp/current/hive-metastore/bin/schematool -initSchema -dbType postgres -userName root -passWord [PROTECTED]' returned 1. WARNING: Use "yarn jar" to launch YARN applications.
Metastore connection URL: jdbc:postgresql://cirrus.c9xp5ox1vhs8.us-west-2.rds.amazonaws.com:5432/cirrus1
Metastore Connection Driver : org.postgresql.Driver
Metastore connection User: root
Starting metastore schema initialization to 1.2.1000
Initialization script hive-schema-1.2.1000.postgres.sql
Error: ERROR: relation "BUCKETING_COLS" already exists (state=42P07,code=0)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
*** schemaTool failed ***
stdout:
2016-07-13 08:55:20,106 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.4.2.0-258
2016-07-13 08:55:20,106 - Checking if need to create versioned conf dir /etc/hadoop/2.4.2.0-258/0
==========================================
04-11-2016
02:22 AM
1 Kudo
Hello, We are trying to query a table that was created in Hive using the OpenCSVSerde, but we are hitting the error below. As far as we know, this SerDe ships by default with the CDH installation, so Impala should support it. Any reason why we are not able to query the table?
Query: select * from master_staging.rms_dxc_data_mc_cal_reps limit 5
ERROR: AnalysisException: Failed to load metadata for table: 'master_staging.rms_dxc_data_mc_cal_reps' CAUSED BY: TableLoadingException: Failed to load metadata for table: master_staging.rms_dxc_data_mc_cal_reps CAUSED BY: InvalidStorageDescriptorException: Impala does not support tables of this type. REASON: SerDe library 'org.apache.hadoop.hive.serde2.OpenCSVSerde' is not supported.
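As the error itself states, Impala does not support OpenCSVSerde tables regardless of what ships with the distribution. A hedged workaround sketch: materialize a copy through Hive into a format Impala can read, then query that (the new table name is hypothetical):

# Copy the data into a Parquet table via Hive, which can read the OpenCSVSerde table:
hive -e "CREATE TABLE master_staging.rms_dxc_data_mc_cal_reps_pq STORED AS PARQUET
AS SELECT * FROM master_staging.rms_dxc_data_mc_cal_reps;"
# Make Impala pick up the new table, then query it:
impala-shell -q "INVALIDATE METADATA master_staging.rms_dxc_data_mc_cal_reps_pq"
impala-shell -q "SELECT * FROM master_staging.rms_dxc_data_mc_cal_reps_pq LIMIT 5"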