Member since: 02-13-2018
Posts: 10
Kudos Received: 0
Solutions: 0
06-13-2020
07:11 AM
Hi Royles, I am using HiveQL for creating the table and for altering it to add new columns. All of the operations (MSCK REPAIR TABLE, adding partitions, and so on) are done from HiveQL; only reading the table is done from Spark SQL. After reading your reply, I tried doing everything from Spark SQL instead: creating the external table, running MSCK REPAIR, and altering the table to add new columns. I got the results below:
1. No results from Spark when reading data from the table.
2. No results from the Hive shell when reading the table.
3. Looking at the table properties, the Parquet schema does not match, so there are no results from either HiveQL or Spark.
The only solution I am following so far (for adding new columns to external tables) is:
1. Drop and recreate the table using HiveQL from the Hive shell with all columns (old + new).
2. Manually add the latest partition, which has data for all the new columns added since the table was first created.
3. Query the table from Spark, then check that the table properties and the Parquet schema are reflected and mapped to the Hive columns.
4. If the schemas do not match (for example, testData in Parquet shows up as testdata in the Hive table properties), we get null values from Spark.
5. If both schemas match, we see results from Spark.
6. Then run MSCK REPAIR, which gives me results in both Spark 2.2 and 2.3.
But I feel there must be some other way of adding new columns instead of dropping and recreating the table.
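The nulls in step 4 are consistent with a case-sensitivity mismatch: the Hive metastore stores column names lowercased, while the Parquet footer preserves the original case, and older Spark versions could match the two case-sensitively on this read path. A pure-Python illustration (not Spark code; the field names testData/testdata are taken from the post) of why an exact-name match silently drops the column while a case-insensitive match finds it:

```python
# Illustration only: why a case-sensitive lookup between the Hive metastore
# schema (lowercased) and the Parquet file schema (case-preserving) can make
# a column "disappear" and read back as NULL.
parquet_fields = ["id", "testData"]   # schema stored in the Parquet footer
hive_columns = ["id", "testdata"]     # schema stored in the Hive metastore

def resolve_case_sensitive(hive_cols, parquet_cols):
    """Return the Hive columns that have an exact-name match in Parquet."""
    available = set(parquet_cols)
    return [c for c in hive_cols if c in available]

def resolve_case_insensitive(hive_cols, parquet_cols):
    """Return the Hive columns that match a Parquet field ignoring case."""
    available = {f.lower() for f in parquet_cols}
    return [c for c in hive_cols if c.lower() in available]

print(resolve_case_sensitive(hive_columns, parquet_fields))    # ['id']
print(resolve_case_insensitive(hive_columns, parquet_fields))  # ['id', 'testdata']
```

This would explain why recreating the table only helps when the metastore and Parquet names happen to line up, and why the behavior differs between Spark releases.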
06-12-2020
12:39 AM
Hi All, I need your expertise in understanding and solving the issue below.
1. Data is available in Parquet (partitioned), an external table has been created in Hive, and MSCK REPAIR has been run.
2. Querying from Hive and from Spark gives results for all columns. The issue starts from here:
3. Add new columns to the Parquet data, alter the table to add the new columns (with the same schema and correct datatypes), and run MSCK REPAIR again.
4. Hive reads values for the new columns, but Spark 2.2 and 2.3 return null values.
We also tried another way:
5. Drop and recreate the table with all columns (new and old), manually add the latest partition (which has data for all the new columns), and then run MSCK REPAIR for the other partitions. In some cases this works, but in other cases we get the same null values from Spark.
We have seen that this is not an issue in Spark 2.4. Can you please explain why it behaves like this, and whether there is any way forward?
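For step 3, one thing worth trying before dropping the table: by default, Hive's ALTER TABLE ... ADD COLUMNS changes only the table-level schema, so existing partitions keep their old column list, which is a common cause of NULLs for the new columns. The CASCADE keyword (available since Hive 1.1, HIVE-8839) propagates the change to the partition metadata as well. A sketch, with hypothetical database/table/column names:

```sql
-- Hypothetical names. CASCADE pushes the new column into the metadata of
-- every existing partition, not just the table-level schema.
ALTER TABLE mydb.orders ADD COLUMNS (testData STRING) CASCADE;

-- Then pick up any new partition directories as before.
MSCK REPAIR TABLE mydb.orders;
```

On the Spark side, running spark.sql("REFRESH TABLE mydb.orders") afterwards clears Spark's cached file and schema metadata for the table; without it, a previously cached plan can keep serving the old schema.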
Labels:
- Apache Hive
- Apache Spark
06-03-2019
01:46 PM
From the spark or pyspark shell, use the commands below to access Hive database objects (note that a query passed to spark.sql should not carry a trailing semicolon):

spark.sql("show databases").show()
spark.sql("select * from databasename.tablename").show()

or

spark.read.table("databasename.tablename")

You can run any query inside spark.sql, and it will return the results as a DataFrame.
06-03-2019
01:41 PM
Hi All, We are trying to create partitioned views on Hive partitioned external tables. The tables are partitioned on (customerid, orderid), whereas the view is partitioned on customerid only. While adding partitions to the views, we sometimes pass an already existing customerid. We then get the error below, and we do not understand why IF NOT EXISTS is not checked so that the existing partition is simply ignored. Please suggest what is going wrong here and how to solve it.

Query:

alter view viewname add if not exists partition(customerid=123);

Running the query above gives the following error:

Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.NullPointerException)
    at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
    at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:257)
    at org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:348)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
    at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
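The NullPointerException appears to be raised while Hive processes the already-existing partition, before the IF NOT EXISTS check can take effect, so the usual workaround is to do the existence check yourself before issuing the DDL. A sketch, reusing the view name from the post:

```sql
-- Workaround sketch: list the partitions the view already has, and only
-- issue ADD PARTITION for customerid values missing from the output.
SHOW PARTITIONS viewname;

-- If customerid=123 does not appear in the output above, then add it:
ALTER VIEW viewname ADD PARTITION (customerid=123);
```

In a script, this is typically done by capturing the SHOW PARTITIONS output and filtering the candidate customerid values against it before generating the ALTER VIEW statements.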
Labels:
- Apache Hive
02-22-2018
01:19 AM
Thank you. It worked.
02-19-2018
09:13 AM
Hi, I am running a UDF which contains a SQL query. The UDF runs fine in Hive, but while running it in Impala I am getting the exception below.

Prettytable cannot resolve string columns values that have embedded tabs. Reverting to tab delimited text output
java.lang.ClassNotFoundException: org.apache.hive.jdbc.HiveDriver
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:190)
    at GetData.evaluate(GetData.java:53)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.impala.hive.executor.UdfExecutor.evaluate(UdfExecutor.java:353)
    at org.apache.impala.hive.executor.UdfExecutor.evaluate(UdfExecutor.java:288)
    at org.apache.impala.service.FeSupport.NativeEvalExprsWithoutRow(Native Method)
    at org.apache.impala.service.FeSupport.EvalExprsWithoutRow(FeSupport.java:172)
    at org.apache.impala.service.FeSupport.EvalExprWithoutRow(FeSupport.java:130)
    at org.apache.impala.analysis.LiteralExpr.create(LiteralExpr.java:178)
    at org.apache.impala.rewrite.FoldConstantsRule.apply(FoldConstantsRule.java:68)
    at org.apache.impala.rewrite.ExprRewriter.applyRuleBottomUp(ExprRewriter.java:85)
    at org.apache.impala.rewrite.ExprRewriter.applyRuleRepeatedly(ExprRewriter.java:71)
    at org.apache.impala.rewrite.ExprRewriter.rewrite(ExprRewriter.java:55)
    at org.apache.impala.analysis.SelectList.rewriteExprs(SelectList.java:97)
    at org.apache.impala.analysis.SelectStmt.rewriteExprs(SelectStmt.java:886)
    at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:396)
    at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:368)
    at org.apache.impala.service.Frontend.analyzeStmt(Frontend.java:903)
    at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1052)
    at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:156)

I changed the driver to the Impala driver and tested, but the same ClassNotFoundException is still coming, now for the Impala driver class. I would appreciate your inputs or suggestions. Thank you.
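The ClassNotFoundException suggests the JDBC driver is simply not on the classpath Impala uses to execute the Java UDF: Impala loads the JAR named in CREATE FUNCTION, so driver classes that happen to be on Hive's classpath are not visible to it. One approach worth trying (the JAR path and function names here are hypothetical) is to build a shaded/"fat" JAR that bundles the JDBC driver classes together with the UDF class, and register that JAR:

```sql
-- Hypothetical names: getdata-with-deps.jar is a shaded JAR containing both
-- the GetData UDF class and the JDBC driver classes it loads at runtime.
CREATE FUNCTION getdata(STRING) RETURNS STRING
LOCATION '/user/hive/udfs/getdata-with-deps.jar'
SYMBOL='GetData';
```

Separately, the FoldConstantsRule frames in the trace show the UDF being evaluated at query-planning time (constant folding), so a UDF that opens JDBC connections can fail or behave unexpectedly even before the query runs; that design is fragile in Impala regardless of the classpath fix.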
Labels:
- Apache Hive
- Apache Impala