Member since
02-02-2016
4
Posts
0
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
3037 | 02-03-2016 04:59 AM |
10-27-2017
02:04 AM
Hi Alex, Thanks for your reply. The jira link you have shared seems to be a pretty old one and there are some comments saying this may be available in impala 2.2 onwards but as per my information cdh 5.5.4 has impala 2.3. it seems this issue was raised 4 years back but still not been implented yet as we are using microstaregy as our bI tool, so doing cross joins with the large tables are not a feasible solution in it. Is this a fundamental constaraint in impala or it may come in a later impala version? it will be great if you can give an hint in which impala version or cdh version this functionality may be valuable. it will be helpfull for sure to the future users. Nisith
... View more
10-26-2017
08:41 AM
Hi All, I am able to run the below query in hive but not in impala. I am using CDH 5.5.4 Is it possible in impala or not? QUERY: Select count(distinct(concat(c1,c2))) as Key, sum(distinct(c3)) as Val FROM test; In HIve it is successfully executed but in impala i am getting the below error. AnalysisException: all DISTINCT aggregate functions need to have the same set of parameters as count(DISTINCT (concat(c1,c2))); deviating function: sum(DISTINCT (c3)) Any suggesstions on how to do it in a better way if not possible in impala will be helpful. thanks Nisith
... View more
Labels:
- Labels:
-
Apache Hive
-
Apache Impala
-
Cloudera Hue
02-03-2016
04:59 AM
Finally Able to debug and fix the issue. The Issue was with the installation as one of the data nodes are having older version of parquet Jars(5.2 cdh distribution). After replacing the jars with the current version jars it was working fine.
... View more
02-02-2016
05:08 AM
Hi, I am Getting java.lang.NoSuchFieldError: INT_8 error when I am trying to execute a spark job using OOzie on Cloudera 5.5.1 version. Any help on this will be appreciated. Please find the error stackstrace below. 16/02/02 17:22:26 WARN TaskSetManager: Lost task 0.0 in stage 20.0 (TID 38, Zlab-physrv1): java.lang.NoSuchFieldError: INT_8
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter.convertField(CatalystSchemaConverter.scala:327)
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter.convertField(CatalystSchemaConverter.scala:312)
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter$$anonfun$convertField$1.apply(CatalystSchemaConverter.scala:517)
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter$$anonfun$convertField$1.apply(CatalystSchemaConverter.scala:516)
at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
at scala.collection.mutable.ArrayOps$ofRef.foldLeft(ArrayOps.scala:108)
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter.convertField(CatalystSchemaConverter.scala:516)
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter.convertField(CatalystSchemaConverter.scala:312)
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter.convertField(CatalystSchemaConverter.scala:521)
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter.convertField(CatalystSchemaConverter.scala:312)
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter$$anonfun$convert$1.apply(CatalystSchemaConverter.scala:305)
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter$$anonfun$convert$1.apply(CatalystSchemaConverter.scala:305)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at org.apache.spark.sql.types.StructType.foreach(StructType.scala:92)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at org.apache.spark.sql.types.StructType.map(StructType.scala:92)
at org.apache.spark.sql.execution.datasources.parquet.CatalystSchemaConverter.convert(CatalystSchemaConverter.scala:305)
at org.apache.spark.sql.execution.datasources.parquet.ParquetTypesConverter$.convertFromAttributes(ParquetTypesConverter.scala:58)
at org.apache.spark.sql.execution.datasources.parquet.RowWriteSupport.init(ParquetTableSupport.scala:55)
at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:277)
at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:251)
at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetRelation.scala:94)
at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anon$3.newInstance(ParquetRelation.scala:272)
at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:233)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745) Normally we used to get this error whenever there is some difference on the jars you have used to generate the code and the jars you have used during runtime are missing those elements. Note: When I am trying to submit the Same one using spark-submit command its running fine. Regards Nisith
... View more
Labels:
- Labels:
-
Apache Hadoop
-
Apache Oozie
-
Apache Spark