Member since: 01-07-2020
Posts: 64
Kudos Received: 1
Solutions: 0
11-23-2022
07:38 AM
Hi @Shahrukh_shaikh. I do not have them now. What do you mean by a data issue? When I run it through the terminal, everything runs smoothly.
11-23-2022
06:29 AM
I have a job that runs a Hive query. When the query executes, Oozie throws this error:
Error while compiling statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
Vertex re-running, vertexName=Map 1, vertexId=vertex_1668428709182_0049_1_00
Vertex re-running, vertexName=Map 1, vertexId=vertex_1668428709182_0049_1_00
Vertex re-running, vertexName=Map 1, vertexId=vertex_1668428709182_0049_1_00
Vertex failed, vertexName=Map 1, vertexId=vertex_1668428709182_0049_1_00, diagnostics=[Vertex vertex_1668428709182_0049_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE, Vertex vertex_1668428709182_0049_1_00 [Map 1] failed as task task_1668428709182_0049_1_00_000000 failed after vertex succeeded.]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
I do not understand much of this error, but when I run the job through the terminal it completes successfully.
Labels: Apache Hive, Apache Oozie
11-22-2022
10:54 PM
I am about to upgrade from CDH to CDP and I have some questions regarding the new version of Hive. Until now I have used Hive as the ETL service because it is more stable, though slower, than Impala. The tables that BI users see are in Impala. My questions are:
1) Is Hive 3 fast enough to compete with Impala?
2) For BI use, is it more appropriate to point to Hive or to Impala? (I read that Hive 3 uses a cache that makes repeated BI requests faster.)
3) For a Kafka flow, is it appropriate to create an ACID table in Hive 3 and store the fetched data live?
Labels: Apache Hive, Apache Impala
11-18-2022
01:34 AM
I am trying to run the Impala shell and I receive the error below:
Error connecting: TypeError, __init__() got an unexpected keyword argument 'ssl_version'
This happens after the upgrade to Impala version 3.4.
Labels: Apache Impala
06-20-2022
11:18 PM
Hi, I have some schedules in Oozie (through Hue), and some of them sometimes fail. However, when I run them manually afterwards, they complete successfully. Is there any way to set a retry policy in my workflows? Here is the error I am getting:
Exit code of the Shell command 1
<<< Invocation of Shell command completed <<<
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:410)
    at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:55)
    at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:223)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
    at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
Caused by: org.apache.oozie.action.hadoop.LauncherMainException
    at org.apache.oozie.action.hadoop.ShellMain.run(ShellMain.java:76)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:104)
    at org.apache.oozie.action.hadoop.ShellMain.main(ShellMain.java:63)
    ... 16 more
Failing Oozie Launcher, Main Class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
Oozie Launcher, uploading action data to HDFS sequence file: Stopping AM
Labels: Apache Oozie
02-21-2022
12:22 AM
I am trying to run a script in Oozie and every time I receive the error below regarding impala.dbapi. The module is imported correctly in the script.
Stdoutput Traceback (most recent call last):
Stdoutput File "/tmp/sorting_table.py", line 8, in <module>
Stdoutput from impala.dbapi import connect
Stdoutput ImportError: No module named impala.dbapi
Exit code of the Shell command 1
<<< Invocation of Shell command completed <<<
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:410)
    at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:55)
    at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:223)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
    at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
    at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
Caused by: org.apache.oozie.action.hadoop.LauncherMainException
    at org.apache.oozie.action.hadoop.ShellMain.run(ShellMain.java:76)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:104)
    at org.apache.oozie.action.hadoop.ShellMain.main(ShellMain.java:63)
Script import libraries:
from pyspark import SparkContext
from pyspark.sql import SparkSession
from datetime import datetime,timedelta
import ssl
from impala.dbapi import connect
import thrift_sasl
import os
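A possible first check, offered only as a sketch: an ImportError like this can appear when the Oozie shell action runs under a different Python interpreter (or on a node without the package) than the terminal session. That is an assumption, not something the log above confirms. The helper name `describe_environment` is hypothetical; the snippet only reports which interpreter is running and whether the module imports there, and it is written to run on both Python 2.7 and Python 3.

```python
import importlib
import sys


def describe_environment(module_name):
    """Report the running interpreter and whether module_name can be imported."""
    try:
        importlib.import_module(module_name)
        importable = True
    except ImportError:
        # Covers a missing top-level package as well as a missing submodule.
        importable = False
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "importable": importable,
    }


if __name__ == "__main__":
    info = describe_environment("impala.dbapi")
    print("interpreter: {executable} (Python {version})".format(**info))
    print("impala.dbapi importable: {importable}".format(**info))
```

Running the same file once from the terminal and once as the Oozie shell action, then comparing the two outputs, would show whether the two environments actually differ.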
Labels: Apache Impala
12-13-2021
11:21 PM
I have an ETL flow that transfers data from one Hive table to another through PySpark. The tables are partitioned. However, I see small Parquet files in the partitions' paths in HDFS. I want to ask:
1) How can I merge these files?
2) Is there a maximum or recommended size for Hive partitions?
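One common way to approach the first question, sketched here under assumptions rather than as a confirmed fix, is to pick the number of output files from the partition's total size and repartition the DataFrame before writing. The helper `target_file_count` and the 128 MiB default are illustrative choices, not values taken from this post; the PySpark lines are left as comments because they need a running Spark session and placeholder names (`df`, `partition_size_bytes`, `staging_path`).

```python
import math


def target_file_count(total_bytes, target_file_bytes=128 * 1024 * 1024):
    """Return how many files keep each one at or below roughly target_file_bytes."""
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be non-negative and the target must be positive")
    # Always write at least one file, even for an empty or tiny partition.
    return max(1, int(math.ceil(total_bytes / float(target_file_bytes))))


# Hypothetical use inside the PySpark job (not executed here):
# n = target_file_count(partition_size_bytes)
# df.repartition(n).write.mode("overwrite").parquet(staging_path)
```

Using `repartition(n)` triggers a shuffle; `coalesce(n)` avoids one when only reducing the file count, at the cost of less even file sizes.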
Labels: Apache Hive, Apache Spark
12-10-2021
02:55 AM
Hi, I want to create a Hive table that stores data in ORC format with Snappy compression. Will Power BI be able to read from that table? Also, do you suggest any other format or compression for my table?
Labels: Apache Hive
11-08-2021
12:49 AM
Hi, what is the difference between Impala executors and Impala coordinators? Which one should I increase to make my query run faster? Thanks in advance.
Labels: Apache Impala