Member since: 01-07-2020
Posts: 64
Kudos Received: 1
Solutions: 0
11-29-2022
02:01 AM
I want to download specific workflows (wfs) from the Oozie editor in Hue 4, but I cannot find such an option. Can you please help?
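In case it helps while looking for an editor-side option: a hedged sketch, assuming the workflows have been submitted at least once, in which case Hue leaves the generated workflow.xml in each workflow's HDFS workspace. The workspace path below is an assumption; check a workflow's properties in Hue for the real one.

```shell
# Assumption: Hue's default Oozie workspace location -- verify on your cluster.
WORKSPACE=/user/hue/oozie/workspaces

# Dry-run: build and print the copy command; remove 'echo' on an edge node.
CMD="hdfs dfs -get $WORKSPACE ./hue_workflows_backup"
echo "$CMD"
```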
Labels:
- Apache Oozie
11-23-2022
07:38 AM
Hi @Shahrukh_shaikh. I do not have them now. What do you mean by a data issue? When I run it through the terminal, everything runs smoothly.
11-23-2022
06:29 AM
I have a job which runs a Hive query. When the time comes for the query, Oozie throws this error:

Error while compiling statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask.
Vertex re-running, vertexName=Map 1, vertexId=vertex_1668428709182_0049_1_00
Vertex re-running, vertexName=Map 1, vertexId=vertex_1668428709182_0049_1_00
Vertex re-running, vertexName=Map 1, vertexId=vertex_1668428709182_0049_1_00
Vertex failed, vertexName=Map 1, vertexId=vertex_1668428709182_0049_1_00, diagnostics=[Vertex vertex_1668428709182_0049_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE, Vertex vertex_1668428709182_0049_1_00 [Map 1] failed as task task_1668428709182_0049_1_00_000000 failed after vertex succeeded.]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0

I cannot understand much of this error, but when I run the job through the terminal it finishes successfully.
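Not an answer to the Tez failure itself, but a sketch of how to pull the full task logs, which usually contain the real cause; the YARN application id can be read off the vertex id in the error message.

```shell
# The vertex id vertex_1668428709182_0049_1_00 embeds the YARN application id
# as application_<cluster-timestamp>_<app-number>.
VERTEX="vertex_1668428709182_0049_1_00"
APP_ID="application_$(echo "$VERTEX" | cut -d_ -f2-3)"

# Dry-run: print the log-retrieval command; run it for real on a cluster node.
echo yarn logs -applicationId "$APP_ID"
```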
Labels:
- Apache Hive
- Apache Oozie
11-22-2022
10:54 PM
I am about to upgrade from CDH to CDP and I have some questions about the new version of Hive. Until now I have used Hive as the ETL service, because it is more stable, but slower, than Impala. The tables that BI users see are in Impala. My questions are:
1) Is Hive 3 fast enough to compete with Impala?
2) For BI use, is it more appropriate to point users at Hive or Impala? (I read that Hive 3 uses a cache and makes repeated BI requests faster.)
3) For a Kafka flow, is it appropriate to create an ACID table in Hive 3 and store the fetched data live?
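For question 3, a minimal sketch of what a Hive 3 ACID (transactional) table declaration looks like; the JDBC URL, database, and columns are placeholders, not your schema.

```shell
# Placeholder schema: Hive 3 ACID tables must be managed tables stored as ORC.
DDL="CREATE TABLE stream_db.kafka_events (
  id BIGINT,
  payload STRING
)
STORED AS ORC
TBLPROPERTIES ('transactional'='true')"

# Dry-run: print the beeline invocation; adjust the JDBC URL for your cluster.
echo beeline -u "jdbc:hive2://hiveserver:10000/default" -e "$DDL"
```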
Labels:
- Apache Hive
- Apache Impala
11-18-2022
01:34 AM
I am trying to run impala-shell and I receive the error below:

Error connecting: TypeError, __init__() got an unexpected keyword argument 'ssl_version'

This happens after the upgrade to Impala 3.4.
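A hedged diagnostic sketch: this kind of TypeError usually means the Python Thrift client and the Python ssl module in impala-shell's environment disagree about the connection API, so a first step is to see which Python and OpenSSL the shell is actually picking up.

```shell
# Report the interpreter and ssl build in the current environment.
PYBIN="$(command -v python3 || command -v python)"
"$PYBIN" -c "import ssl, sys; print(sys.version.split()[0], ssl.OPENSSL_VERSION)"
```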
Labels:
- Apache Impala
11-16-2022
05:00 AM
I want to upgrade from CDH to CDP, but I have many workflows (wfs) in the Oozie editor in Hue. Is there a risk of losing them after the upgrade? How can I save these workflows so I can upload them again after the update? Thanks in advance.
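A hedged sketch of one backup route: Hue is a Django application and keeps editor documents (including Oozie workflows) in its database, so Django's `dumpdata` management command can export them to JSON. The installation path is an assumption; adjust it to your Hue host.

```shell
# Assumption: Hue installed under /usr/lib/hue; run as the hue user on the Hue host.
HUE_HOME=/usr/lib/hue

# Dry-run: build and print the export command instead of executing it.
CMD="$HUE_HOME/build/env/bin/hue dumpdata --indent 2 > hue_documents_backup.json"
echo "$CMD"
```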
Labels:
- Apache Oozie
07-08-2022
06:48 AM
I have a table in Impala and I want to check the source table with Sqoop every day to see if there are any missing ids. For this purpose I:
1. Sqoop-import all the ids into a staging table.
2. Run: select id from sqoop_table where id not in (select id from impala_table)
3. Save the result to a .txt file.
4. Create a variable and store the sed-ed .txt, turning the results from vertical to horizontal.
From this step on I have issues. When I try to pass this variable to Sqoop to fetch only the missing ids, it throws an "argument list too long" error. The problem is that I cannot change the maximum size of shell variables, and the average number of ids for 2 days is 40k.
Is there any other way to compare the remote table with my Impala table and fetch only the missing records?
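An alternative sketch that avoids passing 40k ids through a shell variable at all: once the ids are sqooped into a staging table visible to Impala, the comparison can run entirely inside Impala with an anti join, writing the missing ids straight to a file. Table names below are placeholders.

```shell
# Placeholder table names -- adjust to your databases.
STAGING=staging_db.sqoop_ids
TARGET=prod_db.impala_table

# LEFT ANTI JOIN keeps ids present in the staging snapshot but missing from
# the Impala table, so no id list ever touches the shell environment.
QUERY="SELECT s.id FROM ${STAGING} s LEFT ANTI JOIN ${TARGET} t ON s.id = t.id"

# Dry-run: print the impala-shell call; -B gives plain delimited output.
echo impala-shell -B -q "$QUERY" -o missing_ids.txt
```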
Labels:
- Apache Impala
- Apache Sqoop
06-20-2022
11:18 PM
Hi, I have some schedules in Oozie through Hue, and some of them sometimes fail. However, when I rerun them manually afterwards, they finish successfully. Is there any way to add a retry policy to my workflows? Here is the error that I am getting:

Exit code of the Shell command 1
<<< Invocation of Shell command completed <<<
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:410)
	at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:55)
	at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:223)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
	at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
	at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
	at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
Caused by: org.apache.oozie.action.hadoop.LauncherMainException
	at org.apache.oozie.action.hadoop.ShellMain.run(ShellMain.java:76)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:104)
	at org.apache.oozie.action.hadoop.ShellMain.main(ShellMain.java:63)
	... 16 more
Failing Oozie Launcher, Main Class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
Oozie Launcher, uploading action data to HDFS sequence file: Stopping AM
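Oozie does support a per-action retry policy via the `retry-max` and `retry-interval` (in minutes) attributes on the `<action>` element in workflow.xml; note that automatic retry only kicks in for errors Oozie classifies as retry-eligible. A sketch with example values and a placeholder action name:

```xml
<!-- Example values: retry up to 3 times, 5 minutes apart; action name is a placeholder. -->
<action name="my-shell-action" retry-max="3" retry-interval="5">
  <shell xmlns="uri:oozie:shell-action:0.3">
    <!-- existing shell action body goes here -->
  </shell>
  <ok to="next-step"/>
  <error to="kill"/>
</action>
```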
Labels:
- Apache Oozie
06-20-2022
08:04 AM
Hi, I run a sqoop import to fetch data from a table in SQL Server. The Sqoop query runs every 6 minutes and fetches the data from 2 hours ago until now. The weird thing is that Sqoop does not fetch all the data; how much it fetches is somehow random. For example, over one 2-hour scan I fetched a record 28 times: 19 times all the rows were aligned with SQL Server, but 9 times they were half. The amount of data for 2 hours is ~800k rows. My sqoop command is below:

sqoop import --connect 'jdbc:sqlserver\
--username \
--password-alias \
--num-mappers 10 \
--split-by an_int \
--fields-terminated-by '|' \
--query "select * from table where timestamp > '${offset}' and \$CONDITIONS" \
--delete-target-dir \
--target-dir
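One hedged alternative to the hand-maintained `${offset}` window: Sqoop's built-in `lastmodified` incremental mode, which tracks the last imported value itself and, with `--merge-key`, reconciles re-fetched rows. The connection details and column names below are placeholders.

```shell
# Placeholders for connection and schema; the flags themselves are standard Sqoop.
CMD="sqoop import \
  --connect 'jdbc:sqlserver://host;databaseName=db' \
  --username user --password-alias alias \
  --table source_table \
  --incremental lastmodified \
  --check-column event_ts \
  --merge-key an_int \
  --target-dir /data/source_table"

# Dry-run: print the command; as a saved sqoop job, --last-value is tracked for you.
echo "$CMD"
```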
Labels:
- Apache Sqoop