Member since: 01-12-2016
Posts: 123
Kudos Received: 12
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
(not captured) | 1531 | 12-12-2016 08:59 AM |
03-02-2017 02:23 PM
My input file, a.txt, is below:
aaa.kyl,data,data
bbb.kkk,data,data
cccccc.hj,data,data
qa.dff,data,data
A = LOAD '/pigdata/a.txt' USING PigStorage(',') AS (a1:chararray, a2:chararray, a3:chararray);
The statement below fails with the following error. What causes this error, and how can I resolve it?
C = FOREACH A GENERATE STRSPLIT(a1,'\\u002E') as (a1:chararray, a1of1:chararray),a2,a3;
2017-02-03 00:45:42,803 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1031: Incompatable schema: left is "a1:chararray,a1of1:chararray", right is ":tuple()"
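For reference, the result I am after can be sketched in plain Python (not Pig). The helper name `split_first_field` is hypothetical and only illustrates the intended output: the first field split on the literal dot into two columns, with a2 and a3 carried through unchanged.

```python
# Illustrative only: plain Python, not Pig. Rows are copied from a.txt.
rows = [
    "aaa.kyl,data,data",
    "bbb.kkk,data,data",
    "cccccc.hj,data,data",
    "qa.dff,data,data",
]


def split_first_field(line):
    """Hypothetical helper: split a row on commas, then split the
    first field on its first literal dot into two columns."""
    a1, a2, a3 = line.split(",")
    left, right = a1.split(".", 1)
    return (left, right, a2, a3)


result = [split_first_field(r) for r in rows]
for rec in result:
    print(rec)
```

Each record here is flat (four columns), which is the shape the schema `(a1:chararray, a1of1:chararray),a2,a3` in my statement is trying to describe.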
Labels: Apache Hadoop, Apache Pig
01-23-2017 04:04 PM
base_table is not an external table, so it is not clear how data gets loaded into base_table during the first run. We are not using any LOAD or INSERT INTO statement. Could you please provide input on this? I think that after running the statement below, we have to manually load the data from the files under /user/hive/incremental_table/incremental_table into base_table:
sqoop import --connect jdbc:teradata://{host name or ip address}/Database=retail --connection-manager org.apache.sqoop.teradata.TeradataConnManager --username dbc --password dbc --table SOURCE_TBL --target-dir /user/hive/incremental_table -m 1
01-13-2017 03:51 PM
a) Thanks for the input, @Artem Ervits. Your input is always appreciated. I will go with a coordinator job using time- and data-availability-based scheduling, but I still have the following clarifications.
Clarification 1: Suppose I use the command below to trigger the coordinator job. Is running it a one-time activity in production, since the job will then trigger on day 2 based on the frequency? Or do I need to run the command again on day 2? Please correct me if I am wrong.
oozie job -oozie http://sandbox.hortonworks.com:11000/oozie -config /path/to/job.properties -run
<coordinator-app name="my_first_job" start="2014-01-01T02:00Z"
end="2014-12-31T02:00Z" frequency="${coord:days(1)}"
xmlns="uri:oozie:coordinator:0.4">
Clarification 2: How do I implement conditional logic in an Oozie workflow so that the actions run if there is new data, and the workflow otherwise proceeds to the end action?
01-12-2017 09:43 AM
Hi @Santhosh B Gowda, thanks for the input.
a) My question is how to run the command below in production, since we should not run it manually:
oozie job --oozie http://host_nameofoozieserver:8080/oozie -Doozie.wf.application.path=hdfs://namenodepath/pathof_workflow_xml/workflow.xml -run
b) I know about coordinators, but at this point I am not sure whether I should use data triggers or time triggers. Currently we are running Flume continuously.
01-12-2017 08:57 AM
Currently we run an Oozie workflow (consisting of Hive, Pig, and Sqoop actions) with the command below in the dev environment. In the production environment we should not run it manually. Can I create a shell script for the command below and run that script using the crontab scheduler? Is my approach correct? If yes, what timings should the script use? If not, what is the approach for running the command in production?
oozie job --oozie http://host_nameofoozieserver:8080/oozie -Doozie.wf.application.path=hdfs://namenodepath/pathof_workflow_xml/workflow.xml -run
Labels: Apache Oozie
01-07-2017 04:25 PM
Hi @Mahesh Mallikarjunappa, what is the purpose of test.hql? What is the logic behind it?
01-05-2017 09:11 AM
Let us say I have the relation x below. I can filter out the {} records using size==0 or IsEmpty($1).
Urman,{(100)}
Gietz,{()}
LAST_NAME,{}
Clarification:
How do I filter the records with {()}? That is, I need only Gietz,{()} from relation x.
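To make the distinction concrete, here is a small Python model of relation x (not Pig). It assumes {()} means a bag holding one tuple with no fields; in Pig it could instead be a tuple holding a null field, which this model does not cover. The predicate name `has_only_empty_tuples` is hypothetical.

```python
# Illustrative only: a Python model of relation x, not Pig.
# Each bag is modelled as a list of tuples:
#   {} -> [],  {()} -> [()],  {(100)} -> [(100,)]
x = [
    ("Urman", [(100,)]),
    ("Gietz", [()]),
    ("LAST_NAME", []),
]


def has_only_empty_tuples(bag):
    """Hypothetical predicate: True for a non-empty bag whose
    tuples all have zero fields."""
    return len(bag) > 0 and all(len(t) == 0 for t in bag)


wanted = [rec for rec in x if has_only_empty_tuples(rec[1])]
print(wanted)
```

An emptiness check alone only separates LAST_NAME,{} from the other two records; the case I need has a bag that is non-empty but whose tuple carries no data.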
Labels: Apache Hadoop, Apache Pig
01-03-2017 12:19 PM
Hi @Greg Keys, could you please provide input on my clarification?