I have Hadoop cluster with 6 datanode and 1 namenode. I have few(4) jobs in HIVE which run on every day and push some data from logfile to our OLPT data base using sqoop. I do not have oozie installed in the environment. All are written in HIVE script file (.sql file) and I run those from unix script(.sh file). Those shell script file are attach with different OS cron job to run those on different time.
Now Requirement is This:
Generate log/status for each job separately on daily basis. So that at the end of the day looking into those log we can identify which job run successfully and time it took to run , which job failed and dump/stack stace for that failed job.(Feature plan is that we will have mail server and every failed or success job shell script will send mail to respective stack holder with those log/status file as attachment)
Now my problem is how I can find error/exception if anything I have to run those batch job / shell script and how to generate success log also with execution time?
I tried to get the output in text file for each query run into HIVE by redirecting the output but that is not working.
Select* from staging_table;>>output.txt
Is there any way to do this by configuring HIVE log for each and every HIVE job on day to day basis?
Please let me know if any one face this issue and how can I resolve this...
Thanks in advance.
1. I would use Oozie.
2. Try using webhcat if Oozie is not an option, that will give you a jobID response in JSON format, then you can trace the status.
3. Enable HA as if your server with cron dies, your jobs have no means of running, hence see #1.
4. You can execute your commands like so
hive -f file.hql 2>&1 | tee -a logname.log