How to get logs of shell scripts in Oozie

Contributor

I have a shell script in HDFS that I want to schedule in Oozie.

 

    #!/bin/bash
    LOG_LOCATION=/home/$USER/logs
    exec 2>&1

    # Require exactly one argument: the table name.
    [ $# -ne 1 ] && { echo "Usage : $0 table "; exit 1; }

    table=$1

    TIMESTAMP=$(date "+%Y-%m-%d")
    success_logs=${LOG_LOCATION}/${TIMESTAMP}.success_log
    failed_logs=${LOG_LOCATION}/${TIMESTAMP}.fail_log
    touch "${success_logs}"
    touch "${failed_logs}"

    # Log the status of the previous step and exit on failure.
    function log_status
    {
        status=$1
        message=$2
        if [ "$status" -ne 0 ]; then
            echo "$(date "+%Y-%m-%d %H:%M:%S") [ERROR] $message [Status] $status : failed" | tee -a "${failed_logs}"
            #echo "Please find the attached log file for more details"
            exit 1
        else
            echo "$(date "+%Y-%m-%d %H:%M:%S") [INFO] $message [Status] $status : success" | tee -a "${success_logs}"
        fi
    }

    # Run hive directly: wrapped in backticks, its output would be executed as a command.
    hive -e "create table testing.${table} as select * from database.${table}"

    g_STATUS=$?
    log_status $g_STATUS "Hive create ${table}"

 

I have some questions about using Oozie to schedule shell scripts.

1) My script writes success and failure logs that tell me whether it succeeded or failed. Can I get the same kind of logs in HDFS when the script runs through Oozie?

2) The script also collects the stdout and stderr output, as you can see in the 2nd and 3rd lines of my script (right after the shebang). Can this also be achieved in HDFS?

If so, how can I achieve both of these in `HDFS` while scheduling shell scripts with Oozie?

Could anyone please explain?

If there are better ways to do this in Oozie, please let me know.

Champion
Your best bet is to write the logs to a temporary location on the local FS and then upload them to HDFS at the end. It is wonky, but it is the best way to do this with a bash script scheduled through Oozie. Keep this in check, as it will likely generate a lot of small files.
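
A minimal sketch of that approach, assuming the `hdfs` client is on the PATH of the node that runs the action. `HDFS_LOG_DIR`, `start_local_log` and `upload_logs` are names made up for this example, and the HDFS target directory is only a placeholder.

```shell
#!/bin/bash
# Sketch only: HDFS_LOG_DIR, start_local_log and upload_logs are illustrative names.

# Target directory in HDFS (an assumption; pick one your user can write to).
HDFS_LOG_DIR="${HDFS_LOG_DIR:-/tmp/oozie_shell_logs}"

# Create a private directory under /tmp on whichever node runs the action
# and print the path of a fresh log file inside it.
start_local_log() {
    local dir
    dir="$(mktemp -d /tmp/oozie_shell_logs.XXXXXX)" || return 1
    echo "${dir}/$(date +%Y-%m-%d).log"
}

# Copy one local log file into HDFS. The node name and PID are added to the
# target name so that runs on different nodes do not overwrite each other.
upload_logs() {
    local log_file=$1
    local target
    target="${HDFS_LOG_DIR}/$(uname -n)_$$_$(basename "$log_file")"
    hdfs dfs -mkdir -p "$HDFS_LOG_DIR" &&
        hdfs dfs -put -f "$log_file" "$target"
}

# Typical use inside the script run by the Oozie shell action:
#   LOG_FILE="$(start_local_log)"
#   exec >>"$LOG_FILE" 2>&1                # send stdout and stderr to the local log
#   trap 'upload_logs "$LOG_FILE"' EXIT    # upload even when the script exits early
```

Each run uploads one file, so the small-files concern still applies; merging or cleaning up old logs periodically would be a separate step.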

Contributor

@mbigelow As you can see in my script, I am using the touch command to create files on Linux, but when I schedule the script in Oozie it fails with an error saying that touch cannot create the file or directory.

I don't know why this is happening.

Champion
I can't say for certain, but it is probably because the script runs on whatever node the container ends up on. So if /home/mbigelow/logs/20170412... doesn't exist on each and every node, it will fail. I tend to stick with /tmp, as it is always there and writable for all users.
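
One way to guard against that is sketched below: try the preferred directory and fall back to /tmp when it cannot be created on the current node. `pick_log_dir` is a made-up helper name. The sketch uses `id -un` instead of `$USER` on the assumption that `$USER` may not be set inside the container.

```shell
#!/bin/bash
# Sketch only: pick_log_dir is an illustrative helper, not part of Oozie.

# Print a directory that exists and is writable on the current node.
# Tries the preferred directory first and falls back to /tmp/<user>/logs.
pick_log_dir() {
    local preferred=$1
    local fallback
    fallback="/tmp/$(id -un)/logs"
    if mkdir -p "$preferred" 2>/dev/null && [ -w "$preferred" ]; then
        echo "$preferred"
    elif mkdir -p "$fallback"; then
        echo "$fallback"
    else
        return 1
    fi
}

# Typical use at the top of the script:
#   LOG_LOCATION="$(pick_log_dir "/home/$(id -un)/logs")" || exit 1
```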

Contributor

@mbigelow So if I save the logs to /tmp/sanje, I will be able to collect them irrespective of the node the script runs on. Is this correct?

What about these:

LOG_LOCATION=/home/$USER/logs
exec 2>&1

These should also point to the /tmp folder, and then I move the logs to HDFS. Is this correct?

 

Champion
That is correct. Save all logs to /tmp and then upload them to HDFS.

Contributor

@mbigelow I have tried the following:

mkdir -p /tmp/$USER/logs
touch /tmp/$USER/logs/${TIMESTAMP}.success_log

But in the /tmp folder I don't see any folder called logs, and I cannot find the file.

However, when I go to the /tmp folder on the edge node, I can create files and directories.

Please advise where the problem is occurring.

Super Collaborator

Did you check directly on the specific data node that ran the action?

Only that node will have your log locally.
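
If it is unclear which node that was, a rough way to find out is to have the script print where it is running. This is only a sketch, and `describe_run_location` is a made-up helper name.

```shell
#!/bin/bash
# Sketch only: describe_run_location is an illustrative helper.

# Print the node, user and working directory the script is running with.
describe_run_location() {
    echo "host=$(uname -n) user=$(id -un) cwd=$(pwd)"
}

describe_run_location
```

The line goes to the action's stdout, which should show up in the logs of the launcher job for that action. Assuming log aggregation is enabled, those logs can be read with `yarn logs -applicationId <application id>`.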

Contributor
How can I do that?

Super Collaborator

How can you do what?

Which specific part is blocking you?