
How to get logs of shell scripts in Oozie

Explorer

I have a shell script in HDFS. I want to schedule this script in Oozie.

 

    #!/bin/bash
    LOG_LOCATION=/home/$USER/logs
    exec 2>&1

    [ $# -ne 1 ] && { echo "Usage : $0 table"; exit 1; }

    table=$1

    TIMESTAMP=$(date "+%Y-%m-%d")
    touch "${LOG_LOCATION}/${TIMESTAMP}.success_log"
    touch "${LOG_LOCATION}/${TIMESTAMP}.fail_log"
    success_logs=${LOG_LOCATION}/${TIMESTAMP}.success_log
    failed_logs=${LOG_LOCATION}/${TIMESTAMP}.fail_log

    # Function to log the status of each step
    function log_status
    {
        status=$1
        message=$2
        if [ "$status" -ne 0 ]; then
            echo "$(date +"%Y-%m-%d %H:%M:%S") [ERROR] $message [Status] $status : failed" | tee -a "${failed_logs}"
            #echo "Please find the attached log file for more details"
            exit 1
        else
            echo "$(date +"%Y-%m-%d %H:%M:%S") [INFO] $message [Status] $status : success" | tee -a "${success_logs}"
        fi
    }

    # Run the Hive step directly (the original backticks around this command
    # would try to execute hive's stdout as a command)
    hive -e "create table testing.${table} as select * from database.${table}"

    g_STATUS=$?
    log_status $g_STATUS "Hive create ${table}"

 

I have some questions about using Oozie to schedule shell scripts.

 

1) In my script I have failed and success logs which tell me whether the script succeeded or failed. Can I get this kind of logging into HDFS as well when the script runs through Oozie?

 

2) In the script I am also collecting the stdout logs, as you can see in the 2nd and 3rd lines after the shebang. Can this also be achieved in HDFS?

 

If so, how can I achieve both of these in HDFS while scheduling shell scripts through Oozie?

 

Could anyone explain, please?

 

If there are better ways to do this in Oozie, please let me know.

1 ACCEPTED SOLUTION

Champion
That is correct. Save all logs to /tmp and then upload them to HDFS.


14 REPLIES

Champion
Your best bet is to write the logs out to a temporary location on the local filesystem and then upload them to HDFS at the end. It is wonky, but it is the best way to do this with a bash script scheduled through Oozie. Be careful to keep this in check, as it will likely generate a lot of small files.
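
Roughly this pattern (a sketch only; the local and HDFS paths here are just examples, adjust them to your environment):

    #!/bin/bash
    # Log to a node-local temp dir while the action runs
    LOG_DIR=/tmp/$USER/logs
    mkdir -p "$LOG_DIR"
    TIMESTAMP=$(date "+%Y-%m-%d")

    # ... do the work, tee-ing output into $LOG_DIR as in your script ...

    # At the very end, push the local logs into HDFS so they survive
    # the container being torn down (-f overwrites an existing copy)
    hdfs dfs -mkdir -p /user/$USER/logs
    hdfs dfs -put -f "$LOG_DIR/$TIMESTAMP".* /user/$USER/logs/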

Explorer

@mbigelow As you can see in my script, I am using the touch command to create the log files on the local Linux filesystem, but when I schedule the script in Oozie, it fails with a touch error saying it cannot create the file or directory.

I don't know why this is happening.

Champion
I can't say for certain, but it is probably because the script runs on whatever node the container ends up on. So if /home/mbigelow/logs/20170412... doesn't exist on each and every node, it will fail. I tend to stick with /tmp, as it is always there and writable by all users.

Explorer

@mbigelow So if I save the logs to /tmp/sanje, I will be able to collect them irrespective of the node the script runs on. Is this correct?

 

What about these:

    LOG_LOCATION=/home/$USER/logs
    exec 2>&1

Should these also point to the /tmp folder, and then I move the logs to HDFS afterwards? Is this correct?

 

Champion
That is correct. Save all logs to /tmp and then upload them to HDFS.
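
For the exec part specifically, something along these lines should work (just a sketch; the file names are examples):

    LOG_LOCATION=/tmp/$USER/logs
    mkdir -p "$LOG_LOCATION"
    TIMESTAMP=$(date "+%Y-%m-%d")
    # Send all stdout and stderr of the script to a local file
    exec >>"$LOG_LOCATION/$TIMESTAMP.out" 2>&1

    # ... rest of the script ...

    # and at the end, upload it to HDFS
    hdfs dfs -put -f "$LOG_LOCATION/$TIMESTAMP.out" /user/$USER/logs/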

Explorer

@mbigelow I have tried the following:

 

    mkdir -p /tmp/$USER/logs
    touch /tmp/$USER/logs/${TIMESTAMP}.success_log

 

But in the /tmp folder I don't see any folder called logs, and I cannot find the file.

 

But when I go to the /tmp folder on the edge node, I can create files and directories.

 

Please advise where the problem is occurring.

Super Collaborator

Did you check directly on the specific datanode that ran the action?

Only that node will have your log locally.

Explorer
How can I do that?

Super Collaborator

How can you do what?

Which specific part is blocking you?

Explorer
I mean, how can I check the folders on the datanodes? I don't know how to do it.

Super Collaborator

Well, one way to do it would be to connect to the datanodes over SSH.

For example, using PuTTY or WinSCP.
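
From a shell it would look something like this (the hostname and user here are made up):

    # Check a specific datanode for the local log directory
    ssh youruser@datanode01.example.com 'ls -l /tmp/youruser/logs'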

 

 

Explorer

@mathieu.d So here I am creating files in the /tmp folder on the datanodes, right?

 

Can I do:

    hdfs dfs -put /tmp/$USER/...success.log /user/$USER/logs/...success.log

Will this work?

Explorer

@mathieu.d @mbigelow Thank you both; I was able to achieve the desired result.

 

First I stored the logs locally and then uploaded them to HDFS.
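
In outline it looked like this (simplified; the paths are just examples):

    LOG_DIR=/tmp/$USER/logs
    mkdir -p "$LOG_DIR"
    TIMESTAMP=$(date "+%Y-%m-%d")
    # stdout and stderr of the whole script go to a local file
    exec >>"$LOG_DIR/$TIMESTAMP.out" 2>&1
    touch "$LOG_DIR/$TIMESTAMP.success_log" "$LOG_DIR/$TIMESTAMP.fail_log"

    # ... hive steps and log_status calls as in the original script ...

    # Finally, upload the local logs to HDFS
    hdfs dfs -mkdir -p /user/$USER/logs
    hdfs dfs -put -f "$LOG_DIR/$TIMESTAMP".* /user/$USER/logs/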

New Contributor

How did you store the logs locally? Can you please share the way you did it?