
Where are the logfiles for Spark2 Executors?


Explorer

I have a very simple Spark Streaming application and can't figure out where the log messages go. Here is the code:

 

import json
import logging
import os
from datetime import datetime

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

def createContext(loggerName):
    sc = SparkContext(appName="Compute Streaming Stats")
    sc.setLogLevel("INFO")

    # Grab the JVM-side log4j logger via py4j
    log4jLogger = sc._jvm.org.apache.log4j
    scLogger = log4jLogger.LogManager.getLogger(loggerName)

    # Micro-batch interval of 180 seconds
    ssc = StreamingContext(sc, 180)

    kafkaStream = KafkaUtils.createStream(ssc,
                                          "my.server:2181",
                                          "compute-streaming-stats",
                                          {"test": 1})

    parsed = kafkaStream.map(lambda v: json.loads(v[1]))

    # Count the number of messages in the batch
    count_this_batch = kafkaStream.count()

    # Note: these log the DStream objects themselves, not per-batch values
    scLogger.info(count_this_batch)
    scLogger.info(parsed)

    return ssc

#end createContext
    
if __name__ == "__main__":

    ## Log setup
    loggerName = __name__
    dtmStamp = datetime.now().strftime('%Y_%m_%d_%H_%M_%S')
    logPath='/home/<username>/logs/' + os.path.basename(__file__) + '-' + dtmStamp + '.log'

    logger = logging.getLogger(loggerName)
    logger.setLevel(logging.INFO)

    # create a file handler
    handler = logging.FileHandler(logPath)
    handler.setLevel(logging.INFO)

    # create a logging format
    formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    handler.setFormatter(formatter)

    # add the handlers to the logger
    logger.addHandler(handler)

    ssc = StreamingContext.getOrCreate('/home/<username>/tmp/checkpoint_v0', lambda: createContext(loggerName))
    ssc.start()
    ssc.awaitTermination()

#end main if
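For what it's worth, the driver-side `logging` setup above can be exercised on its own, outside Spark. A minimal stand-alone sketch (stdlib only, using a temp directory in place of the redacted `/home/<username>/logs/` path):

```python
import logging
import os
import tempfile
from datetime import datetime

# Same handler/formatter pattern as the driver setup above, but
# pointed at a temp directory so it runs anywhere.
dtm_stamp = datetime.now().strftime('%Y_%m_%d_%H_%M_%S')
log_path = os.path.join(tempfile.gettempdir(),
                        'driver-demo-' + dtm_stamp + '.log')

logger = logging.getLogger('driver_demo')
logger.setLevel(logging.INFO)

handler = logging.FileHandler(log_path)
handler.setLevel(logging.INFO)
handler.setFormatter(logging.Formatter(
    '%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
logger.addHandler(handler)

logger.info('driver-side message')
print(log_path)
```

This confirms the `FileHandler` configuration itself works; the catch (discussed below in the thread) is that it only captures messages emitted on the driver machine.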

So where is the "count_this_batch" value written to? Where is scLogger writing to?

 

YARN is set to aggregate the logs. I have looked in the folders set in the YARN Log DIR property.

 

I have also searched for it in

 

yarn logs -applicationId application_12323123213
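For reference, pulling aggregated container logs from YARN usually looks like the following (the application id is a placeholder; log aggregation must be enabled, and the application must have finished before the logs are aggregated):

```shell
# List recent applications to find the id
yarn application -list -appStates FINISHED

# Dump all container logs (driver + executors) for one application
yarn logs -applicationId application_1234567890123_0001 > app.log

# Executor log4j output typically lands in each container's stderr,
# so look for the per-container sections in the dump
grep -n "Container: " app.log
```

Each executor runs in its own YARN container, so its log4j output shows up under that container's section of the dump rather than in any file on the driver host.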

 

Any leads highly appreciated.


Re: Where are the logfiles for Spark2 Executors?

Expert Contributor

I haven't had a chance to run your code locally, but I believe it should be where you've defined your logPath to be:

 

logPath='/home/<username>/logs/' + os.path.basename(__file__) + '-' + dtmStamp + '.log'

 

Did you forget to replace the <username> with the actual username, or was it redacted for sharing purposes?

 

Re: Where are the logfiles for Spark2 Executors?

Explorer

Ah, it's not as simple as it appears. I added <username> explicitly to obfuscate my name. Spark executes the code on several nodes, and I want to know where the log messages are written to. It's definitely not in the log file I specified.
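One common workaround for exactly this situation (a sketch, not from this thread — the `get_executor_logger` helper is hypothetical) is to configure the logger inside the task function itself, so each executor machine writes its own host-local file instead of relying on a handler set up on the driver:

```python
import logging
import os
import socket
import tempfile

def get_executor_logger(name):
    # Hypothetical helper (not a Spark API): build a logger that writes
    # to a file local to whichever machine the code runs on. On a
    # cluster you would call this inside the task (e.g. from a
    # mapPartitions function), so every executor host gets its own file.
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on reuse
        path = os.path.join(tempfile.gettempdir(),
                            'executor-%s.log' % socket.gethostname())
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger

# Local demonstration of the pattern
log = get_executor_logger('partition_worker')
log.info('processed a partition on %s', socket.gethostname())
```

You then have to collect the per-host files yourself (or rely on YARN log aggregation to pick up whatever the executors write to their container log directories).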

 

Hope that helps clarify.

Re: Where are the logfiles for Spark2 Executors?

Expert Contributor

I see, thanks. Are you able to print the results on the console using a simple Spark Kafka streaming app (https://www.cloudera.com/documentation/enterprise/5-8-x/topics/spark_streaming.html)? If yes, we'd need to look at why the logging part is not working.

Re: Where are the logfiles for Spark2 Executors?

Explorer

Yes, pprint works fine. But I want to log the messages to a log file.


@AutoIN wrote:

I see, thanks. Are you able to print the results on the console using a simple Spark Kafka streaming app (https://www.cloudera.com/documentation/enterprise/5-8-x/topics/spark_streaming.html)? If yes, we'd need to look at why the logging part is not working.