Log Search is a log analysis / monitoring tool that is shipped with Ambari. Log Search has 2 components: Log Search Portal (server + web) and Log Feeder. The Log Feeder monitors specific log files and ships the parsed log lines to Solr.

To define which files should be monitored and how they should be parsed, you need input config descriptors. The list of descriptors is set in the logfeeder.config.files property of /etc/ambari-logsearch-logfeeder/conf/logfeeder.properties; for an Ambari-managed Log Search service, it is the logfeeder-properties/logfeeder.config.files configuration entry.

If you have a custom service (see: https://cwiki.apache.org/confluence/display/AMBARI/Custom+Services), supporting it in Log Search requires a *-logsearch-conf.xml file inside the {SERVICE_NAME}/{SERVICE_VERSION}/configuration folder. The * can be any custom name; Ambari generates an input.config-*.json file in /etc/ambari-logsearch-logfeeder/conf/ based on that name. The *-logsearch-conf.xml file should contain 3 properties:

- service_name

- component_mappings

- content

Here is an example (ZooKeeper):

<configuration supports_final="false" supports_adding_forbidden="true">
  <property>
    <name>service_name</name>
    <display-name>Service name</display-name>
    <description>Service name for Logsearch Portal (label)</description>
    <value>Zookeeper</value>
    <on-ambari-upgrade add="true"/>
  </property>
  <property>
    <name>component_mappings</name>
    <display-name>Component mapping</display-name>
    <description>Logsearch component logid mapping list (e.g.: COMPONENT1:logid1,logid2;COMPONENT2:logid3)</description>
    <value>ZOOKEEPER_SERVER:zookeeper</value>
    <on-ambari-upgrade add="true"/>
  </property>
  <property>
    <name>content</name>
    <display-name>Logfeeder Config</display-name>
    <description>Metadata jinja template for Logfeeder which contains grok patterns for reading service specific logs.</description>
    <value>
{
  "input":[
    {
      "type":"zookeeper",
      "rowtype":"service",
      "path":"{{default('/configurations/zookeeper-env/zk_log_dir', '/var/log/zookeeper')}}/zookeeper*.log"
    }
  ],
  "filter":[
    {
      "filter":"grok",
      "conditions":{
        "fields":{"type":["zookeeper"]}
      },
      "log4j_format":"%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n",
      "multiline_pattern":"^(%{TIMESTAMP_ISO8601:logtime})",
      "message_pattern":"(?m)^%{TIMESTAMP_ISO8601:logtime}%{SPACE}-%{SPACE}%{LOGLEVEL:level}%{SPACE}\\[%{DATA:thread_name}\\@%{INT:line_number}\\]%{SPACE}-%{SPACE}%{GREEDYDATA:log_message}",
      "post_map_values": {
        "logtime": {
          "map_date":{
            "target_date_pattern":"yyyy-MM-dd HH:mm:ss,SSS"
          }
        }
      }
    }
  ]
}
    </value>
    <value-attributes>
      <type>content</type>
      <show-property-name>false</show-property-name>
    </value-attributes>
    <on-ambari-upgrade add="true"/>
  </property>
</configuration>

The first property is service_name. It is the label for the custom service inside Log Search, and it appears on the troubleshooting page.

The second one is component_mappings, which is important for 2 reasons. First, when you click on the custom service label in the Log Search portal, it selects the proper components for filtering. Second, the logIds (defined in the service descriptors) must be mapped to Ambari components, because Ambari component names differ from Log Search names (ZOOKEEPER_SERVER <-> zookeeper in the example above).

A component can have multiple logIds, because a single component may have several log files that need to be monitored.
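The mapping format given in the property description (COMPONENT1:logid1,logid2;COMPONENT2:logid3) can be illustrated with a small Python sketch. This is not Ambari's own parser; the function name parse_component_mappings is hypothetical and only shows how the string breaks down into components and their logIds.

```python
from typing import Dict, List


def parse_component_mappings(value: str) -> Dict[str, List[str]]:
    """Split 'COMPONENT1:logid1,logid2;COMPONENT2:logid3' into a dict.

    Illustrative helper only (hypothetical name, not part of Ambari).
    """
    mappings: Dict[str, List[str]] = {}
    for entry in value.split(";"):          # one entry per Ambari component
        entry = entry.strip()
        if not entry:
            continue
        component, _, log_ids = entry.partition(":")
        # a component may list several logIds, separated by commas
        mappings[component.strip()] = [
            log_id.strip() for log_id in log_ids.split(",") if log_id.strip()
        ]
    return mappings


# Value from the ZooKeeper example, then the generic form from the description
zookeeper = parse_component_mappings("ZOOKEEPER_SERVER:zookeeper")
generic = parse_component_mappings("COMPONENT1:logid1,logid2;COMPONENT2:logid3")
print(zookeeper)
print(generic)
```

Each logId on the right-hand side should match the "type" of an input defined in the content property.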

The last property is content, a template that is rendered during Log Feeder startup. This means you need to restart the Log Feeders after adding a new service with a proper *-logsearch-conf configuration to the cluster.

The first thing you need here is an "input", which describes the log file(s) monitored by the Log Feeder: "rowtype" should be service, "type" is the logId, and "path" is the log location pattern, which may contain * wildcards as in the example. The path in the example also contains a template expression that reads the ZooKeeper log directory from the Ambari configuration (falling back to /var/log/zookeeper), which matters if the log directory is changed.

The second important block is the "filter" part. There you need to choose "grok" (there is a "json" filter as well, but it only works if the logsearch-log4j-appender is on your classpath). With grok you describe how the log lines should be parsed and which fields are mapped to which Solr fields. Two fields are important here:

- multiline_pattern: marks the start of a new log entry (in the example, a line beginning with a timestamp). Lines that do not start a new entry, such as stack trace lines, are appended to the previous entry (log_message).

- message_pattern: defines how to parse the specific fields and map them to Solr fields. Here logtime and log_message are required; level is optional, but recommended.
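To make the two grok fields more concrete, here is a rough Python analogue of the ZooKeeper filter. It is a sketch, not the Log Feeder implementation: the regular expressions are simplified stand-ins for the grok patterns (TIMESTAMP_ISO8601, LOGLEVEL, DATA, INT, GREEDYDATA), the sample log lines are made up, and the helper names are hypothetical. In this sketch a line that matches the multiline pattern starts a new entry and any other line is appended to the previous one, which is how a leading-timestamp pattern behaves on stack traces.

```python
import re

# Simplified stand-in for %{TIMESTAMP_ISO8601} (assumption of this sketch)
TIMESTAMP = r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?"

# Analogue of multiline_pattern: "^(%{TIMESTAMP_ISO8601:logtime})"
MULTILINE_RE = re.compile(r"^" + TIMESTAMP)

# Analogue of message_pattern; DOTALL lets log_message span continuation lines
MESSAGE_RE = re.compile(
    r"^(?P<logtime>" + TIMESTAMP + r")\s*-\s*"
    r"(?P<level>[A-Z]+)\s*"
    r"\[(?P<thread_name>.*?)@(?P<line_number>\d+)\]\s*-\s*"
    r"(?P<log_message>.*)",
    re.DOTALL,
)


def group_entries(lines):
    """Start a new entry on a timestamped line; append other lines to the last entry."""
    entries = []
    for line in lines:
        if MULTILINE_RE.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += "\n" + line
    return entries


def parse_entry(entry):
    """Return the named fields of one entry, or None if it does not match."""
    match = MESSAGE_RE.match(entry)
    return match.groupdict() if match else None


# Hypothetical log lines in the log4j format from the example
sample = [
    "2017-03-01 12:00:00,123 - INFO  [main:QuorumPeerMain@127] - Starting quorum peer",
    "2017-03-01 12:00:01,456 - ERROR [main:QuorumPeerMain@89] - Unexpected exception",
    "java.io.IOException: disk full",
    "\tat org.example.Foo.bar(Foo.java:42)",
]
parsed = [parse_entry(entry) for entry in group_entries(sample)]
for fields in parsed:
    print(fields)
```

The four input lines become two entries: the stack trace lines end up inside the log_message of the ERROR entry instead of being treated as separate log lines.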

After parsing is done, you can modify the mapped values with post_map_values. In the example, logtime is re-mapped with a specific date pattern so that dates are stored in a proper format inside Solr.

For more details about input configurations, see: https://github.com/apache/ambari/blob/trunk/ambari-logsearch/ambari-logsearch-logfeeder/docs/inputCo...

To figure out the proper pattern for your log files, you can use: https://grokdebug.herokuapp.com/

There are some built-in grok patterns used by Log Search, which you can find here: https://github.com/apache/ambari/blob/trunk/ambari-logsearch/ambari-logsearch-logfeeder/src/main/res... They can be added to the debugging tool by clicking "Add custom patterns".
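One way to picture the map_date step is as reading the captured logtime string with the given date pattern so that it becomes a real timestamp. The Python sketch below assumes that the Java-style pattern yyyy-MM-dd HH:mm:ss,SSS corresponds to the strptime format shown; the sample value and the function name parse_logtime are hypothetical, and this is not how Log Feeder itself is implemented.

```python
from datetime import datetime

# Assumed strptime equivalent of the pattern "yyyy-MM-dd HH:mm:ss,SSS"
PYTHON_DATE_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_logtime(raw: str) -> datetime:
    """Interpret a captured logtime string using the date pattern."""
    return datetime.strptime(raw, PYTHON_DATE_FORMAT)


# Hypothetical logtime value as captured by the message pattern
logtime = parse_logtime("2017-03-01 12:00:00,123")
iso_value = logtime.isoformat(timespec="milliseconds")
print(iso_value)
```

If the pattern does not match the timestamps in your log file, the date cannot be interpreted, so it is worth checking the pattern against a real log line.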

Comments
New Contributor

Hi, oszabo. I ran into some problems when I tried to add the HAWQ service to Log Search.

1. I added the hawq-logsearch-conf.xml file to /var/lib/ambari-server/resources/common-services/HAWQ/2.0.0/configuration/. But after I finished "Add Service", Log Search was not flagged for restart. Did I miss something after adding the *-logsearch-conf.xml?

2. Should I change the logfeeder.config.files configuration entry to add input.config-hawq.json?

New Contributor

Hi,
Thanks for the details. It has been really useful.

What is the best way to trigger the required Log Search restart after adding a new service to Log Search, or after changing some of the *-logsearch-conf.xml configurations for a new third-party service in the UI?

I can see it working for HDP stack services. For instance, changes in zookeeper-logsearch-conf.xml require Log Search to restart.
I don't see <configuration-dependencies> being used in the metainfo.xml of Log Search.

Maybe it is happening in the UI .js code, but for additional services we can't hack that code.

Thanks

New Contributor

After a conversation with the Hortonworks developers working on Log Search, two conclusions:
1. For additional 3rd party services, nothing enforces a restart in Ambari. This is not part of the current 2.5.x design. The Log Feeders should be restarted manually.

2. The integration of services in Ambari Log Search for 3.0 will be slightly different. Services won't include a *-logsearch-conf.xml configuration file anymore. Instead, they should include a file called input.config-<service_name>.json.j2 at package/templates in their stack definition.

This template won't be editable as an Ambari property. The template will be handled by Ambari internally (post install hook) to create a JSON file, which will be deployed to the Log Feeders for the first time.