Log Analysis Language

Log Analysis Language (LAL) in SkyWalking is essentially a Domain-Specific Language (DSL) to analyze logs. You can use LAL to parse, extract, and save the logs, as well as correlate the logs with traces (by extracting the trace ID, segment ID and span ID) and metrics (by generating metrics from the logs and sending them to the meter system).

The LAL config files are in YAML format, and are located under directory lal. You can set log-analyzer/default/lalFiles in the application.yml file or set environment variable SW_LOG_LAL_FILES to activate specific LAL config files.

OTLP log attribute mapping

When logs arrive via the OTLP receiver, resource attributes are mapped to LogData fields:

Resource attribute	LogData field	Notes
`service.name`	`service`	SkyWalking service name
`service.instance.id`	`serviceInstance`	OTel standard (spec). Falls back to `service.instance` for backward compatibility.
`service.layer`	`layer`	Routes to the LAL rule with matching `layer` declaration

Log record attributes are available via tag("attribute_name") in LAL rules. Attribute keys retain their original names (dots are NOT converted to underscores in log attributes).

All OTLP resource attributes are also available via sourceAttribute("attribute_name") in LAL rules. Unlike tag() which reads from persistent log record tags (tagsRawData), sourceAttribute() reads from non-persistent source context — the values are available during LAL processing but are NOT stored. Use tag 'key': sourceAttribute("attr") in the extractor to selectively persist specific attributes.
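
As a minimal sketch of the distinction above, the rule below reads one log record attribute with tag() and persists one resource attribute through sourceAttribute(). The rule name, the choice of the GENERAL layer, and the attribute names http.method and k8s.pod.name are assumptions made for illustration; they are not taken from the OAP distribution.

```yaml
rules:
  - name: otlp-attribute-demo        # hypothetical rule name
    layer: GENERAL
    dsl: |
      filter {
        // Log record attribute: already persisted as a tag, key keeps its dots.
        if (tag("http.method") == "OPTIONS") {
          abort {}
        }
        extractor {
          // Resource attribute: not stored unless it is copied into a tag.
          tag 'pod': sourceAttribute("k8s.pod.name")
        }
        sink {}
      }
```

With a rule like this, only the pod tag would be added to the stored log; every other resource attribute remains visible during processing but is not persisted.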

Layer

Layer should be declared in the LAL script to represent the analysis scope of the logs. LAL rules are routed by layer — only rules matching the incoming log’s layer are evaluated.

Inline layer declarations (`layerDefinitions:`)

A LAL file may declare its own custom layers with a top-level layerDefinitions: block. Each entry is funneled through Layer.register(name, ordinal, normal) before the rules in the same file compile, so a LAL file is fully self-describing — a new monitoring target can land as a single LAL file without an enum edit elsewhere in the OAP source.

layerDefinitions:
  - name: IOT_FLEET    # upper-snake-case, must match [A-Z][A-Z0-9_]*
    ordinal: 1000      # unique across all layers; >= 1000 recommended
    normal: true       # true = agent-installed (default), false = conjectured/virtual

rules:
  - name: iot-fleet-access
    layer: IOT_FLEET
    dsl: |
      filter {
        text { regexp $/(?<status>\d+)\s+(?<path>\S+)/$ }
        sink { sampler { rateLimit { rpm 1800 } } }
      }

Notes:

Storage encoding is the ordinal int, persisted in BanyanDB / Elasticsearch / JDBC. Every OAP node that reads or writes a given layer must agree on its (name, ordinal) mapping — deploy a LAL file with layerDefinitions: identically across all nodes.
Identical re-registration is a no-op, so the same IOT_FLEET entry can appear in multiple LAL files (and additionally in a MAL file, in layer-extensions.yml, or via the LayerExtension SPI). Conflicting registrations cause OAP boot to fail loudly with the offending file in the stack trace.
Ordinals 0–50 are in active use by the OAP distribution’s built-in layers; 51–999 are reserved by convention for future built-ins. External layers should start at >= 1000 — enforcement is not strict, but staying above the reserved band avoids upgrade-time collisions.
layer: auto works with extension layers too — the extractor body can call layer "IOT_FLEET" and the runtime resolves it through the registry.

Three other registration paths exist for layers that are not specific to a LAL file: an operator-managed layer-extensions.yml, a LayerExtension Java SPI for plugin jars, and the built-in static fields in Layer.java for distribution layers. See Layer.java javadoc for the full picture.

When layer: auto is declared, the rule matches logs where service.layer is absent (common for OTLP sources that don’t set this attribute). The script is expected to set the layer in the extractor:

rules:
  - name: detect-ios
    layer: auto
    dsl: |
      filter {
        if (sourceAttribute("os.name") != "iOS" && sourceAttribute("os.name") != "iPadOS") {
          abort {}
        }
        extractor {
          layer "IOS"
          instance sourceAttribute("service.version")
          tag 'device.model': sourceAttribute("device.model.identifier")
        }
        sink {}
      }
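
The same pattern should also work for a layer declared inline with layerDefinitions, since the runtime resolves the layer name through the registry. The following is a sketch under that assumption; the rule name and the fleet.type resource attribute are hypothetical.

```yaml
layerDefinitions:
  - name: IOT_FLEET
    ordinal: 1000
    normal: true

rules:
  - name: detect-iot-fleet           # hypothetical rule name
    layer: auto
    dsl: |
      filter {
        // "fleet.type" is a hypothetical resource attribute.
        if (sourceAttribute("fleet.type") != "iot") {
          abort {}                   // not ours, let other auto rules try
        }
        extractor {
          layer "IOT_FLEET"          // resolved through the layer registry
        }
        sink {}
      }
```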

In layer: auto mode, abort and drop have distinct meanings:

abort {} — “not mine, let others try.” If all auto rules abort, the log falls back to GENERAL layer processing automatically.
drop {} — “mine, but sampled out.” The rule claimed the log (no GENERAL fallback) but chose not to persist it (e.g., sampling). Use this for rate limiting or conditional dropping within a recognized log type.

If an auto rule claims the log (doesn’t abort) but doesn’t set a layer, a warning is logged and the log is dropped at persistence time.

Filter

A filter is a group of parsers, extractors and sinks. Users can use one or more filters to organize their processing logic. Every log will be sent to all filters in an LAL rule. A log sent to the filter is available as the property log in the LAL, so you can access the service name of the log via log.service. For all available fields of log, please refer to the protocol definition.

All components are executed sequentially in the order they are declared.

Global Functions

Globally available functions may be used in all components (i.e. parsers, extractors, and sinks) where necessary.

abort

By default, all components declared are executed no matter what flags (dropped, saved, etc.) have been set. There are cases where you may want the filter chain to stop earlier when specified conditions are met. abort function aborts the remaining filter chain from where it’s declared, and all the remaining components won’t be executed at all. abort function serves as a fast-fail mechanism in LAL.

filter {
    if (log.service == "TestingService") { // Don't waste resources on TestingServices
        abort {} // all remaining components won't be executed at all
    }
    // ... parsers, extractors, sinks
}

Note that when you put regexp in an if statement, you need to surround the expression with () like regexp(<the expression>), instead of regexp <the expression>.

tag

The tag function provides a convenient way to get the value of a tag key.

We can add tags like the following:

[
   {
      "tags":{
         "data":[
            {
               "key":"TEST_KEY",
               "value":"TEST_VALUE"
            }
         ]
      },
      "body":{
         ...
      }
      ...
   }
]

And we can use this method to get the value of the tag key TEST_KEY.

filter {
    if (tag("TEST_KEY") == "TEST_VALUE") {
         ...
    }
}

sourceAttribute

sourceAttribute function provides access to non-persistent source context attributes. For OTLP logs, these are the resource attributes (e.g., os.name, device.model.identifier). Unlike tag(), these values are NOT persisted in tagsRawData — they are only available during LAL processing.

filter {
    if (sourceAttribute("os.name") == "iOS") {
        extractor {
            layer "IOS"
            tag 'device.model': sourceAttribute("device.model.identifier")
        }
    }
    sink {}
}

Returns empty string if the key is not found.
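
Because a missing key yields an empty string, a rule can test for presence by comparing against "". The sketch below assumes this comparison is the intended way to detect a missing attribute; the rule name is hypothetical, and the attribute and layer names are reused from the example above.

```yaml
rules:
  - name: require-device-model       # hypothetical rule name
    layer: auto
    dsl: |
      filter {
        // An empty string means the resource attribute was not present.
        if (sourceAttribute("device.model.identifier") == "") {
          abort {}
        }
        extractor {
          layer "IOS"
          tag 'device.model': sourceAttribute("device.model.identifier")
        }
        sink {}
      }
```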

Parser

Parsers are responsible for parsing the raw logs into structured data in SkyWalking for further processing. There are 3 types of parsers at the moment, namely json, yaml, and text.

When a log is parsed, there is a corresponding property available, called parsed, injected by LAL. Property parsed is typically a map, containing all the fields parsed from the raw logs. For example, if the parser is json / yaml, parsed is a map containing all the key-values in the json / yaml; if the parser is text, parsed is a map containing all the captured groups and their values (for regexp and grok).

All parsers share the following options:

Option	Type	Description	Default Value
`abortOnFailure`	`boolean`	Whether the filter chain should abort if the parser failed to parse / match the logs	`true`

See examples below.

`json`

filter {
    json {
        abortOnFailure true // this is optional because it's default behaviour
    }
}

json reads the JSON body of the native log protocol. When the JSON body is empty but a plain-text body is present — for example, the OTLP log receiver delivers every string body as text, even JSON-shaped ones — the parser tries the text body as JSON instead. A parse failure of either form follows abortOnFailure.

When json parses successfully from a text body, the log is normalized to a JSON body for the matching rule: the rule persists it with content type JSON and log.body reads the JSON content within that rule. This makes a JSON-shaped log delivered over any transport persist and render as JSON once a json {} rule matches it. The normalization is scoped to the matching rule — other rules analyzing the same log still see the original text body.

A parse failure that aborts the log is reported at WARN, rate-limited to one report per minute per parser with the number of suppressed failures included in the next report. With abortOnFailure false, a failure is expected control flow: it is only logged at DEBUG, the log continues through the filter chain, and parsed.* reads return the metadata fallback fields (or null).
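
A lenient rule along these lines might look as follows. This is only a sketch: the rule name and the level field are assumptions, and the null check is there because parsed.* reads may return null after a failed parse.

```yaml
rules:
  - name: lenient-json               # hypothetical rule name
    layer: GENERAL
    dsl: |
      filter {
        json {
          abortOnFailure false       // keep going when the body is not JSON
        }
        extractor {
          // Guard against a missing value after a failed parse.
          if (parsed.level != null) {
            tag level: parsed.level
          }
        }
        sink {}
      }
```

With this setting, logs that are not JSON are still persisted by the sink, just without the level tag.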

`yaml`

filter {
    yaml {
        abortOnFailure true // this is optional because it's default behaviour
    }
}

`text`

For unstructured logs, there are some text parsers for use.

regexp

The regexp parser uses a regular expression (regexp) to parse the logs. It leverages the captured groups of the regexp, and all the captured groups can be used later in the extractors or sinks. regexp returns a boolean indicating whether the log matches the pattern or not.

filter {
    text {
        abortOnFailure true // this is optional because it's default behaviour
        // this is just a demo pattern
        regexp "(?<timestamp>\\d{8}) (?<thread>\\w+) (?<level>\\w+) (?<traceId>\\w+) (?<msg>.+)"
    }
    extractor {
        tag level: parsed.level
        // we add a tag called `level` and its value is parsed.level, captured from the regexp above
        traceId parsed.traceId
        // we also extract the trace id from the parsed result, which will be used to associate the log with the trace
    }
    // ...
}
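
The layerDefinitions example earlier on this page writes its pattern between $/ and /$ with single backslashes. Assuming that form is accepted wherever regexp is used, the demo pattern above could be written without doubled backslashes, as sketched here. The rule name is hypothetical.

```yaml
rules:
  - name: text-regexp-demo           # hypothetical rule name
    layer: GENERAL
    dsl: |
      filter {
        text {
          // Same demo pattern as above, without escaping each backslash.
          regexp $/(?<timestamp>\d{8}) (?<thread>\w+) (?<level>\w+) (?<traceId>\w+) (?<msg>.+)/$
        }
        extractor {
          tag level: parsed.level
          traceId parsed.traceId
        }
        sink {}
      }
```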

Extractor

Extractors aim to extract metadata from the logs. The metadata can be a service name, a service instance name, an endpoint name, or even a trace ID, all of which can be associated with the existing traces and metrics.

Local variables (`def`)

You can use def to declare local variables in the extractor (or at the filter level). This is useful when an expression is reused multiple times, or when you want to break a long chain into readable steps.

The syntax is:

def variableName = expression
def variableName = expression as TypeName

The variable type is inferred from the initializer expression at compile time. def is not limited to JSON — it works with any value access expression whose type is resolvable on the classpath, including protobuf getter chains, log.* fields, and Gson JSON method chains. Subsequent method calls on the variable are validated at compile time against the inferred type.

You can optionally add an explicit as type cast to narrow the variable type. The cast type can be a built-in type (String, Long, Integer, Boolean) or a fully qualified class name:

def value = someExpression as com.example.MyType

This is useful when the compiler infers a type that is too general (e.g., Object from a generic API return) and you know the concrete runtime type. The cast tells the compiler which type to use for subsequent method chain validation. Note that as performs a Java cast — it does not convert between types. For JSON conversion, use toJson() or toJsonArray() instead.

The FQCN must be resolvable on the classpath at compile time. If the class is not found, the OAP server will fail to start.

Two built-in conversion functions are provided for JSON interoperability:

toJson(expr) — converts a value to a Gson JsonObject. Works with JSON strings, Map, and protobuf Struct.
toJsonArray(expr) — converts a value to a Gson JsonArray. Works with JSON array strings.

After declaration, the variable can be used in subsequent expressions with full null-safe navigation support (?.).

Def variables can also be used as method arguments. This is useful when you need to look up a dynamic key:

filter {
    json {}
    extractor {
        def key = parsed.fieldName as String
        def config = toJson(parsed.metadata)
        tag 'val': config?.get(key)?.getAsString()
    }
    sink {}
}

Example — extracting fields from a protobuf input type (no JSON conversion needed):

filter {
    extractor {
        def resp = parsed?.response
        tag 'status.code': resp?.responseCode?.value
        tag 'resp.flags': resp?.responseCodeDetails
    }
    sink {}
}

Example — extracting JWT claims from envoy access log filter metadata via toJson():

filter {
    extractor {
        def jwt = toJson(parsed?.commonProperties?.metadata
            ?.filterMetadataMap?.get("envoy.filters.http.jwt_authn"))
        def payload = jwt?.getAsJsonObject("payload")
        if (payload != null) {
            tag 'email': payload?.get("email")?.getAsString()
            tag 'group': payload?.get("group")?.getAsString()
        }
    }
    sink {}
}

Example — parsing a JSON log body field into a structured object:

filter {
    json {}
    extractor {
        def config = toJson(parsed.metadata)
        tag 'env': config?.get("env")?.getAsString()
        tag 'region': config?.getAsJsonObject("location")?.get("region")?.getAsString()
    }
    sink {}
}
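
None of the examples above use toJsonArray(), so here is a hedged sketch of how it might be combined with def. It assumes that parsed.recipients (a hypothetical field) holds a JSON array string and that an integer literal is accepted as a method argument; get(int) and getAsString() are standard Gson methods.

```yaml
rules:
  - name: json-array-demo            # hypothetical rule name
    layer: GENERAL
    dsl: |
      filter {
        json {}
        extractor {
          // "recipients" is a hypothetical field containing a JSON array string.
          def recipients = toJsonArray(parsed.recipients)
          // Gson's JsonArray.get(0) throws if the array is empty,
          // so this sketch assumes at least one element.
          tag 'first.recipient': recipients?.get(0)?.getAsString()
        }
        sink {}
      }
```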

Standard fields

service

service extracts the service name from the parsed result, and sets it into the LogData, which will be persisted (if not dropped) and is used to associate with traces / metrics.

instance

instance extracts the service instance name from the parsed result, and sets it into the LogData, which will be persisted (if not dropped) and is used to associate with traces / metrics.

endpoint

endpoint extracts the endpoint name from the parsed result, and sets it into the LogData, which will be persisted (if not dropped) and is used to associate with traces / metrics.

traceId

traceId extracts the trace ID from the parsed result, and sets it into the LogData, which will be persisted (if not dropped) and is used to associate with traces / metrics.

segmentId

segmentId extracts the segment ID from the parsed result, and sets it into the LogData, which will be persisted (if not dropped) and is used to associate with traces / metrics.

spanId

spanId extracts the span ID from the parsed result, and sets it into the LogData, which will be persisted (if not dropped) and is used to associate with traces / metrics.

timestamp

timestamp extracts the timestamp from the parsed result, and sets it into the LogData, which will be persisted (if not dropped) and is used to associate with traces / metrics.

The parameter of timestamp can be a timestamp in milliseconds:

filter {
    // ... parser

    extractor {
        timestamp parsed.time as String
    }
}

or a datetime string with a specified pattern:

filter {
    // ... parser

    extractor {
        timestamp parsed.time as String, "yyyy-MM-dd HH:mm:ss"
    }
}

layer

layer extracts the layer from the parsed result, and sets it into the LogData, which will be persisted (if not dropped) and is used to associate with the service.

tag

tag extracts the tags from the parsed result, and sets them into the LogData. The form of this extractor should look something like this: tag key1: value1, key2: value2. You may use the properties of parsed as both keys and values.

filter {
    // ... parser

    extractor {
        tag level: parsed.level, (parsed.statusCode): parsed.statusMsg
        tag anotherKey: "anotherConstantValue"
        layer 'GENERAL'
    }
}
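
The standard fields described above can be combined in a single extractor. The following sketch assumes a JSON log body whose keys (service, instance, endpoint, traceId, segmentId, spanId, time) are hypothetical; adjust them to the fields your logs actually carry.

```yaml
rules:
  - name: standard-fields-demo       # hypothetical rule name
    layer: GENERAL
    dsl: |
      filter {
        json {}
        extractor {
          // All parsed.* keys below are hypothetical field names.
          service parsed.service
          instance parsed.instance
          endpoint parsed.endpoint
          traceId parsed.traceId
          segmentId parsed.segmentId
          spanId parsed.spanId
          timestamp parsed.time as String, "yyyy-MM-dd HH:mm:ss"
        }
        sink {}
      }
```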

Output fields

When a rule declares a custom outputType (see Output Type), the extractor can set fields specific to that output type. Any identifier in the extractor that is not a standard field (listed above) is treated as an output field assignment. The syntax is the same as standard fields:

fieldName valueExpression as Type

The LAL compiler validates at boot time that a matching setter exists on the output type class (e.g., setStatement(String) for field statement). If no setter is found, the OAP server will fail to start, ensuring early error detection.

See Slow SQL and Sampled Trace for concrete examples.

`metrics`

metrics extracts / generates metrics from the logs, and sends the generated metrics to the meter system. You may configure MAL for further analysis of these metrics. The dedicated MAL config files are under directory log-mal-rules, and you can set log-analyzer/default/malFiles to enable configured files.

# application.yml
# ...
log-analyzer:
  selector: ${SW_LOG_ANALYZER:default}
  default:
    lalFiles: ${SW_LOG_LAL_FILES:my-lal-config} # files are under "lal" directory
    malFiles: ${SW_LOG_MAL_FILES:my-lal-mal-config, folder1/another-lal-mal-config, folder2/*} # files are under "log-mal-rules" directory

Examples are as follows:

filter {
    // ...
    extractor {
        service parsed.serviceName
        metrics {
            name "log_count"
            timestamp parsed.timestamp
            labels level: parsed.level, service: parsed.service, instance: parsed.instance
            value 1
        }
        metrics {
            name "http_response_time"
            timestamp parsed.timestamp
            labels status_code: parsed.statusCode, service: parsed.service, instance: parsed.instance
            value parsed.duration
        }
    }
    // ...
}

The extractor above generates a metric named log_count, with a label level and the value 1. After that, you can configure MAL rules to calculate the log count grouped by logging level like this:

# ... other configurations of MAL

metrics:
  - name: log_count_debug
    exp: log_count.tagEqual('level', 'DEBUG').sum(['service', 'instance']).increase('PT1M')
  - name: log_count_error
    exp: log_count.tagEqual('level', 'ERROR').sum(['service', 'instance']).increase('PT1M')

The other metric generated is http_response_time, so you can configure MAL rules to generate more useful metrics like percentiles.

# ... other configurations of MAL

metrics:
  - name: response_time_percentile
    exp: http_response_time.sum(['le', 'service', 'instance']).increase('PT5M').histogram().histogram_percentile([50,75,90,95,99])

Sink

Sinks are the persistence layer of the LAL. An explicit sink {} block is required for any data to be persisted. Without a sink {} block, no data is saved; this applies to all LAL rules, including those using a custom outputType.

Within the sink, you can use samplers, droppers, and enforcers to control which logs are persisted. An empty sink {} block means all logs are saved unconditionally.

Sampler

Sampler allows you to save the logs in a sampling manner. Currently, the following sampling strategies are supported:

rateLimit: samples at most n logs per minute, where n is set by rpm. rateLimit("SamplerID") requires an ID for the sampler. Sampler declarations with the same ID share the same sampler instance, and therefore the same rpm and resetting logic.
possibility: each log is sampled with the given percentage probability. A number produced by the Java random number generator is compared against the given percentage to decide whether the log is sampled.

We welcome contributions on more sampling strategies. If multiple samplers are specified, the last one determines the final sampling result. See examples in Enforcer.

Example 1, rateLimit:

filter {
    // ... parser

    sink {
        sampler {
            if (parsed.service == "ImportantApp") {
                rateLimit("ImportantAppSampler") {
                    rpm 1800  // samples at most 1800 logs per minute for service "ImportantApp"
                }
            } else {
                rateLimit("OtherSampler") {
                    rpm 180   // samples at most 180 logs per minute for other services than "ImportantApp"
                }
            }
        }
    }
}

Example 2, possibility:

filter {
    // ... parser

    sink {
        sampler {
            if (parsed.service == "ImportantApp") {
                possibility(80) { // samples 80% of the logs for service "ImportantApp"
                }
            } else {
                possibility(30) { // samples 30% of the logs for other services than "ImportantApp"
                }
            }
        }
    }
}

Dropper

Dropper is a special sink that drops all logs without exception. This is useful when you want to drop debugging logs.

filter {
    // ... parser

    sink {
        if (parsed.level == "DEBUG") {
            dropper {}
        } else {
            sampler {
                // ... configs
            }
        }
    }
}

Or, if you have multiple filters and some of them only extract metrics, only one of the filters has to persist the logs.

filter { // filter A: this is for persistence
    // ... parser

    sink {
        sampler {
            // .. sampler configs
        }
    }
}
filter { // filter B:
    // ... extractor to generate many metrics
    extractor {
        metrics {
            // ... metrics
        }
    }
    sink {
        dropper {} // drop all logs because they have been saved in "filter A" above.
    }
}

Enforcer

Enforcer is another special sink that forcibly samples the log. A typical use case of enforcer is when you have configured a sampler and want to save some logs forcibly, such as to save error logs even if the sampling mechanism has been configured.

filter {
    // ... parser

    sink {
        sampler {
            // ... sampler configs
        }
        if (parsed.level == "ERROR" || parsed.userId == "TestingUserId") { // sample error logs or testing users' logs (userId == "TestingUserId") even if the sampling strategy is configured
            enforcer {
            }
        }
    }
}
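
The three sink components can be combined in one rule. The sketch below only rearranges constructs shown in the examples above, so treat it as an illustration rather than a tested configuration; the rule name, the sampler ID DefaultSampler, and the rpm value are assumptions.

```yaml
rules:
  - name: combined-sink-demo         # hypothetical rule name
    layer: GENERAL
    dsl: |
      filter {
        json {}
        sink {
          if (parsed.level == "DEBUG") {
            dropper {}               // never persist debugging logs
          } else {
            sampler {
              rateLimit("DefaultSampler") {
                rpm 600              // at most 600 logs per minute
              }
            }
            if (parsed.level == "ERROR") {
              enforcer {}            // always keep error logs
            }
          }
        }
      }
```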

Output Type

By default, each LAL rule produces a Log source object that is persisted to storage. However, some use cases require transforming log data into a different entity type — for example, converting slow SQL logs into DatabaseSlowStatement records or network profiling logs into SampledTraceRecord. The outputType mechanism makes this configurable per rule, without requiring any DSL grammar changes.

Configuration

Set outputType at the rule level in the YAML config. You can use the short name registered by the LALOutputBuilder SPI (recommended), or a fully qualified class name:

rules:
  - name: my-rule
    layer: MYSQL
    outputType: SlowSQL    # short name registered by DatabaseSlowStatementBuilder
    dsl: |
      filter {
        // ...
      }

Resolution order

The output type is resolved per-rule in the following priority:

Per-rule YAML config — the outputType field shown above (highest priority). Short names (no .) are resolved via ServiceLoader<LALOutputBuilder>; fully qualified class names are resolved via Class.forName() as a fallback.
LALSourceTypeProvider SPI — a default output type registered by receiver plugins for a specific layer
Log.class — the fallback if not specified anywhere
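
For the first case in the list above, a rule that names its output type by fully qualified class name might look like the sketch below. The class com.example.lal.AuditRecordBuilder and the output field action are hypothetical; such a class would need to be on the OAP classpath and expose a matching setter such as setAction(String).

```yaml
rules:
  - name: custom-output-demo         # hypothetical rule name
    layer: GENERAL
    # Contains a '.', so it is treated as a class name rather than a short name.
    outputType: com.example.lal.AuditRecordBuilder
    dsl: |
      filter {
        json {}
        extractor {
          // Hypothetical output field, validated against setAction(String) at boot.
          action parsed.action as String
        }
        sink {}
      }
```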

Two output paths

LAL supports two kinds of output types:

Output path	Base type	How it works
Log path	Subclass of `AbstractLog`	The sink populates standard log fields (service, instance, endpoint, tags, body, etc.) from `LogData` and persists via `SourceReceiver`
Builder path	Implements `LALOutputBuilder`	The sink creates the builder, calls `init(LogData, Optional<Object> extraLog, NamingControl)` to pre-populate standard fields, applies output field values via setters, then calls `complete(SourceReceiver)` to validate and dispatch

The builder path is used when the output type implements the LALOutputBuilder interface. This is how SkyWalking’s built-in slow SQL and sampled trace features work.

Built-in output types

Slow SQL (Database Slow Statement)

SkyWalking converts slow SQL logs into DatabaseSlowStatement records for database slow query analysis (MySQL, PostgreSQL, Redis, etc.).

Use outputType: SlowSQL in your rule config. The available output fields are: id, statement, latency. Standard fields (service, layer, timestamp) are handled by the extractor as usual and pre-populated via init() from LogData.

We require a log tag "LOG_KIND" = "SLOW_SQL" to make OAP distinguish slow SQL logs from other log reports.

Note that slow SQL sampling only flags the SQL statement as a candidate. The OAP server computes statistics per service and, by default, persists only the top 50 every 10 minutes (controlled by topNReportPeriod: ${SW_CORE_TOPN_REPORT_PERIOD:10}).

See bundled LAL scripts for complete examples: mysql-slowsql.yaml, pgsql-slowsql.yaml, redis-slowsql.yaml.
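
As a rough sketch of what such a rule can look like, the example below sets the three output fields listed above when the LOG_KIND tag matches. The rule name and the JSON keys (service, time, id, statement, query_time) are assumptions about the log body, and the casts assume the values already have the named types. The bundled scripts remain the authoritative reference.

```yaml
rules:
  - name: slow-sql-demo              # hypothetical rule name
    layer: MYSQL
    outputType: SlowSQL
    dsl: |
      filter {
        json {}
        extractor {
          // Standard fields; the parsed.* keys are hypothetical.
          service parsed.service as String
          timestamp parsed.time as String
          if (tag("LOG_KIND") == "SLOW_SQL") {
            // Output fields specific to the SlowSQL output type.
            id parsed.id as String
            statement parsed.statement as String
            latency parsed.query_time as Long
          }
        }
        sink {}
      }
```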

Sampled Trace (Network Profiling)

SkyWalking converts network profiling sampled trace logs into SampledTraceRecord for process-level network analysis.

Use outputType: SampledTrace in your rule config. The available output fields are: latency, uri, reason, processId, destProcessId, detectPoint, componentId.

We require a log tag "LOG_KIND" = "NET_PROFILING_SAMPLED_TRACE" to make OAP distinguish sampled trace logs from other log reports.

See bundled LAL scripts for complete examples: envoy-als.yaml, k8s-service.yaml, mesh-dp.yaml.

Extending: custom output types

For developers who need to create custom output types (implementing LALOutputBuilder, extending AbstractLog, registering LALSourceTypeProvider SPI, or defining custom input types), see the LAL Extension Developer Guide.
