rex

Extracts values from a text field using named capture groups in a regular expression. JDK and RE2/J regular expression engines are supported.

Command properties

| Item | Description |
| --- | --- |
| Command type | Processing query |
| Required permission | None |
| License usage | N/A |
| Parallel execution | Supported |
| Distributed execution | Not supported |

Syntax

rex field=FIELD [engine={jdk|re2j|jdk-re2j|jdk-re2j-lax}] [debug={t|f}] "REGEX"

Options

field=FIELD (required)
Name of the target field to which the regular expression is applied.
engine={jdk|re2j|jdk-re2j|jdk-re2j-lax}
Regular expression engine setting (default: jdk-re2j-lax)
  • jdk: The standard regular expression engine built into Java.
  • re2j: Google's RE2/J regular expression engine.
  • jdk-re2j: Uses the Java engine (jdk) but automatically switches to RE2/J if execution time becomes excessive. The regular expression must be valid in both the Java engine and RE2/J.
  • jdk-re2j-lax: Uses the Java engine (jdk) but automatically switches to RE2/J if execution time becomes excessive. The regular expression must be valid in the Java engine. The switch occurs only if the regular expression is also valid in RE2/J.
debug={t|f}
Debug setting. When set to t, outputs the engine type used for matching in the _engine field. (default: f)

Target

"REGEX" (required)
Regular expression. Specify the fields to extract using named capture groups in the format (?<field_name>pattern). The capture group name becomes the output field name.

Output fields

The input field specified by the field option and fields extracted from the named capture groups in the regular expression are output. If debug=t is set, the _engine field is also added.

| Field | Type | Description |
| --- | --- | --- |
| (extracted) | string | Value matched by the named capture group ((?<name>...)) |
| _engine | string | When debug=t, the name of the regular expression engine used (JDK or RE2/J) |

Error codes

Parse errors

| Error code | Message | Description |
| --- | --- | --- |
| 20900 | The value of the field option is missing. | The field option is not specified |
| 20901 | There is an error in the regular expression you entered. Check the format again. | The regular expression syntax is incorrect |
| 20902 | There is an error in the regular expression engine setting. Check the value again. | The engine option value is incorrect |
| 20903 | There is an error in the regular expression you entered (JDK). Check the format again: [message] | JDK regular expression compilation failed |
| 20904 | There is an error in the regular expression you entered (RE2/J). Check the format again: [message] | RE2/J regular expression compilation failed |

Runtime errors

| Error code | Message | Description | Post-action |
| --- | --- | --- | --- |
| 20905 | The regular expression engine execution step limit has been reached: count=[count] limit=[limit] | The regular expression execution step limit has been reached | Query cancelled |

Description

The rex command extracts values from a text field using named capture groups ((?<name>...)) in a regular expression. The capture group name becomes the output field name.
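The mapping from named capture groups to output fields can be reproduced with java.util.regex, which is presumably what the jdk engine builds on. The following is a minimal sketch, not the command's implementation: the class RexSketch, the method extract, and the sample line are hypothetical, and the group names are passed in explicitly to keep the code short.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RexSketch {
    // Returns one entry per named group that took part in the first match,
    // or an empty map when the pattern does not match anywhere in the input.
    static Map<String, String> extract(String regex, String input, String... groupNames) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) { // unanchored search, like the patterns in the examples
            for (String name : groupNames) {
                String value = m.group(name);
                if (value != null) {
                    fields.put(name, value); // group name becomes the field name
                }
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "GET /game/flash/tetris.swf HTTP/1.1"; // made-up sample line
        System.out.println(
            extract("(GET|POST) /game/flash/(?<filename>[^ ]*)", line, "filename"));
        // prints {filename=tetris.swf}
    }
}
```

The unnamed group (GET|POST) takes part in the match but produces no field. In java.util.regex itself, group names may contain only ASCII letters and digits and must start with a letter; whether rex applies the same restriction to every engine setting is not stated here.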

The default engine (jdk-re2j-lax) uses the JDK regular expression engine but automatically switches to the RE2/J engine when execution time is excessive for certain patterns. If switching to RE2/J is not possible and the step limit is reached, the query is cancelled.

The JDK regular expression engine counts the number of execution steps while processing an input string. The default step limit is 300,000,000 steps. You can change this value using the system property araqne.logdb.regex.jdk_regex_step_limit.
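One common way to bound the work done by java.util.regex is to wrap the input in a CharSequence that counts character reads and aborts when a budget is exhausted. The sketch below illustrates that idea only. How rex actually counts steps is not described here, so the read counter is a stand-in for "steps", and StepLimitSketch, CountingInput, StepLimitExceeded, and findWithLimit are hypothetical names. It also assumes the limit is an ordinary JVM system property (typically set with a -D option at startup).

```java
import java.util.regex.Pattern;

class StepLimitSketch {
    // Thrown when the read budget is exhausted (compare runtime error 20905).
    static class StepLimitExceeded extends RuntimeException {
        StepLimitExceeded(long count, long limit) {
            super("count=" + count + " limit=" + limit);
        }
    }

    // CharSequence view that counts every charAt call made by the matcher.
    static final class CountingInput implements CharSequence {
        private final CharSequence text;
        private final long limit;
        private final long[] counter; // shared with subSequence views

        CountingInput(CharSequence text, long limit, long[] counter) {
            this.text = text;
            this.limit = limit;
            this.counter = counter;
        }

        @Override public char charAt(int index) {
            if (++counter[0] > limit) {
                throw new StepLimitExceeded(counter[0], limit);
            }
            return text.charAt(index);
        }

        @Override public int length() { return text.length(); }

        @Override public CharSequence subSequence(int start, int end) {
            return new CountingInput(text.subSequence(start, end), limit, counter);
        }

        @Override public String toString() { return text.toString(); }
    }

    // Searches the input, aborting once the matcher has read more than limit characters.
    static boolean findWithLimit(Pattern pattern, String input, long limit) {
        return pattern.matcher(new CountingInput(input, limit, new long[1])).find();
    }

    public static void main(String[] args) {
        // Assumption: the limit is read as a JVM system property with the documented default.
        long configured = Long.getLong("araqne.logdb.regex.jdk_regex_step_limit", 300_000_000L);
        System.out.println("configured limit: " + configured);

        // Nested quantifiers can backtrack heavily; a small demo budget keeps the run short.
        Pattern risky = Pattern.compile("(a+)+$");
        String input = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!";
        try {
            System.out.println("matched: " + findWithLimit(risky, input, 100_000L));
        } catch (StepLimitExceeded e) {
            System.out.println("cancelled: " + e.getMessage());
        }
    }
}
```

Whether the demo pattern exhausts the budget depends on the JDK version, since newer releases mitigate some nested-quantifier cases. With the default engine setting, reaching the limit is the point at which rex tries RE2/J before cancelling the query.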

Examples

  1. Extracting a filename from an HTTP request

    table duration=1h WEB_LOGS
    | rex field=line "(GET|POST) /game/flash/(?<filename>[^ ]*)"
    

    Extracts the file name that follows /game/flash/ in GET or POST requests from the line field into the filename field.

  2. Extracting a timestamp

    table duration=1h APP_LOGS
    | rex field=line "(?<timestamp>\d+-\d+-\d+ \d+:\d+:\d+)"
    

    Extracts a timestamp in yyyy-MM-dd HH:mm:ss format from the line field.

  3. Extracting a URL and query string

    table duration=1h WEB_LOGS
    | rex field=line "(GET|POST) (?<url>[^ ]*) (?<querystring>[^ ]*) "
    

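    Extracts the URL and query string from the line field into the url and querystring fields, respectively. The pattern expects the two values to be separated by a space, and the query string to be followed by a space.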

  4. Checking the engine used in debug mode

    table duration=1h WEB_LOGS
    | rex field=line debug=t "(?<method>GET|POST) (?<path>/[^ ]*)"
    

    Outputs JDK or RE2/J in the _engine field, showing which regular expression engine was actually used.
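The patterns from examples 2 and 3 can be tried outside the query engine before they are run against real logs. The following Java sketch uses java.util.regex to show two edge cases: the timestamp pattern accepts digit runs of any length, and the URL pattern relies on a space between the URL and the query string. The class ExamplePatternsSketch, the helper first, the stricter timestamp variant, and all sample lines are hypothetical. Backslashes are doubled only because the patterns appear inside Java string literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExamplePatternsSketch {
    // Example 2 as written: digit runs of any length.
    static final Pattern LOOSE_TS =
        Pattern.compile("(?<timestamp>\\d+-\\d+-\\d+ \\d+:\\d+:\\d+)");
    // Hypothetical stricter variant with fixed digit counts.
    static final Pattern STRICT_TS =
        Pattern.compile("(?<timestamp>\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2})");
    // Example 3 as written.
    static final Pattern URL_QS =
        Pattern.compile("(GET|POST) (?<url>[^ ]*) (?<querystring>[^ ]*) ");

    // Returns the named group of the first match, or null when nothing matches.
    static String first(Pattern pattern, String group, String line) {
        Matcher m = pattern.matcher(line);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        String odd = "build 1-2-3 4:5:6 finished";
        System.out.println(first(LOOSE_TS, "timestamp", odd));  // 1-2-3 4:5:6
        System.out.println(first(STRICT_TS, "timestamp", odd)); // null

        String spaceSeparated = "GET /search.aspx q=rex&page=2 80";
        String questionMark = "GET /search?q=rex HTTP/1.1 200";
        System.out.println(first(URL_QS, "querystring", spaceSeparated)); // q=rex&page=2
        System.out.println(first(URL_QS, "url", questionMark));           // /search?q=rex
        System.out.println(first(URL_QS, "querystring", questionMark));   // HTTP/1.1
    }
}
```

If the logs separate the query string with a question mark, as in the last sample line, the querystring group captures the next space-delimited token instead, so the pattern would need to be adapted to the actual log layout.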