split

Splits a string field by a delimiter and assigns each token to the specified field names or auto-generated field names.

Command properties

Item                   Description
Command type           Transforming
Required permission    None
License usage          N/A
Parallel execution     Supported
Distributed execution  Runs on Data Node (mapper)

Syntax

split sep=STR [field=STR] [overlay={t|f}] [FIELD, ...]

Options

sep=STR
Delimiter string to use for splitting
field=STR
Name of the target field to split (Default: line)
overlay={t|f}
Whether to retain the original record's fields (Default: f)
  • t: Merges the split results into the original record. Original fields are retained.
  • f: Creates a new record containing only the split results. Original fields are removed.
FIELD, ...
Comma-separated list of field names to assign to the split tokens, in order. If omitted, fields are auto-named column0, column1, and so on. If there are more tokens than specified field names, the excess tokens are auto-named in column{N} format.

Input fields

Field  Type    Required  Description
line   string  Optional  Default target field for splitting. Can be changed using the field option.

Output fields

Field      Type    Description
FIELD      string  User-specified field names. Split tokens are assigned in order.
column{N}  string  Auto-generated when no field name is specified or when tokens exceed the number of specified fields.

Error codes

Parse errors

Error code  Message                                        Description
22600       Specify a separator string in the sep option.  The sep option is not specified.

Runtime errors

N/A

Description

The split command splits the string value of the specified field by the delimiter. If the target field value is null or is not a string, splitting is skipped and the original record is passed through unchanged.

When overlay=f (default), a new record containing only the split results is created, so all other fields from the original record are removed. Use overlay=t to preserve the original fields.

If there are more split tokens than the number of specified field names, the excess tokens are auto-named in column{N} format (starting from 0).
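The rules above can be modeled in a few lines of Python. This is only an illustrative sketch of the documented behavior, not the command's actual implementation. The function name split_record and its parameters are hypothetical and simply mirror the sep, field, overlay, and FIELD options. It makes four assumptions: the separator is matched literally, empty tokens are kept, excess tokens are numbered from column0 (one reading of "starting from 0"), and field names without a matching token are left unset.

```python
from typing import Any, Dict, Sequence


def split_record(
    record: Dict[str, Any],
    sep: str,
    field: str = "line",
    overlay: bool = False,
    names: Sequence[str] = (),
) -> Dict[str, Any]:
    """Hypothetical model of the split command applied to one record."""
    if not sep:
        # Mirrors parse error 22600: the sep option is mandatory.
        raise ValueError("Specify a separator string in the sep option.")

    value = record.get(field)
    # Null or non-string target values: the record passes through unchanged.
    if not isinstance(value, str):
        return record

    # ASSUMPTION: literal (non-regex) separator, empty tokens retained.
    tokens = value.split(sep)

    # overlay=t keeps the original fields; overlay=f starts from an empty record.
    out: Dict[str, Any] = dict(record) if overlay else {}

    # Assign tokens to the user-specified names in order.
    # ASSUMPTION: names without a matching token are simply not set.
    for name, token in zip(names, tokens):
        out[name] = token

    # ASSUMPTION: excess tokens are numbered from column0; the real command
    # may number them differently (for example, by token position).
    for n, token in enumerate(tokens[len(names):]):
        out[f"column{n}"] = token
    return out


print(split_record({"line": "192.0.2.1\t80\tGET"}, "\t",
                   names=["src_ip", "port", "method"]))
# → {'src_ip': '192.0.2.1', 'port': '80', 'method': 'GET'}
```

With overlay=False (the default) the result holds only the split fields, which is why line is absent from the printed record. Passing overlay=True would carry it over alongside the new fields.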

Examples

  1. Split a tab-delimited string into fields

    json "{'line': '192.0.2.1\t80\tGET'}" | split sep="\t" src_ip, port, method
    

    Splits the line field on the tab character and assigns the tokens to the src_ip, port, and method fields.

  2. Split while retaining the original fields

    json "{'msg': 'error:timeout:5000', 'level': 'WARN'}"
    | split sep=":" field=msg overlay=t type, reason, code
    

    Splits the msg field on the colon character. Because overlay=t is set, the original msg and level fields are retained alongside the new type, reason, and code fields.

  3. Auto-naming when field names are omitted

    json "{'line': 'a,b,c,d'}" | split sep=","
    

    When no field names are specified, tokens are auto-assigned to column0, column1, column2, and column3.

Compatibility

The split command has been available since before Sonar 4.0.

See also

  • parse — Extract structured fields using a predefined parser or text anchor
  • parsekv — Parse key=value format strings
  • rex — Extract fields using regular expressions
  • split() — Function that splits a string by a delimiter and returns an array