split
Splits a string field by a delimiter and assigns each token to the specified field names or auto-generated field names.
Command properties
| Item | Description |
|---|---|
| Command type | Transforming |
| Required permission | None |
| License usage | N/A |
| Parallel execution | Supported |
| Distributed execution | Runs on Data Node (mapper) |
Syntax
split sep=STR [field=STR] [overlay={t|f}] [FIELD, ...]
Options
- sep=STR: Delimiter string to use for splitting.
- field=STR: Name of the target field to split (Default: line).
- overlay={t|f}: Whether to retain the original record's fields (Default: f).
  - t: Merges the split results into the original record. Original fields are retained.
  - f: Creates a new record containing only the split results. Original fields are removed.
- FIELD, ...: List of field names to assign to each split token, separated by commas. If omitted, fields are auto-named as column0, column1, etc. If there are more tokens than specified field names, the excess tokens are auto-assigned in column{N} format.
Input fields
| Field | Type | Required | Description |
|---|---|---|---|
| line | string | Optional | Default target field for splitting. Can be changed using the field option. |
Output fields
| Field | Type | Description |
|---|---|---|
| FIELD | string | User-specified field names. Split tokens are assigned in order. |
| column{N} | string | Auto-generated when no field name is specified or when tokens exceed the number of specified fields. |
Error codes
Parse errors
| Error code | Message | Description |
|---|---|---|
| 22600 | Specify a separator string in the sep option. | The sep option is not specified. |
Runtime errors
N/A
Description
The split command splits the string value of the specified field by the delimiter. If the target field value is null, the original record is passed through unchanged. If the target field value is not a string, splitting is skipped and the original record is passed through.
When overlay=f (default), a new record containing only the split results is created, so all other fields from the original record are removed. Use overlay=t to preserve the original fields.
If there are more split tokens than the number of specified field names, the excess tokens are auto-named in column{N} format (starting from 0).
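The rules above can be modeled in a few lines of Python. This is only an illustrative sketch of the documented behavior, not the engine's implementation. The function name split_record and its parameters are hypothetical. It assumes the delimiter is matched literally, that empty tokens are kept, and that the first excess token becomes column0 (one reading of "starting from 0"). Verify these points against an actual query before relying on them.

```python
from typing import Any, Dict, Optional, Sequence


def split_record(
    record: Dict[str, Any],
    sep: str,
    field: str = "line",
    overlay: bool = False,
    names: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical model of the split command's documented behavior."""
    field_names = list(names or [])
    value = record.get(field)

    # Null or non-string target value: the record passes through unchanged.
    if not isinstance(value, str):
        return record

    # ASSUMPTION: the delimiter is a literal string, not a pattern.
    tokens = value.split(sep)

    result: Dict[str, Any] = {}
    for i, token in enumerate(tokens):
        if i < len(field_names):
            # Tokens are assigned to the user-specified names in order.
            result[field_names[i]] = token
        else:
            # ASSUMPTION: excess tokens are numbered from 0 (column0, column1, ...).
            result[f"column{i - len(field_names)}"] = token

    # overlay=t merges into the original fields; overlay=f keeps only the results.
    return {**record, **result} if overlay else result


print(split_record({"line": "a,b,c,d"}, sep=","))
# {'column0': 'a', 'column1': 'b', 'column2': 'c', 'column3': 'd'}
```

With overlay=True and field="msg", the same function returns the original msg and level fields alongside the new ones, mirroring the overlay=t description.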
Examples
- Split a tab-delimited string into fields

      json "{'line': '192.0.2.1\t80\tGET'}" | split sep="\t" src_ip, port, method

  Splits the line field by the tab character and assigns the parts to the src_ip, port, and method fields.

- Split while retaining the original fields

      json "{'msg': 'error:timeout:5000', 'level': 'WARN'}" | split sep=":" field=msg overlay=t type, reason, code

  Splits the msg field by colon, using overlay=t to retain the original msg and level fields.

- Auto-naming when field names are omitted

      json "{'line': 'a,b,c,d'}" | split sep=","

  When no field names are specified, tokens are auto-assigned to column0, column1, column2, and column3.
Compatibility
The split command has been available since before Sonar 4.0.