parsexml

Parses an XML string in a text field and extracts the child elements of the root element as fields.

Command properties

Item                   Description
Command type           Processing query
Required permission    None
License usage          N/A
Parallel execution     Supported
Distributed execution  Not supported

Syntax

parsexml [field=STR] [overlay=BOOL]

Options

field=STR
Name of the field to parse (default: line)
overlay=BOOL
When set to t, preserves the fields of the original record and overlays the fields extracted from the XML onto them. When set to f or omitted, outputs only the fields parsed from the XML. (default: f)

Error codes

Parse errors

N/A

Runtime errors

N/A

Description

The parsexml command parses the XML string in the specified field and extracts the child elements of the root XML element as fields.

  • When a child element contains only text, it is extracted as a string value.
  • When a child element has XML attributes, it is extracted as a map containing the attribute name/value pairs and the element's text content under the _text key.

To separate individual fields from a field extracted as a map, use the parsemap command.

If the target field is null or the XML cannot be parsed, the record is output unchanged.
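
The extraction rules above can be illustrated with a small Python sketch. This is not the command's actual implementation, only an emulation of the documented behavior built on Python's standard xml.etree.ElementTree module. The function name parse_xml_record and the dict-shaped record are assumptions made for illustration. The source does not say how nested child elements are handled, so the sketch simply takes each child's direct text.

```python
import xml.etree.ElementTree as ET


def parse_xml_record(record, field="line", overlay=False):
    """Emulate the documented parsexml behavior on a dict-shaped record.

    Hypothetical helper for illustration only; not the real command.
    """
    raw = record.get(field)
    if raw is None:
        # Null target field: the record is output unchanged.
        return record
    try:
        root = ET.fromstring(raw)
    except ET.ParseError:
        # XML parsing failed: the record is output unchanged.
        return record

    extracted = {}
    for child in root:
        if child.attrib:
            # Child with attributes: a map of the attribute name/value
            # pairs plus the element's text under the _text key.
            value = dict(child.attrib)
            value["_text"] = child.text
            extracted[child.tag] = value
        else:
            # Text-only child: a plain string value.
            extracted[child.tag] = child.text

    if overlay:
        # overlay=t: keep the original fields, then lay the extracted
        # fields on top of them.
        return {**record, **extracted}
    # overlay=f (default): output only the fields parsed from the XML.
    return extracted
```

Under these assumptions, calling parse_xml_record on a record whose line field holds `<doc><id>sample</id><name locale="ko">Logpresso</name></doc>` yields id as the string "sample" and name as a map with the keys locale and _text.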

Examples

  1. Parsing an XML string

    json "{'line': '<doc><id>sample</id><name>Logpresso</name></doc>'}"
    | parsexml
    

    Extracts the child elements id and name of the root element doc as fields. The id field is assigned the string "sample" and the name field is assigned the string "Logpresso".

  2. Parsing XML that contains attributes

    json "{'line': '<doc><id>sample</id><name locale=\"ko\">Logpresso</name></doc>'}"
    | parsexml
    | parsemap field=name overlay=t
    

    Because the name element has a locale attribute, the name field is assigned a map containing two key/value pairs: locale=ko and _text=Logpresso. Use the parsemap command to separate the map into the individual locale and _text fields.

  3. Parsing while preserving the original fields

    json "{'src_ip': '192.0.2.1', 'line': '<doc><id>sample</id></doc>'}"
    | parsexml overlay=t
    

    Preserves the src_ip field from the original record and overlays the id field extracted from XML onto the output.