hbos

Calculates the anomaly score of each record using the Histogram-Based Outlier Score (HBOS) algorithm. For each field, a histogram is built, and the score reflects how rare the value is within its distribution.

Command properties

Property               Value
Command type           Transforming
Required permission    None
License usage          N/A
Parallel execution     Not supported
Distributed execution  Runs on Control Node (reducer)

Syntax

To build a histogram model:

hbos op=build [k=INT] [alpha=FLOAT] [tolerance=FLOAT] FIELD, ... [by CLAUSE, ...] [ SUBQUERY ]

To calculate outlier scores using a histogram model:

hbos op=query [k=INT] [alpha=FLOAT] [tolerance=FLOAT] FIELD, ... [by CLAUSE, ...] [ SUBQUERY ]

Options

op={build|query}
Operation mode
  • build: Builds a histogram model from input records and outputs the serialized model data.
  • query: Builds a histogram model from the input records, or loads one through a subquery, then calculates an outlier score for each record.
k=INT
Number of buckets (bins) in the histogram. Must be a positive integer. (Default: 10)
alpha=FLOAT
Smoothing parameter used to calculate dynamic bucket boundaries. Must be a real number between 0 and 1 (exclusive). (Default: 0.1)
tolerance=FLOAT
Tolerance for bucket boundary determination. Must be a real number between 0 and 1 (exclusive). (Default: 0.1)
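
The range constraints above can be checked before a query is submitted. The following Python sketch is a hypothetical helper, not part of the product: the name validate_hbos_options and the order of the checks are assumptions, and the returned numbers simply mirror the codes listed under Parsing errors.

```python
def validate_hbos_options(op, k=10, alpha=0.1, tolerance=0.1):
    """Return the matching parsing error code, or None if all options are valid."""
    if op not in ("build", "query"):
        return 40704  # op is neither build nor query
    if not isinstance(k, int) or k <= 0:
        return 40701  # k must be a positive integer
    if not 0 < alpha < 1:
        return 40702  # alpha must lie strictly between 0 and 1
    if not 0 < tolerance < 1:
        return 40703  # tolerance must lie strictly between 0 and 1
    return None
```

For example, validate_hbos_options("build", k=0) returns 40701, while the defaults pass every check and return None.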

Target

FIELD, ...
List of fields to use for outlier score calculation. Separate multiple fields with commas (,). Field values must be numeric.
[by CLAUSE, ...]
Grouping fields. When specified, a separate histogram model is built for each group.
[ SUBQUERY ]
A subquery (enclosed in square brackets [ ]) that retrieves a previously built HBOS model for use with op=query. When specified, outlier scores are calculated in real time.

Output fields

When op=build:

Field        Type    Description
_hbos_bykey  string  Grouping key value
_hbos_model  object  Serialized histogram model data

When op=query:

Field        Type    Description
_hbos_score  double  Outlier score. A higher value indicates a greater likelihood of anomaly.

Error codes

Parsing errors

Error code  Message                                                               Description
40700       The target field of the hbos command is missing.                      No target field specified for analysis
40701       k of the hbos command must be a natural number.                       k value is 0 or negative
40702       alpha of the hbos command must be a real number between 0 and 1.      alpha value is out of range
40703       tolerance of the hbos command must be a real number between 0 and 1.  tolerance value is out of range
40704       op of the hbos command must be build or query.                        op option is neither build nor query
40705       The group field of the hbos command is missing.                       No field specified in the by clause
40804       A machine learning license is required.                               Machine learning license is not available
90204       [ is unmatched.                                                       Unmatched bracket in subquery

Runtime errors

N/A

Description

The hbos command detects multivariate outliers using the HBOS (Histogram-Based Outlier Score) algorithm. It independently builds a histogram for each field, calculates how rare each value is within its distribution, and sums the scores across all fields.
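
The per-field scoring described above can be sketched in Python. This is a minimal illustration of the textbook HBOS formulation with equal-width buckets, not the command's actual implementation: how hbos uses alpha and tolerance to place bucket boundaries is not specified here, so those options are left out. The function names and the floor value used for empty buckets are assumptions.

```python
import math

def build_histogram(values, k=10):
    """Build an equal-width histogram with k buckets over one field."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # fall back to width 1 when all values are equal
    counts = [0] * k
    for v in values:
        # the maximum value sits on the upper edge, so clamp it into the last bucket
        counts[min(int((v - lo) / width), k - 1)] += 1
    top = max(counts)
    # normalize so the most populated bucket has height 1
    return {"lo": lo, "hi": hi, "width": width, "heights": [c / top for c in counts]}

def field_score(hist, v, floor=1e-9):
    """Score one value as log(1 / bucket height); rare values score higher."""
    if hist["lo"] <= v <= hist["hi"]:
        idx = min(int((v - hist["lo"]) / hist["width"]), len(hist["heights"]) - 1)
        height = hist["heights"][idx]
    else:
        height = 0.0  # value lies outside the range seen when building
    return math.log(1.0 / max(height, floor))

def hbos_scores(records, fields, k=10):
    """Build one histogram per field, then sum the per-field scores for each record."""
    hists = {f: build_histogram([r[f] for r in records], k) for f in fields}
    return [sum(field_score(hists[f], r[f]) for f in fields) for r in records]

# nine typical records and one record that is unusual in both fields
records = [{"bytes": 100, "pkts": 10}] * 9 + [{"bytes": 10000, "pkts": 500}]
scores = hbos_scores(records, ["bytes", "pkts"])
```

In this sample the nine typical records score 0 and the last record scores about 4.39 (2 × ln 9), because in each field its bucket holds one ninth as many records as the fullest bucket.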

When run with op=build, it builds a histogram model from input records, serializes the model data, and outputs it as _hbos_bykey and _hbos_model fields. You can store the output in a table using the import command and then use it as a subquery for op=query.

When run with op=query, it builds a histogram model from the input records, or loads one from the subquery when a subquery is given, and assigns an outlier score to the _hbos_score field of each input record.
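
The split between building a model once and scoring against it later can be illustrated with the Python sketch below. It is a conceptual analogy only: the real format of _hbos_model is not documented here, so JSON stands in for the serialized model, and build_model, query_model, and the floor value are hypothetical names and choices.

```python
import json
import math

def build_model(records, fields, k=10):
    """Analogous to op=build: summarize each field as bucket counts and serialize."""
    model = {}
    for f in fields:
        values = [float(r[f]) for r in records]
        lo, hi = min(values), max(values)
        width = (hi - lo) / k or 1.0  # fall back to width 1 when all values are equal
        counts = [0] * k
        for v in values:
            counts[min(int((v - lo) / width), k - 1)] += 1
        model[f] = {"lo": lo, "hi": hi, "width": width, "counts": counts}
    return json.dumps(model)

def query_model(serialized, record, floor=1e-9):
    """Analogous to op=query with a stored model: score one record without rebuilding."""
    model = json.loads(serialized)
    score = 0.0
    for f, h in model.items():
        v = float(record[f])
        if h["lo"] <= v <= h["hi"]:
            idx = min(int((v - h["lo"]) / h["width"]), len(h["counts"]) - 1)
            height = h["counts"][idx] / max(h["counts"])
        else:
            height = 0.0  # value was never seen while building
        score += math.log(1.0 / max(height, floor))
    return score

training = [{"bytes": b, "pkts": p}
            for b, p in [(100, 10), (120, 12), (110, 11), (130, 12), (900, 80)]]
model = build_model(training, ["bytes", "pkts"])  # stored once
typical = query_model(model, {"bytes": 105, "pkts": 10})
unseen = query_model(model, {"bytes": 50000, "pkts": 4000})
```

A record that falls in the fullest bucket of every field scores 0, while a record outside the range seen during building receives the largest score the floor allows. Scoring against a stored model also means new records do not shift the histogram they are judged against.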

When a by clause is specified, a separate histogram model is built for each group.
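
Per-group modeling amounts to partitioning the records by the grouping key before any histogram is built. The Python sketch below shows that partitioning step under the same equal-width assumption; bucket_counts and build_models_by_group are hypothetical names, and the sample records are made up for illustration.

```python
from collections import defaultdict

def bucket_counts(values, k=10):
    """Count values per equal-width bucket for one field."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # fall back to width 1 when all values are equal
    counts = [0] * k
    for v in values:
        counts[min(int((v - lo) / width), k - 1)] += 1
    return counts

def build_models_by_group(records, fields, by, k=10):
    """Partition records by the grouping fields and build one model per partition."""
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[f] for f in by)].append(r)
    return {
        key: {f: bucket_counts([r[f] for r in rows], k) for f in fields}
        for key, rows in groups.items()
    }

records = [
    {"src_ip": "10.0.0.1", "bytes": 100, "pkts": 10},
    {"src_ip": "10.0.0.1", "bytes": 150, "pkts": 12},
    {"src_ip": "10.0.0.2", "bytes": 90000, "pkts": 700},
]
models = build_models_by_group(records, ["bytes", "pkts"], by=["src_ip"])
```

Because each group has its own histograms, a byte count that is typical for 10.0.0.2 is judged only against other records from 10.0.0.2 rather than against the whole data set.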

Examples

  1. Calculate outlier scores in real time

    table duration=1d network_logs
    | eval bytes = long(bytes), pkts = long(pkts)
    | hbos op=query bytes, pkts
    | sort +_hbos_score
    

    Calculates an HBOS outlier score for each record based on the bytes and pkts fields, then sorts by score.

  2. Build and save a model

    table duration=7d network_logs
    | eval bytes = long(bytes), pkts = long(pkts)
    | hbos op=build bytes, pkts
    | import hbos_model
    

    Builds an HBOS model from 7 days of data and saves it to the hbos_model table.

  3. Outlier detection using a pre-built model

    table duration=1d network_logs
    | eval bytes = long(bytes), pkts = long(pkts)
    | hbos op=query bytes, pkts [ table hbos_model ]
    | search _hbos_score > 10
    

    Calculates outlier scores using the pre-built model in the hbos_model table, and retrieves records with a score greater than 10.

  4. Per-group outlier detection

    table duration=1d network_logs
    | eval bytes = long(bytes), pkts = long(pkts)
    | hbos op=query bytes, pkts by src_ip
    

    Builds a separate histogram model for each src_ip and calculates outlier scores.