stats

Aggregates input records by group. If no by clause is specified, all input is treated as a single group.

Command properties

ItemDescription
Command typeTransforming
Required permissionNone
License usageN/A
Parallel executionSupported
Distributed executionRuns on Control Node (reducer)

Syntax

stats [parallel=BOOL] AGG_FUNC [as ALIAS], ... [by FIELD, ...]

Options

parallel=BOOL
Whether to use parallel aggregation. Specify t or f.

Target

AGG_FUNC [as ALIAS], ...
Aggregation functions with optional aliases. Specify multiple aggregation functions separated by commas (,). Use the as keyword to specify the output field name.
[by FIELD, ...]
Grouping fields. Specify multiple fields separated by commas (,).

Output fields

Outputs the group fields and the aggregation function result fields. Fields specified in the by clause are output first, followed by the aggregation function results. Output is sorted by group field values.

Error codes

Parse errors
Error codeMessageDescription
21700Enter an aggregate function.No aggregation function is specified.
21701Invalid query.The by clause has an invalid syntax.
21702{function} is an unsupported function.An unsupported aggregation function is used.
21703Cannot create {function} function. {msg}Failed to create the aggregation function.
Runtime errors

N/A

Description

The stats command groups input records by the fields specified in the by clause and outputs the result of applying the aggregation functions to each group. If the by clause is omitted, all input is treated as a single group. Output is sorted by group field values.

In a distributed environment, the final aggregation is performed on the Control Node.

When the in-memory buffer exceeds the default threshold of 100,000 records during aggregation, intermediate results are flushed to disk and the final result is produced via parallel merge sort when the query completes. Memory usage and disk I/O increase as the number of groups in the by clause or the number of input records grows.

Examples

  1. Aggregate total record count

    table duration=1h web_logs | stats count
    

    Aggregates the total number of web log records from the past hour.

  2. Aggregate by group

    table duration=1h web_logs | stats count by method
    

    Counts the number of requests by HTTP method.

  3. Use multiple aggregation functions

    table duration=1h web_logs | stats count, sum(bytes) as total_bytes, avg(bytes) as avg_bytes by method
    

    Aggregates request count, total bytes transferred, and average bytes per request by HTTP method.

  4. Use multiple grouping fields

    table duration=1d web_logs | stats count by method, status
    

    Counts requests by the combination of HTTP method and status code.

  5. Aggregate inline data

    json "[{'dept': 'sales', 'amount': 100}, {'dept': 'sales', 'amount': 200}, {'dept': 'dev', 'amount': 150}]"
    | stats sum(amount) as total, count by dept
    

    Aggregates total amount and count by department.