cube
Aggregates input records for all possible combinations of the group fields specified in the by clause. This produces multi-dimensional subtotals equivalent to SQL's GROUP BY CUBE.
Command properties
| Item | Description |
|---|---|
| Command type | Processing query |
| Required permission | None |
| License usage | N/A |
| Parallel execution | Supported |
| Distributed execution | Runs on Control Node (reducer) |
Syntax
Options
parallel=BOOL- Whether to use parallel aggregation. Specify
torf. label=STR- Label value to display in group fields for subtotal and grand total rows. When omitted, these fields are shown as
null.
Target
AGG_FUNC [as ALIAS], ...- Aggregate functions with optional aliases. Separate multiple functions with commas (
,). Use theaskeyword to specify the output field name. Note that order-based aggregate functions such aspercentileandmedian, as well as thedc(distinct count) function, are not supported. by FIELD, ...- Group fields. Separate multiple fields with commas (
,). Thebyclause is required for thecubecommand.
Output fields
Outputs the group fields followed by the aggregate function result fields. The fields specified in the by clause appear first, followed by the aggregate results. Output is sorted by group field values. In subtotal and grand total rows, group fields that are not part of the current aggregation grouping are displayed as the value specified in the label option or as null.
Error codes
Parse errors
| Error code | Message | Description |
|---|---|---|
| 21700 | Enter an aggregate function. | No aggregate function is specified. |
| 21701 | Invalid query. | The by clause syntax is incorrect. |
| 21702 | {function} is an unsupported function. | A non-existent aggregate function is used. |
| 21703 | Cannot create {function} function. {msg} | Failed to create the aggregate function. |
| 21704 | The grouping set command requires a BY clause. | No by clause is specified. |
| 21705 | {function} is not supported in rollup/cube. | An order-based aggregate function or the dc function is used. |
Runtime errors
N/A
Description
The cube command groups input records by the fields specified in the by clause and applies aggregate functions to all possible field combinations. Unlike the rollup command, which only generates hierarchical subtotals, cube generates subtotals for all dimensional combinations.
For example, with by a, b, the command outputs aggregations at the following four levels:
- Aggregation per
a,bcombination (detail) - Aggregation per
a(subtotal) - Aggregation per
b(subtotal) - Overall aggregation (grand total)
While rollup only outputs the subtotal by a and the grand total, cube additionally outputs the subtotal by b. With N group fields, cube generates 2^N combination aggregations.
In subtotal and grand total rows, group fields that are not part of the current aggregation grouping are filled with the string specified in the label option or null. Output is sorted by group field values.
In a distributed environment, the final aggregation runs on the Control Node.
The cube command accumulates aggregation results in an in-memory buffer while processing input records. The default buffer size is 100,000 records. When the buffer exceeds this threshold, intermediate results are flushed to disk. You can change this value using the system property araqne.logdb.aggregate_flush_size. In addition, with N group fields, cube generates 2^N combination aggregations, so memory usage and processing volume increase exponentially as the number of group fields grows.
Examples
-
Aggregating all combinations of two fields
json "[{'action':'login','status':'success','size':100},{'action':'login','status':'fail','size':50},{'action':'logout','status':'success','size':80},{'action':'logout','status':'success','size':120}]" | cube count, sum(size) as total_size by action, statusOutputs aggregations for all combinations of
actionandstatus, subtotals byaction, subtotals bystatus, and an overall grand total. -
Multi-dimensional aggregation with a label
table duration=1h web_logs | eval date=string(date, "yyyy-MM-dd") | cube label="TOTAL" count by date, statusAggregates counts for all combinations of date and status code. In subtotal and grand total rows, group fields that are not part of the current grouping display
TOTAL. -
Using multiple aggregate functions
json "[{'dept':'sales','region':'east','amount':100},{'dept':'sales','region':'west','amount':200},{'dept':'dev','region':'east','amount':150},{'dept':'dev','region':'west','amount':300}]" | cube label="ALL" count, sum(amount) as total, avg(amount) as avg_amount by dept, regionAggregates count, sum, and average for all combinations of
deptandregion. Subtotal and grand total group fields displayALL.