dc()
Calculates the distinct count of field values in a group.
dcount is an alias for dc.
Syntax
dc(EXPR)
Parameters
EXPR: An expression that returns the field whose distinct values are to be counted.
Description
The dc() function adds each value returned by EXPR to an internal set (HashSet) as it processes the records in a group. Null values are not added to the set. When aggregation is complete, it returns the size of the set, that is, the number of distinct values after deduplication, as a 64-bit integer (long).
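The behavior described above can be modeled in a few lines. The following Python sketch is not Logpresso's implementation; the function name `distinct_count` and the use of `None` to stand in for null are assumptions made purely for illustration.

```python
from typing import Hashable, Iterable, Optional


def distinct_count(values: Iterable[Optional[Hashable]]) -> int:
    """Model of dc(): count the distinct non-null values in one group."""
    seen = set()  # plays the role of the internal set
    for value in values:
        if value is None:  # null values are never added
            continue
        seen.add(value)  # duplicates collapse automatically
    return len(seen)  # size of the set after deduplication


# Same values as the null-handling usage example: 10, null, 10, 30
result = distinct_count([10, None, 10, 30])
print(result)  # prints 2
```

Because every distinct value is kept in the set until the group is finished, memory use in this model grows with the number of distinct values rather than the number of records.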
The dc() function calculates the exact distinct count. If memory usage is a concern or an approximation is sufficient, use the estdc() function instead.
The dc() function does not support distributed aggregation. When you need an exact distinct count in a distributed environment, structure your query so that all data is processed on a single node.
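One way to see why an exact distinct count is awkward to distribute: per-node counts cannot simply be added together, because the same value may appear on more than one node. The Python sketch below uses hypothetical node data and illustrates the general problem only; it does not describe how Logpresso schedules or merges aggregations internally.

```python
# Hypothetical source IPs seen on two nodes; 10.0.0.2 appears on both.
node_a = ["10.0.0.1", "10.0.0.2", "10.0.0.2"]
node_b = ["10.0.0.2", "10.0.0.3"]

# Adding per-node distinct counts double-counts the shared value.
sum_of_partials = len(set(node_a)) + len(set(node_b))

# The exact answer requires all values to be deduplicated in one place.
exact = len(set(node_a) | set(node_b))

print(sum_of_partials, exact)  # prints 4 3
```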
This function can only be used in aggregation commands such as stats and timechart.
Error codes
| Code | Description |
|---|---|
| 91010 | The number of arguments is wrong. |
Usage examples
To prepare the WEB_APACHE_SAMPLE table used in these examples, refer to Preparing sample data.
- Count the number of unique source IP addresses
  table WEB_APACHE_SAMPLE | stats dc(src_ip)
- Count the number of unique URIs per HTTP method
  table WEB_APACHE_SAMPLE | stats dc(uri) by method
- Null value handling
  json "[{'val': 10}, {'val': null}, {'val': 10}, {'val': 30}]" | stats dc(val) | # dc(val): 2
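The grouped form (dc(uri) by method) can be pictured as one set per group key. The Python sketch below is a conceptual model only; the rows are hypothetical and are not taken from WEB_APACHE_SAMPLE, and `None` stands in for null.

```python
from collections import defaultdict

# Hypothetical (method, uri) pairs, not rows from WEB_APACHE_SAMPLE.
rows = [
    ("GET", "/index.html"),
    ("GET", "/index.html"),
    ("GET", "/about.html"),
    ("POST", "/login"),
    ("POST", None),  # a null uri is skipped
]

sets_by_method = defaultdict(set)  # one set per group key
for method, uri in rows:
    if uri is not None:
        sets_by_method[method].add(uri)

# Distinct count per group is the size of that group's set.
counts = {method: len(uris) for method, uris in sets_by_method.items()}
print(counts)  # prints {'GET': 2, 'POST': 1}
```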
Compatibility
The dc() function has been available since before Logpresso Sonar 4.0.