join
Merges subquery results based on join keys.
Command properties
| Property | Value |
|---|---|
| Command type | Transforming |
| Required permission | None |
| License usage | N/A |
| Parallel execution | Supported |
| Distributed execution | Runs on Control Node (reducer) |
Syntax
Options
type={inner|left|leftonly|right|rightonly|cross|full}- Join type. (Default:
inner)
inner: Outputs only records where both sides have a matching join key.left: Outputs all left (input) records and merges right (subquery) fields where the join key matches.leftonly: Outputs only left records that have no matching key on the right.right: Outputs all right (subquery) records and merges left (input) fields where the join key matches.rightonly: Outputs only right records that have no matching key on the left.cross: Outputs the Cartesian product of both sides. No join key is specified.full: Outputs all records from both sides and merges fields where the join key matches.
Target
FIELD, ...- Join key fields. Separate multiple fields with commas (
,). Not specified forcrossjoins. [ SUBQUERY ]- A subquery enclosed in square brackets (
[]). The subquery result is used as the right-side dataset.
Error codes
Parsing errors
| Error code | Message | Description |
|---|---|---|
| 20500 | 지원되지 않는 조인 방식입니다. inner, left, leftonly, right, rightonly, cross, full 중 하나를 사용하세요. | An unsupported join type was specified |
| 20501 | 조인 키를 지정하세요. | No join key field was specified |
Runtime errors
N/A
Description
The join command merges input records (left) with subquery results (right) based on join keys. The subquery runs first. If the subquery result contains 100,000 records or fewer, a hash join is used; otherwise, a sort-merge join is used.
Adding a + or - prefix to a join key field forces a sort-merge join and specifies the sort order for that field.
The join command cannot be used in stream queries; use the streamjoin command instead.
In a distributed environment, the join is performed on Control Nodes.
The default hash join threshold is 100,000 records. You can change this value using the system property araqne.hashjointhreshold.
Examples
-
Inner join
json "[{'src_ip': '192.0.2.1', 'bytes': 1024}, {'src_ip': '192.0.2.2', 'bytes': 2048}]" | join src_ip [ json "[{'src_ip': '192.0.2.1', 'hostname': 'web-01'}, {'src_ip': '192.0.2.3', 'hostname': 'db-01'}]" ]Performs an inner join on the
src_ipfield. Only the192.0.2.1record, which matches on both sides, is output. -
Left join
json "[{'src_ip': '192.0.2.1', 'bytes': 1024}, {'src_ip': '192.0.2.2', 'bytes': 2048}]" | join type=left src_ip [ json "[{'src_ip': '192.0.2.1', 'hostname': 'web-01'}]" ]Outputs all left records and merges matching right fields. The
192.0.2.2record is also output but has nohostnamefield. -
Left-only join
table duration=1h web_logs | join type=leftonly src_ip [ table duration=1d allowlist ]Outputs only web log records whose IP address is not in the allowlist.
-
Cross join
json "[{'name': 'Alice'}, {'name': 'Bob'}]" | join type=cross [ json "[{'role': 'admin'}, {'role': 'user'}]" ]Outputs all combinations of both sides' records. A total of 4 records are output.
-
Use multiple join keys
table duration=1h web_logs | join src_ip, dst_port [ table duration=1d service_map ]Merges using both
src_ipanddst_portas join keys.