dbload
Loads input records into an external database table. The JDBC profile must be registered in advance. The dbload command converts all records passed from the previous command into SQL queries and inserts them into the external database.
Command properties
| Item | Description |
|---|---|
| Command type | Processing query |
| Required permission | None |
| License usage | N/A |
| Parallel execution | Not supported |
| Distributed execution | Runs on Control Node (reducer) |
Syntax
Options
type={insert|update}- Loading mode. Default:
insert
insert: Inserts data using INSERT queries.update: Checks whether data exists with a SELECT query based on key fields. Inserts if not present, updates if present.
database=STR- Name of the database or schema to switch to after connecting.
table=STR- Name of the target table to insert data into.
batchsize=INT- Transaction batch size. A larger batch size is more efficient because more records are committed at once, but the number of records rolled back on transaction failure also increases. A value of 2,000 is recommended. The default value of 1 auto-commits every record, which can be very slow. Default:
1 rowretry=BOOL- Whether to retry on a per-row basis when a batch transaction fails. When set to
t, each record in the failed batch is retried as an individual transaction. This may reduce performance but minimizes data loss. Default:t stoponfail=BOOL- Whether to cancel the entire query on commit failure. Default:
f
Target
PROFILE- JDBC profile name. Can be specified as a query parameter.
[+]FIELD, ...- Comma-separated list of field names to load. Prefix a field name with
+to designate it as a key field. Whentype=update, at least one key field must be specified.
Error codes
Parse errors
| Error code | Message | Description |
|---|---|---|
| 12200 | Invalid JDBC profile name: [name] | The specified JDBC profile does not exist. |
| 12201 | No permission to use JDBC profile. | The current user does not have permission to use the JDBC profile. |
| 12202 | Invalid query. | The command syntax is incorrect (for example, ends with a comma). |
| 12203 | Enter the table option. | The table option is not specified. |
| 12204 | Enter a field. | No fields to load are specified. |
| 12205 | Enter a key field for update. | type=update is specified but no key field (with + prefix) is given. |
Runtime errors
N/A
Description
The dbload command loads records passed from the previous command into an external database table registered as a JDBC profile.
With type=insert (default), field values are converted to INSERT queries and inserted into the database.
With type=update, the command first runs a SELECT COUNT(*) query based on key fields (with + prefix) to check whether the record exists. If not present, it runs INSERT; if present, it runs UPDATE.
When a batch transaction fails, behavior depends on the rowretry option. With rowretry=t (default), each record in the failed batch is retried individually. If an individual record fails, _error_msg, _error_code, and _sql_state fields are added to that record. With rowretry=f, all records in the failed batch receive the error fields.
With stoponfail=t, the entire query is immediately cancelled when a batch commit fails.
Records with duplicate key field values within the same batch are automatically ignored.
In a distributed environment, loading runs on the Control Node.
Examples
-
Loading data in INSERT mode
json "[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]" | dbload mydb table=users name, ageInserts
nameandagefield values into theuserstable using themydbJDBC profile. -
Bulk loading with a specified batch size
table duration=1d web_logs | dbload mydb batchsize=2000 table=access_logs src_ip, method, uri, statusLoads large amounts of data efficiently into the
access_logstable with a batch size of 2,000. -
Loading data in UPDATE mode
json "[{'id': 1, 'name': 'Alice', 'score': 95}, {'id': 2, 'name': 'Bob', 'score': 88}]" | dbload mydb type=update table=students +id, name, scoreUses the
idfield as the key to INSERT if the record does not exist, or UPDATE if it does. -
Loading after switching the database
json "[{'host': '192.0.2.1', 'status': 'up'}]" | dbload mydb database=monitoring table=host_status host, statusSwitches to the
monitoringdatabase and loads data into thehost_statustable. -
Stopping the query on commit failure
json "[{'id': 1, 'value': 'test'}]" | dbload mydb batchsize=1000 stoponfail=t table=critical_data +id, valueWith
stoponfail=t, the entire query is immediately cancelled when a batch commit fails.