dbload

Loads input records into an external database table. The JDBC profile must be registered in advance. The dbload command converts all records passed from the previous command into SQL queries and inserts them into the external database.

Command properties

ItemDescription
Command typeProcessing query
Required permissionNone
License usageN/A
Parallel executionNot supported
Distributed executionRuns on Control Node (reducer)

Syntax

dbload PROFILE [type={insert|update}] [database=STR] table=STR [batchsize=INT] [rowretry=BOOL] [stoponfail=BOOL] [+]FIELD, ...

Options

type={insert|update}
Loading mode. Default: insert
  • insert: Inserts data using INSERT queries.
  • update: Checks whether data exists with a SELECT query based on key fields. Inserts if not present, updates if present.
database=STR
Name of the database or schema to switch to after connecting.
table=STR
Name of the target table to insert data into.
batchsize=INT
Transaction batch size. A larger batch size is more efficient because more records are committed at once, but the number of records rolled back on transaction failure also increases. A value of 2,000 is recommended. The default value of 1 auto-commits every record, which can be very slow. Default: 1
rowretry=BOOL
Whether to retry on a per-row basis when a batch transaction fails. When set to t, each record in the failed batch is retried as an individual transaction. This may reduce performance but minimizes data loss. Default: t
stoponfail=BOOL
Whether to cancel the entire query on commit failure. Default: f

Target

PROFILE
JDBC profile name. Can be specified as a query parameter.
[+]FIELD, ...
Comma-separated list of field names to load. Prefix a field name with + to designate it as a key field. When type=update, at least one key field must be specified.

Error codes

Parse errors
Error codeMessageDescription
12200Invalid JDBC profile name: [name]The specified JDBC profile does not exist.
12201No permission to use JDBC profile.The current user does not have permission to use the JDBC profile.
12202Invalid query.The command syntax is incorrect (for example, ends with a comma).
12203Enter the table option.The table option is not specified.
12204Enter a field.No fields to load are specified.
12205Enter a key field for update.type=update is specified but no key field (with + prefix) is given.
Runtime errors

N/A

Description

The dbload command loads records passed from the previous command into an external database table registered as a JDBC profile.

With type=insert (default), field values are converted to INSERT queries and inserted into the database.

With type=update, the command first runs a SELECT COUNT(*) query based on key fields (with + prefix) to check whether the record exists. If not present, it runs INSERT; if present, it runs UPDATE.

When a batch transaction fails, behavior depends on the rowretry option. With rowretry=t (default), each record in the failed batch is retried individually. If an individual record fails, _error_msg, _error_code, and _sql_state fields are added to that record. With rowretry=f, all records in the failed batch receive the error fields.

With stoponfail=t, the entire query is immediately cancelled when a batch commit fails.

Records with duplicate key field values within the same batch are automatically ignored.

In a distributed environment, loading runs on the Control Node.

Examples

  1. Loading data in INSERT mode

    json "[{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]"
    | dbload mydb table=users name, age
    

    Inserts name and age field values into the users table using the mydb JDBC profile.

  2. Bulk loading with a specified batch size

    table duration=1d web_logs
    | dbload mydb batchsize=2000 table=access_logs src_ip, method, uri, status
    

    Loads large amounts of data efficiently into the access_logs table with a batch size of 2,000.

  3. Loading data in UPDATE mode

    json "[{'id': 1, 'name': 'Alice', 'score': 95}, {'id': 2, 'name': 'Bob', 'score': 88}]"
    | dbload mydb type=update table=students +id, name, score
    

    Uses the id field as the key to INSERT if the record does not exist, or UPDATE if it does.

  4. Loading after switching the database

    json "[{'host': '192.0.2.1', 'status': 'up'}]"
    | dbload mydb database=monitoring table=host_status host, status
    

    Switches to the monitoring database and loads data into the host_status table.

  5. Stopping the query on commit failure

    json "[{'id': 1, 'value': 'test'}]"
    | dbload mydb batchsize=1000 stoponfail=t table=critical_data +id, value
    

    With stoponfail=t, the entire query is immediately cancelled when a batch commit fails.