csvfile

Loads data from a comma-separated values (CSV) or tab-separated values (TSV) file. The command reads the header in the first line of the file and uses its values as field names.

Syntax

csvfile [OPTIONS] PATH
Required Parameter
PATH
Path to the file to load. Use a wildcard (*) in the file name to load, at once, every file whose name matches the pattern. For example, if you enter allow-*.csv as PATH, files such as "allow-ip.csv", "allow-user.csv", and "allow-url.csv" are all loaded. The Logpresso daemon must have read permission on the file.
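
To preview which files a wildcard PATH would pick up, the matching can be approximated outside Logpresso. The Python sketch below assumes the wildcard behaves like a shell-style glob, which this page does not state explicitly; the directory listing and the matching_files helper are hypothetical.

```python
from fnmatch import fnmatchcase

def matching_files(pattern, names):
    """Return the names that a shell-style wildcard pattern would select."""
    return [name for name in names if fnmatchcase(name, pattern)]

# Hypothetical directory listing, used only for illustration.
names = ["allow-ip.csv", "allow-user.csv", "allow-url.csv", "deny-ip.csv", "allow-ip.txt"]
selected = matching_files("allow-*.csv", names)
print(selected)
```

Under that assumption, only the three allow-*.csv names are selected; deny-ip.csv and allow-ip.txt are left out.
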
Optional Parameters
cs=CHARSET
Character set (default: utf-8). This option is case-insensitive. Use the preferred MIME name or aliases registered in the following document as CHARSET: http://www.iana.org/assignments/character-sets/character-sets.xhtml
limit=INT
Maximum number of records to load (default: unlimited).
maxcol=INT
Maximum number of columns to load (default: 10,000). If a record has more columns than this, the rest option determines how the extra columns are handled.
offset=INT
Number of records to skip (default: 0).
rest=BOOL
Whether to keep the column data beyond the maximum number specified by the maxcol option (default: f). See Usage #3 and #4.
  • t: Puts data beyond the maximum number of columns specified by the maxcol option in the _rest field.
  • f: Discards the rest of the columns beyond the maximum number of columns specified by the maxcol option.
strict=BOOL
Compliance with RFC 4180 (https://tools.ietf.org/html/rfc4180) (default: f). See Usage #5 to #8.
  • t: Parses strictly in conformance with RFC 4180, the same way Microsoft Excel does when it opens a CSV file. This option cannot be used when tab=t.
  • f: Flexibly parses the CSV file.
tab=BOOL
Whether to use the tab character as the separator (default: f).
  • t: Uses the tab character as the separator. This is useful for processing tab-separated values (TSV) files.
  • f: Uses a comma (,) as the separator.
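
How limit, offset, maxcol, rest, and tab interact can be modeled with a short Python sketch. This is an illustration of the option descriptions above, not Logpresso's implementation. Two points are assumptions: that offset counts data records after the header line, and that the overflow stored in _rest is a single string joined with the separator. The load_records helper and the sample data are hypothetical.

```python
import csv
import io
from itertools import islice

def load_records(text, limit=None, offset=0, maxcol=10000, rest=False, tab=False):
    """Approximate the documented options on an in-memory CSV/TSV string."""
    delimiter = "\t" if tab else ","
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)  # the first line supplies the field names
    # Assumption: offset and limit count data records, not the header line.
    stop = None if limit is None else offset + limit
    records = []
    for values in islice(reader, offset, stop):
        record = dict(zip(header[:maxcol], values[:maxcol]))
        if rest and len(values) > maxcol:
            # Assumption: overflow columns are kept as one joined string.
            record["_rest"] = delimiter.join(values[maxcol:])
        records.append(record)
    return records

sample = "a,b,c,d\n1,2,3,4\n5,6,7,8\n9,10,11,12\n"
print(load_records(sample, limit=1, offset=1, maxcol=2, rest=True))
print(load_records(sample, maxcol=2))
```

In this model, the first call returns only the second data record with fields a and b plus a _rest field holding the remaining columns, while the second call (rest left at its default) drops columns c and d from every record.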

Usage

  1. Read the /opt/logpresso/wp-nginx.csv file.

    # Download: https://raw.githubusercontent.com/logpresso/dataset/main/wp-nginx.csv
    | csvfile /opt/logpresso/wp-nginx.csv
    
  2. Read 20 records after skipping the header line of the /opt/logpresso/wp-nginx.csv file.

    csvfile limit=20 offset=1 /opt/logpresso/wp-nginx.csv
    
  3. Read only 4 columns in the /opt/logpresso/wp-nginx.csv file.

    csvfile maxcol=4 /opt/logpresso/wp-nginx.csv
    
  4. Read only 4 columns from the /opt/logpresso/wp-nginx.csv file and assign the rest to the _rest field.

    csvfile maxcol=4 rest=t /opt/logpresso/wp-nginx.csv
    
  5. Data with whitespace between the separator and the column. Compare the results of the two queries.

    When strict=t, if there is whitespace between the separator and the column, the double quotes (") are treated as literal characters and the data is not parsed as intended.

    # Download: https://raw.githubusercontent.com/logpresso/dataset/main/csvfile-strict-option-test-1.csv
    | csvfile strict=t /opt/logpresso/csvfile-strict-option-test-1.csv
    

    When strict=f, if a pair of double quotes (" ") is matched, only the string inside the quotes is recognized as the column, so the data is parsed as intended.

    csvfile strict=f /opt/logpresso/csvfile-strict-option-test-1.csv
    
  6. Data without whitespace between the separator and the column.

    Because there is no whitespace between the separator and the column, the data is parsed as intended regardless of the strict value.

    # Download: https://raw.githubusercontent.com/logpresso/dataset/main/csvfile-strict-option-test-2.csv
    | csvfile strict=t /opt/logpresso/csvfile-strict-option-test-2.csv
    
    csvfile strict=f /opt/logpresso/csvfile-strict-option-test-2.csv
    
  7. Data in which double quote characters (") are escaped with a backslash (\).

    When strict=t, the command treats the backslash (\) as an ordinary character, so a double quote written as \" inside a column enclosed in a pair of double quotes (" ") is not parsed as intended.

    # Download: https://raw.githubusercontent.com/logpresso/dataset/main/csvfile-strict-option-test-3.csv
    | csvfile strict=t /opt/logpresso/csvfile-strict-option-test-3.csv
    

    When strict=f, two consecutive double quotes ("") and an escaped double quote (\") are each parsed as a single double quote within the column, as intended.

    csvfile strict=f /opt/logpresso/csvfile-strict-option-test-3.csv
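
The difference between the two parsing modes in Usage #5 and #7 can be reproduced in miniature with Python's csv module. This is only an analogy: Python's parser stands in for the strict and flexible modes, it is not the parser Logpresso uses, and the two may differ in detail. The sample lines and the helper names parse_rfc_like and parse_flexible are hypothetical and are not taken from the downloadable test files.

```python
import csv

def parse_rfc_like(line):
    """Parse one line with Python's default dialect: no space skipping, no backslash escapes."""
    return next(csv.reader([line]))

def parse_flexible(line):
    """Parse one line, skipping spaces after separators and honoring backslash escapes."""
    return next(csv.reader([line], skipinitialspace=True, escapechar="\\"))

spaced = 'a, "b,c"'             # whitespace between the separator and the quoted column
escaped = '"say \\"hi\\"",x'    # double quotes escaped with a backslash
doubled = '"say ""hi""",x'      # double quotes escaped by doubling

print(parse_rfc_like(spaced))   # the quote is kept as a literal character
print(parse_flexible(spaced))   # the quoted column is read as one value
print(parse_flexible(escaped))
print(parse_flexible(doubled))
```

With the default dialect, the quoted column in the first sample is split at its inner comma because the quote no longer starts the column. With space skipping and backslash escapes enabled, all three samples yield two columns, and both escaping styles produce the same text.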