tlshdiff()

The tlshdiff() function compares two TLSH hash strings and returns the similarity score as an integer.

Syntax

tlshdiff(HASH1, HASH2)

Parameters

HASH1
The first TLSH hash string to compare.
HASH2
The second TLSH hash string to compare.

Description

The returned score measures distance rather than likeness: the lower the score, the more similar the two data sets, and the higher the score, the greater the difference between them. A score of 0 means the two hashes are identical.

Returns null if HASH1 or HASH2 is null or is not a string. Returns -1 if either argument is not a valid TLSH hash string or if an error occurs during comparison.

Hash strings computed with the tlsh() function can be passed to this function to measure the similarity between two data sets.
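
The return-value rules above can be sketched outside the query language, purely as an illustration. The Python below is not part of Sonar: the function name interpret_tlshdiff and the threshold of 50 are hypothetical, chosen only to show that null and -1 should be handled before a score is treated as a distance. A suitable threshold depends on the data and is not specified by this reference.

```python
from typing import Optional


def interpret_tlshdiff(score: Optional[int], threshold: int = 50) -> str:
    """Classify a tlshdiff() result.

    `threshold` is a hypothetical cut-off for illustration only.
    """
    if score is None:
        # tlshdiff() returns null when an argument is null or not a string.
        return "missing input"
    if score == -1:
        # tlshdiff() returns -1 for an invalid hash or a comparison error.
        return "invalid hash or comparison error"
    if score == 0:
        # A score of 0 means the two hashes are identical.
        return "identical"
    # Lower scores mean more similar data; higher scores mean more different.
    return "similar" if score <= threshold else "different"


if __name__ == "__main__":
    for s in (None, -1, 0, 45, 120):
        print(s, "->", interpret_tlshdiff(s))
```

Checking for null and -1 first keeps the error value from being mistaken for a very small distance.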

Error codes

N/A

Usage examples

  1. Compare the similarity of two files

    logdb://files
    | eval h1 = tlsh(file1_content), h2 = tlsh(file2_content),
           diff = tlshdiff(h1, h2)
    | # diff: 45
    
  2. Compare a hash with itself (score of 0)

    json "{}"
    | eval data = randbytes(256),
           h = tlsh(data),
           diff = tlshdiff(h, h)
    | # diff: 0
    
  3. Null input (returns null)

    json "{'h1': null, 'h2': 'T1A3E...'}" | eval diff = tlshdiff(h1, h2)
    | # diff: null
    
  4. Invalid hash string (returns -1)

    json "{}" | eval diff = tlshdiff("invalid", "invalid")
    | # diff: -1
    

Compatibility

The tlshdiff() function is available in Sonar 4.0.2308.0-u3043 and later.