Skip to content

LE_DST

Return the difference between two character strings in terms of the number of changes required to convert the first string into the second. The comparison of the strings is case-sensitive. This function uses the Levenshtein edit distance algorithm to compute the "distance" between the strings. With this algorithm, a character transposition change such as 'ab' to 'ba' counts as two changes.

This function returns an integer.

Note: If one of the inputs is an empty string, the function simply returns the number of characters in the other argument, regardless of the current settings of these configuration parameters.

See also DLE_DST.

Configuration

Two configuration parameters affect the performance and memory usage of this function. See max_levenshtein_distance and max_levenshtein_string_size for more information.

Examples

The following example requires three changes, one for the case of the letter C, and two for the transposition of kc:

yellowbrick=# select le_dst('Chekcing','checking') from sys.const;
 le_dst 
--------
     3
(1 row)

Calculate the distance between the London team names and their nicknames in the team table:

premdb=# select name, nickname, le_dst(name,nickname) ledst from team 
where city='London' and name is not null order by 3 desc; 
       name         |  nickname  | ledst 
---------------------+------------+-------
 Queens Park Rangers | Hoops      |    18
 Charlton Athletic   | Addicks    |    16
 Tottenham Hotspur   | Spurs      |    15
 West Ham United     | Hammers    |    12
 Crystal Palace      | Eagles     |    12
 Wimbledon           | Dons       |     8
 Chelsea             | Pensioners |     8
 Fulham              | Cottagers  |     8
 Arsenal             | Gunners    |     7
(9 rows)

Calculate the distance between two specific strings with both LE_DST and DLE_DST. Note that the results are different for these particular strings because the functions count character transpositions differently.

premdb=# select le_dst('Arsenal','Gunners'),dle_dst('Arsenal','Gunners') from sys.const;
 le_dst | dle_dst 
--------+---------
     7 |       6
(1 row)

The following example shows the behavior of this function when max_levenshtein_string_size is set to 10. Because of the length of the specified strings, both queries return an expected error:

yellowbrick=# set max_levenshtein_string_size to 10;
SET
yellowbrick=# show max_levenshtein_string_size;
 max_levenshtein_string_size 
-----------------------------
 10
(1 row)
yellowbrick=# create table abc(c1 varchar);
CREATE TABLE
yellowbrick=# insert into abc values('abcdefghijklmnopqrstuvwxyz');
INSERT 0 1
yellowbrick=# insert into abc values('abcdefghijklm');
INSERT 0 1
yellowbrick=# select le_dst(c1,'abcdefghijklm') from abc;
ERROR:  LE_DST	argument exceeds maximum length of 10 bytes
HINT:	Change the max value with the 'max_levenshtein_string_size' configuration parameter
yellowbrick=# select le_dst('abcdefghijklm', c1) from abc;
ERROR:  LE_DST	argument exceeds maximum length of 10 bytes
HINT:	Change the max value with the 'max_levenshtein_string_size' configuration parameter

Also note that if the empty string is passed as one of the arguments in these queries, the result is the length of the other string:

yellowbrick=# select le_dst('', c1) from abc;
 le_dst 
--------
    13
    26
(2 rows)

yellowbrick=# select le_dst(c1, '') from abc;
 le_dst 
--------
    26
    13
(2 rows)

The following example shows the behavior when the max_levenshtein_distance parameter is set to 5 (and max_levenshtein_string_size is reset to its default value of 255):

yellowbrick=# reset max_levenshtein_string_size;
RESET
yellowbrick=# set max_levenshtein_distance to 5;
SET
yellowbrick=# select c1, le_dst('abcdefghijklm', c1) from abc;
            c1             | le_dst 
----------------------------+--------
 abcdefghijklmnopqrstuvwxyz |      6
 abcdefghijklm              |      0
(2 rows)

The first row returns 6 (max_levenshtein_string_size + 1) for the first row and 0 for the second row. The function returns 6 for the first value that has a distance greater than 5, then stops processing.