Skip to content

Trailing Blanks in Character Data

This section explains how CHAR and VARCHAR columns handle trailing blanks (space characters) when character data is loaded, stored, and queried.

A CHAR data type is conceptually a fixed-length, blank-padded string. The rules for handling trailing blanks in fixed-length character strings are as follows:

  • CHAR columns are stored with any trailing blanks trimmed. For example, values loaded into a CHAR(10) column as 'London' and 'London ' are stored with an equal length of 6 characters. The LENGTH function for both values returns 6.
  • A comparison of values such as 'London' and 'London ' (one trailing blank) or 'London' and 'London ' (two trailing blanks) returns a match.
  • Trailing blanks in CHAR columns are restored on output to the client. For example, a CHAR(10) column returns 10 characters per column value.

A VARCHAR data type is a variable-length character string. The rules for processing variable-length strings are as follows:

  • VARCHAR columns are stored and processed with any trailing blanks preserved. For example, values loaded into a VARCHAR(10) column as 'London' and 'London ' are stored with lengths of 6 and 8 characters, respectively.
  • A comparison of values such as 'London' and 'London ' does not return a match.
  • VARCHAR columns are returned to the client as variable-length strings. For example, a VARCHAR(10) column returns anything from 0 to 10 characters.

The examples in this section use a three-column version of the season table, which has the following schema:

premdb=# \d season
             Table "public.season"
   Column    |         Type          | Modifiers 
-------------+-----------------------+-----------
 seasonid    | smallint              | 
 season_name | character(12)         | 
 winners     | character varying(30) | 

Distribution: Hash (seasonid)

This table contains the following rows:

premdb=# select * from season order by 1;
 seasonid | season_name  |      winners      
----------+--------------+-------------------
      20 | 2011-2012    | Manchester City
      21 | 2012-2013    | Manchester United
      22 | 2013-2014    | Manchester City
      23 | 2014-2015    | Chelsea
      24 | 2015-2016    | Leicester City
(5 rows)

The following query returns the respective lengths of all of the character strings:

premdb=# select *, length(season_name) length_char, length(winners) length_varchar from season order by 1;
 seasonid | season_name  |      winners      | length_char | length_varchar 
----------+--------------+-------------------+-------------+----------------
      20 | 2011-2012    | Manchester City   |           9 |             15
      21 | 2012-2013    | Manchester United |           9 |             17
      22 | 2013-2014    | Manchester City   |           9 |             15
      23 | 2014-2015    | Chelsea           |           9 |              7
      24 | 2015-2016    | Leicester City    |           9 |             14
(5 rows)

Note that the length of the VARCHAR strings (winners) varies. The length of the CHAR strings (season_name) is consistently 9, regardless of the fact that the column was defined as CHAR(12).

Now insert two more rows as follows, and note the trailing blanks in the strings:

premdb=# insert into season values(25, '2016-2017 ', 'Chelsea  ');
INSERT 0 1
premdb=# insert into season values(26, '2017-2018 ', '');

Run the length query again:

premdb=# select *, length(season_name) length_char, length(winners) length_varchar from season;
 seasonid | season_name  |      winners      | length_char | length_varchar 
----------+--------------+-------------------+-------------+----------------
      20 | 2011-2012    | Manchester City   |           9 |             15
      21 | 2012-2013    | Manchester United |           9 |             17
      22 | 2013-2014    | Manchester City   |           9 |             15
      23 | 2014-2015    | Chelsea           |           9 |              7
      24 | 2015-2016    | Leicester City    |           9 |             14
      25 | 2016-2017    | Chelsea           |           9 |              9
      26 | 2017-2018    |                   |           9 |              0
(7 rows)

Note that the season_name column always returns a length of 9. The trailing blanks for rows 25 and 26 were trimmed. Further, note that the lengths of the Chelsea values for rows 23 and 25 are different. In row 25, the trailing blanks are preserved, making a length of 9, not 7.

Now run some queries that compare these character strings with literal values:

premdb=# select * from season where winners='Chelsea';
 seasonid | season_name  | winners 
----------+--------------+---------
      23 | 2014-2015    | Chelsea
(1 row)

premdb=# select * from season where winners like 'Chelsea%';
 seasonid | season_name  |  winners  
----------+--------------+-----------
      23 | 2014-2015    | Chelsea
      25 | 2016-2017    | Chelsea  
(2 rows)
premdb=# select * from season where season_name ='2016-2017 ';
 seasonid | season_name  |  winners  
----------+--------------+-----------
      25 | 2016-2017    | Chelsea  
(1 row)

Now create a new version of the season table called season_varchar, in which both character columns are defined as VARCHAR columns.

premdb=# \d season_varchar
         Table "public.season_varchar"
   Column    |         Type          | Modifiers 
-------------+-----------------------+-----------
 seasonid    | smallint              | 
 season_name | character varying(12) | 
 winners     | character varying(30) | 

Distribution: Hash (seasonid)

Insert the same data in the new table, including the trailing blanks. You now have two tables that appear to return identical result sets:

premdb=# select * from season_varchar order by 1;
 seasonid | season_name |      winners      
----------+-------------+-------------------
      20 | 2011-2012   | Manchester City
      21 | 2012-2013   | Manchester United
      22 | 2013-2014   | Manchester City
      23 | 2014-2015   | Chelsea
      24 | 2015-2016   | Leicester City
      25 | 2016-2017   | Chelsea  
      26 | 2017-2018   | 
(7 rows)

premdb=# select * from season order by 1;
 seasonid | season_name  |      winners      
----------+--------------+-------------------
      20 | 2011-2012    | Manchester City
      21 | 2012-2013    | Manchester United
      22 | 2013-2014    | Manchester City
      23 | 2014-2015    | Chelsea
      24 | 2015-2016    | Leicester City
      25 | 2016-2017    | Chelsea  
      26 | 2017-2018    | 
(7 rows)

However, when you join these tables on season_name, note the results:

premdb=# select * from season s, season_varchar sv where s.season_name=sv.season_name order by s.seasonid;
 seasonid | season_name  |      winners      | seasonid | season_name |      winners      
----------+--------------+-------------------+----------+-------------+-------------------
      20 | 2011-2012    | Manchester City   |       20 | 2011-2012   | Manchester City
      21 | 2012-2013    | Manchester United |       21 | 2012-2013   | Manchester United
      22 | 2013-2014    | Manchester City   |       22 | 2013-2014   | Manchester City
      23 | 2014-2015    | Chelsea           |       23 | 2014-2015   | Chelsea
      24 | 2015-2016    | Leicester City    |       24 | 2015-2016   | Leicester City
(5 rows)

Only 5 rows are returned, not 7. The trailing blanks preserved in the VARCHAR version of the season_name column (rows 25 and 26) cause a mismatch for those two rows.

Parent topic:SQL Data Types