PostgreSQL Pattern Matching (like, regexp) with examples

There are three separate approaches to pattern matching provided by PostgreSQL

  • LIKE
  • Regular Expressions
  • SIMILAR_TO

LIKE

LIKE and NOT LIKE yield boolean values (also ILIKE and NOT ILIKE )

Consider below table EMP having 9 rows

postgres=# select * from emp; 
 empno | empname  | salary |    dob     | deptno 
-------+----------+--------+------------+--------
    10 | Bill     |  12000 | 1977-01-16 |      5
    11 | Solomon  |  10000 | 1964-06-25 |      5
    12 | Susan    |  10000 | 1982-02-26 |      5
    13 | Wendy    |   9000 | 1955-03-10 |      1
    14 | Benjamin |   7500 | 1980-02-28 |      1
    15 | Tom      |   7600 | 1969-01-09 |      1
    16 | Henry    |   8500 | 1972-04-19 |      2
    17 | Robert   |   9500 | 1979-05-31 |      2
    18 | Paul     |   7700 | 1960-07-04 |      2
(9 rows)

See two examples, one with like and other with not like

postgres=# select * from emp where empname like 'S%';
 empno | empname | salary |    dob     | deptno 
-------+---------+--------+------------+--------
    11 | Solomon |  10000 | 1964-06-25 |      5
    12 | Susan   |  10000 | 1982-02-26 |      5
(2 rows)

postgres=# select * from emp where empname not like 'S%';
 empno | empname  | salary |    dob     | deptno 
-------+----------+--------+------------+--------
    10 | Bill     |  12000 | 1977-01-16 |      5
    13 | Wendy    |   9000 | 1955-03-10 |      1
    14 | Benjamin |   7500 | 1980-02-28 |      1
    15 | Tom      |   7600 | 1969-01-09 |      1
    16 | Henry    |   8500 | 1972-04-19 |      2
    17 | Robert   |   9500 | 1979-05-31 |      2
    18 | Paul     |   7700 | 1960-07-04 |      2
(7 rows)
postgres=# select * from emp e where e.empname like '____';     
 empno | empname | salary |    dob     | deptno 
-------+---------+--------+------------+--------
    10 | Bill    |  12000 | 1977-01-16 |      5
    18 | Paul    |   7700 | 1960-07-04 |      2
(2 rows)

Some examples:

select ‘abc’ LIKE ‘abc’; → true

select ‘abc’ LIKE ‘a%’ ; → true

select ‘abc’ LIKE ‘_b_’; → true

select ‘abc’ LIKE ‘c’; → false

select ‘abc’ ILIKE ‘A%’ ;→ true

select ‘abc’ NOT ILIKE ‘A%’ ;→ false

Note : ILIKE is in-case sensitive

LIKE pattern matching always covers the entire string. Therefore, if it’s desired to match a sequence

anywhere within a string, the pattern must start and end with a percent sign.

The operator ~~ is equivalent to LIKE, and ~~* corresponds to ILIKE. There are also !~~ and !

~~* operators that represent NOT LIKE and NOT ILIKE, respectively. All of these operators are

PostgreSQL-specific.

~~LIKE
~~*ILIKE
!~~NOT LIKE
!~~*NOT ILIKE
postgres=# select * from emp e where e.empname ~~ '____'; 
 empno | empname | salary |    dob     | deptno 
-------+---------+--------+------------+--------
    10 | Bill    |  12000 | 1977-01-16 |      5
    18 | Paul    |   7700 | 1960-07-04 |      2
(2 rows)

postgres=# select * from emp e where e.empname ilike 's%';
 empno | empname | salary |    dob     | deptno 
-------+---------+--------+------------+--------
    11 | Solomon |  10000 | 1964-06-25 |      5
    12 | Susan   |  10000 | 1982-02-26 |      5
(2 rows)

postgres=# select * from emp e where e.empname ~~* 's%';
 empno | empname | salary |    dob     | deptno 
-------+---------+--------+------------+--------
    11 | Solomon |  10000 | 1964-06-25 |      5
    12 | Susan   |  10000 | 1982-02-26 |      5
(2 rows)

Regular Expressions or REGEXP

POSIX regular expressions

~ ( String matches regular expression, case sensitively)select ‘google’ ~ ‘go*gle’;→ true
~* ( String matches regular expression, in-case sensitively)select ‘google’ ~* ‘Go*gle’; → true
!~ ( String does not matches regular expression, case sensitively)select ‘google’ !~ ‘moogle’; → true
!~* ( String does not matches regular expression, in-case sensitively)select ‘google’ !~* ‘Moogle’; → true

‘abcd’ ~ ‘bc’ true

‘abcd’ ~ ‘a.c’ true — dot matches any character

‘abcd’ ~ ‘a.*d’ true — * repeats the preceding pattern item

‘abcd’ ~ ‘(b|x)’ true — | means OR, parentheses group

‘abcd’ ~ ‘^a’ true — ^ anchors to start of string

‘abcd’ ~ ‘^(b|c)’ false — would match except for anchoring

OperatorExample
substringselect substring(‘Google’ from 2 for 3); → oog
select substring(‘Google’ from 3); → ogle
select substring(‘Google’ for 2); → Go
regexp_matchselect regexp_match(‘google.com’, ‘google’); → {google}
select regexp_match(‘google.com’, ‘(google)(com’); → {google,inc}
regexp_replaceselect regexp_replace(‘googleinc’, ‘inc’,’.com’); → google.com
regexp_split_to_arrayselect regexp_split_to_array(‘google inc’, ‘\s+’); → {google,inc}
regexp_split_to_tableselect regexp_split_to_table(‘google inc’, ‘\s+’); → google,
in
ValueDescription
(re)(where re is any regular expression) matches a match for re, with the match noted for possible reporting
(?:re)as above, but the match is not noted for reporting (a “non-capturing” set of parentheses) (AREs only)
.matches any single character
\k(where k is a non-alphanumeric character) matches that character taken as an ordinary character, e.g., \ matches a backslash character
\cwhere c is alphanumeric (possibly followed by other characters) is an escape
{when followed by a character other than a digit, matches the left-brace character
xwhere x is a single character with no other significance, matches that character
*a sequence of 0 or more matches of the atom
+a sequence of 1 or more matches of the atom
?a sequence of 0 or 1 matches of the atom
{m}a sequence of exactly m matches of the atom
{m,}a sequence of m or more matches of the atom
(m,n}a sequence of m through n (inclusive) matches of the atom; m cannot exceed n
*?non-greedy version of *
+?non-greedy version of +
??non-greedy version of ?
(m}?non-greedy version of {m}
(m,}?non-greedy version of {m,}
(m,n}?non-greedy version of {m,n}
^matches at the beginning of the string
$matches at the end of the string
(?=re)positive lookahead matches at any point where a substring matching re begins
(?!re)negative lookahead matches at any point where no substring matching re begins
(?<=re)positive lookbehind matches at any point where a substring matching re ends
(?<!re)negative lookbehind matches at any point where no substring matching re ends
EscapeDescription
\aalert (bell) character
\bbackspace
\Bsynonym for backslash () to help reduce the need for backslash doubling
\cX(where X is any character) the character whose low-order 5 bits are the same as those of X, and whose other bits are all zero
\ethe character whose collating-sequence name is ESC, or failing that, the character with octal value 033
\fform feed
\nnew line
\rcarriage return
\thorizontal tab
\uwxyz(where wxyz is exactly four hexadecimal digits) the character whose hexadecimal value is 0xwxyz
\Ustuvwxyz(where stuvwxyz is exactly eight hexadecimal digits) the character whose hexadecimal value is 0xstuvwxyz
\vvertical tab
\xhhh(where hhh is any sequence of hexadecimal digits) the character whose hexadecimal value is 0xhhh (a single character no matter how many hexadecimal digits are used)
\0\0 the character whose value is 0
\xy(where xy is exactly two octal digits, and is not a back reference) the character whose octal value is 0xy
\xyz(where xyz is exactly three octal digits, and is not a back reference) the character whose octal value is 0xyz
\dmatches any digit, like [[:digit:]]
\smatches any whitespace character, like [[:space:]]
\wmatches any word character, like [[:word:]]
\Dmatches any non-digit, like [^[:digit:]]
\Smatches any non-whitespace character, like [^[:space:]]
\Wmatches any non-word character, like [^[:word:]]
\Amatches only at the beginning of the string
\mmatches only at the beginning of a word
\Mmatches only at the end of a word
\ymatches only at the beginning or end of a word
\Ymatches only at a point that is not the beginning or end of a word
\Zmatches only at the end of the string

SIMILAR TO

string SIMILAR TO pattern [ESCAPE escape-character]

string NOT SIMILAR TO pattern [ESCAPE escape-character]

The SIMILAR TO operator returns true or false depending on whether its pattern matches the given

string

| → denotes alternation (either of two alternatives).
* → repetition of the previous item zero or more times.
+ → repetition of the previous item one or more times.
? → repetition of the previous item zero or one time
{m} → repetition of the previous item exactly m times.
{m,} → repetition of the previous item m or more times.
{m,n} → repetition of the previous item at least m and not more than n times.

Some examples:

select ‘abc’ SIMILAR TO ‘abc’; → true

select ‘abc’ SIMILAR TO ‘a’; → false

select ‘abc’ SIMILAR TO ‘%(b|d)%’; → true

select ‘abc’ SIMILAR TO ‘(b|c)%’ ; → false

select ‘-abc-‘ SIMILAR TO ‘%\mabc\M%’ ; → true

select ‘xabcy’ SIMILAR TO ‘%\mabc\M%’ ; → false

select substring(‘foobar’ similar ‘%#”o_b#”%’ escape ‘#’) ; → oob

select substring(‘foobar’ similar ‘#”o_b#”%’ escape ‘#’) ; → NULL

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: