When to use which RegExp function in JavaScript

Although the MDN pages do a good job of explaining exactly what the different RegExp functions do and how they differ, they can be a little confusing if you know what you want to do, but not which function to call.

So here is a breakdown, grouped by what you want to do.
As is so often the case, I made this list mostly for myself, but I think other people may benefit from it too.

You simply want to know if a string contains a certain pattern

RegExp.test(String) returns true if the pattern can be found, false otherwise.

let str = 'The quick brown fox jumps over a lazy dog';
let result = /\w+o\w+/.test(str);
// result will be true

You want to know where in the string a pattern occurs

String.search(RegExp) returns the index, or -1 if not found.

let str = 'The quick brown fox jumps over a lazy dog';
let result = str.search(/\w+o\w+/);
// result will be 10

If there are multiple matches, it will return the index of the first one.
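
As a small sketch of both outcomes (the variable names are made up for illustration), this shows search reporting only the first of several matching positions, and -1 for a pattern that does not occur at all:

```javascript
const sentence = 'The quick brown fox jumps over a lazy dog';

// Three words match (brown, fox, dog), but only the first index is reported.
const firstIndex = sentence.search(/\w+o\w+/);

// No word in the sentence contains a digit, so search reports -1.
const missingIndex = sentence.search(/\d+/);

console.log(firstIndex, missingIndex); // 10 -1
```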

Retrieve the substring matched by the pattern

Without the g flag, String.match(RegExp) and RegExp.exec(String) each return an array, the first element of which is the first match.
They return null if not found.

let str = 'The quick brown fox jumps over a lazy dog';
let result = str.match(/\w+o\w+/);
if (result) result = result[0];
// result will be 'brown'

Count how many times the pattern occurs in the string

String.match(RegExp) on a regex with the g flag returns an array of matches (or null if not found).

So just take the length (or use 0 if the result is null). In this particular example, match returns ['brown', 'fox', 'dog'], so there are three matches.

let str = 'The quick brown fox jumps over a lazy dog';
let result = str.match(/\w+o\w+/g);
result = result ? result.length : 0;
// result will be 3
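
If you need the count in more than one place, the null check can be tucked away in a small function. This is only a sketch: countMatches is a hypothetical helper name, not a built-in, and it assumes the pattern you pass in has the g flag.

```javascript
// Hypothetical helper: counts matches of a pattern in a string.
// Assumes `pattern` carries the g flag; without it, match returns a single
// result array (match plus groups) and the count would be wrong.
function countMatches(text, pattern) {
  const matches = text.match(pattern);
  return matches === null ? 0 : matches.length;
}

const text = 'The quick brown fox jumps over a lazy dog';
const wordCount = countMatches(text, /\w+o\w+/g);
const digitCount = countMatches(text, /\d+/g);

console.log(wordCount, digitCount); // 3 0
```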

Retrieve a list of all substrings matched by the pattern

String.match(RegExp) on a regex with the g flag returns an array of matches (or null if not found).

If there are multiple matches in the string for the pattern, the returned array will contain all of them. In this particular case, the result will be ['brown', 'fox', 'dog'].

let str = 'The quick brown fox jumps over a lazy dog';
let result = str.match(/\w+o\w+/g);
// result will be ['brown', 'fox', 'dog']

Retrieve the match and its capturing groups

String.match(RegExp) without the g flag and RegExp.exec(String) each return an array: the first element is the first match, and the following elements are the matches for the capturing groups. The result is null if not found.

let str = 'The quick brown fox jumps over a lazy dog';
let result = str.match(/(\w+)o(\w+)/);
// result will be ['brown', 'br', 'wn'];
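
To see why the g flag matters here, the sketch below (with illustrative variable names) puts the two calls side by side and then shows what match returns once the g flag is added: only the full matches, with the capturing groups dropped.

```javascript
const phrase = 'The quick brown fox jumps over a lazy dog';

// Without the g flag, match and exec return equivalent results.
const viaMatch = phrase.match(/(\w+)o(\w+)/);
const viaExec = /(\w+)o(\w+)/.exec(phrase);
console.log(viaMatch[0], viaMatch[1], viaMatch[2]); // brown br wn
console.log(viaExec[0], viaExec[1], viaExec[2]);    // brown br wn
console.log(viaMatch.index);                        // 10

// With the g flag, match returns only the full matches; the groups are gone.
const withG = phrase.match(/(\w+)o(\w+)/g);
console.log(withG); // [ 'brown', 'fox', 'dog' ]
```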

Retrieve all matches, the indexes at which they are found in the string and all their capturing groups

RegExp.exec(String) with the g flag returns an array with the info you want for the first match (or null if not found).
To get to the rest of the matches, you have to call the exec function repeatedly with the same RegExp variable, until it returns null. So this is a tad more work, but not a lot.

let str = 'The quick brown fox jumps over a lazy dog';
let rex = /(\w+)o(\w+)/g;
let allRes = [];
let result;
while ((result = rex.exec(str)) != null)
  allRes.push('n='+result.shift()+' i='+result.index+' g='+result.join('/'));
// allRes will be ['n=brown i=10 g=br/wn', 'n=fox i=16 g=f/x', 'n=dog i=38 g=d/g']

Note that this requires a RegExp variable, because the regex object has to remember where its last match ended (it keeps this in its lastIndex property) so that the next call can continue from there. A regex literal inside the loop condition, like result = /(\w+)o(\w+)/g.exec(str), won’t do; this would create a fresh regex on each pass, so it would always return the first match and the loop would never end.
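
The remembered location lives in the regex object's lastIndex property. As a small sketch (variable names are illustrative), this records lastIndex after every call to exec, including the final call that returns null:

```javascript
const input = 'The quick brown fox jumps over a lazy dog';
const pattern = /(\w+)o(\w+)/g;
const positions = [];

let found;
do {
  found = pattern.exec(input);
  // lastIndex points just past the match, or is reset to 0 after a failure.
  positions.push(pattern.lastIndex);
} while (found !== null);

console.log(positions); // [ 15, 19, 41, 0 ]
```

The final 0 is why the same regex can be reused for another full pass once a loop like this has finished.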

Or, alternatively…

If you don’t want to remember all these different function calls, know that there is one function which has all these features built in: exec! That’s all you need to remember. Make sure to use the g flag.

let str = 'The quick brown fox jumps over a lazy dog';
let rex = /(\w+)o(\w+)/g;
let result = rex.exec(str);
// To test if the pattern occurs, return true here if the result is not null or false otherwise
// For the location of the pattern, return result.index if the result is not null, or else -1
// For the (first) matching substring, return result[0] if the result is not null
// Other results need some more code, like above
let allRes = [];
while (result != null) {
  allRes.push(result);
  result = rex.exec(str);
}
// Now to retrieve the number of matches, return allRes.length
// For the matches themselves, return allRes.map(el => el[0])
// etc. You get the idea.
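
If you go the exec-only route, the loop can be wrapped once and reused. The following is a sketch under a few assumptions: execAll is a hypothetical helper name, it works on its own copy of the regex so the caller's lastIndex is never disturbed, and it adds the g flag itself if it is missing.

```javascript
// Hypothetical helper: collects every exec result for a pattern.
function execAll(pattern, text) {
  // Work on a copy that is guaranteed to have the g flag.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const copy = new RegExp(pattern.source, flags);
  const results = [];
  let match;
  while ((match = copy.exec(text)) !== null) {
    results.push(match);
    // Guard against an endless loop on patterns that can match an empty string.
    if (match[0] === '') copy.lastIndex++;
  }
  return results;
}

const all = execAll(/(\w+)o(\w+)/, 'The quick brown fox jumps over a lazy dog');
const count = all.length;                 // number of matches
const substrings = all.map(m => m[0]);    // the matched substrings
const indexes = all.map(m => m.index);    // where each match starts
const firstGroups = all[0].slice(1);      // capturing groups of the first match

console.log(count, substrings, indexes, firstGroups);
// 3 [ 'brown', 'fox', 'dog' ] [ 10, 16, 38 ] [ 'br', 'wn' ]
```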

That’s about it.
I want to close with a heads-up: this mechanism (fetching the next result if you call the function repeatedly while using the g flag) is also used by the test function. So if you use that in a loop for unrelated reasons, you may get unexpected results:

let str = 'The quick brown fox jumps over a lazy dog';
let rex = /(\w+)o(\w+)/g;
for (let i = 1; i <= 10; ++i) {
  // do stuff
  if (rex.test(str)) {
    // do other stuff, but only if str contains rex
  }
  // do more stuff
}

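This will behave the way you want the first three times through the loop, but on the fourth pass test finds no further match, returns false and starts over from the beginning of the string. So with this string, every fourth check fails!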
Solution: don’t use the g flag, or call rex.test(str) once and put it in a variable to use later.
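
To make the effect visible, this sketch (variable names are illustrative) records what test returns on ten consecutive calls, first with the g flag and then without it, and finally applies the second fix of testing once and reusing the answer:

```javascript
const line = 'The quick brown fox jumps over a lazy dog';

// With the g flag, each call to test resumes after the previous match.
const globalRex = /(\w+)o(\w+)/g;
const withFlag = [];
for (let i = 1; i <= 10; ++i) withFlag.push(globalRex.test(line));
console.log(withFlag.join(' '));
// true true true false true true true false true true

// Without the g flag, every call starts from the beginning of the string.
const plainRex = /(\w+)o(\w+)/;
const withoutFlag = [];
for (let i = 1; i <= 10; ++i) withoutFlag.push(plainRex.test(line));
console.log(withoutFlag.every(Boolean)); // true

// Alternative fix: test once and reuse the answer inside the loop.
const containsPattern = /(\w+)o(\w+)/g.test(line);
console.log(containsPattern); // true
```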

Published by

MrListerSir

Pim is a programmer who programs programs.
