Escaping regular expression characters in JavaScript
20th January 2006
JavaScript’s support for regular expressions is generally pretty good, but there is one notable omission: an escaping mechanism for literal strings. Say, for example, you need to create a regular expression that removes a specific string from the end of another string. If you know the string you want to remove when you write the script, this is easy:
var newString = oldString.replace(/Remove from end$/, '');
But what if the string to be removed comes from a variable? You’ll need to construct a regular expression from the variable, using the RegExp constructor function:
var re = new RegExp(stringToRemove + '$');
var newString = oldString.replace(re, '');
But what if the string you want to remove may contain regular expression metacharacters—characters like $ or . that affect the behaviour of the expression? Languages such as Python provide functions for escaping these characters (see re.escape); with JavaScript you have to write your own.
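To make the problem concrete, here is a small sketch of what can go wrong when the string is used unescaped. The sample values are made up for illustration and do not come from any real application:

```javascript
// Hypothetical sample values, chosen only to illustrate the problem.
var oldString = 'Total: 1500';
var stringToRemove = '.00';

// Unescaped, "." matches any character, so the pattern /.00$/ matches "500"
// even though the string does not end in a literal ".00".
var re = new RegExp(stringToRemove + '$');
var newString = oldString.replace(re, '');

console.log(newString); // "Total: 1"
```

An escaping function is what prevents this kind of accidental match.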
Here’s mine:
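RegExp.escape = function(text) {
  // Build the regular expression on the first call only and cache it on the function.
  if (!arguments.callee.sRE) {
    var specials = [
      '/', '.', '*', '+', '?', '|', '^', '$',
      '(', ')', '[', ']', '{', '}', '\\'
    ];
    // Produces (\/|\.|\*|...|\\): each special character, backslash-escaped.
    arguments.callee.sRE = new RegExp(
      '(\\' + specials.join('|\\') + ')', 'g'
    );
  }
  // Prefix every matched special character with a backslash.
  return text.replace(arguments.callee.sRE, '\\$1');
};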
This function also deals with another common problem in JavaScript: compiling a regular expression once (rather than every time you use it) while keeping it local to a function. arguments.callee inside a function always refers to the function itself, and since JavaScript functions are objects you can store properties on them. In this case, the first time the function is run it compiles a regular expression and stashes it in the sRE property. On subsequent calls the pre-compiled expression can be reused.
In the above snippet I’ve added my function as a property of the RegExp constructor. There’s no pressing reason to do this other than a desire to keep generic functionality relating to regular expression handling in the same place. If you rename the function it will still work as expected, since the use of arguments.callee eliminates any coupling between the function definition and the rest of the code.
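For code that runs in strict mode, where arguments.callee is not allowed, the same compile-once idea can be sketched with a closure instead. This is only one possible variant, not the function above: the name escapeRegExp and the sample strings are made up for illustration, and the character class is one reasonable choice of metacharacters. Unlike the original, it compiles the expression when the script loads rather than on the first call.

```javascript
// Hypothetical helper: a closure-based variant that avoids arguments.callee.
var escapeRegExp = (function () {
  // Compiled once, when this enclosing function runs, and kept private by the closure.
  var specialsRE = /[\\^$.*+?()[\]{}|\/]/g;
  return function (text) {
    // "$&" stands for the matched character, so each special gets a leading backslash.
    return String(text).replace(specialsRE, '\\$&');
  };
})();

// Back to the original task: remove a literal string from the end of another string.
var stringToRemove = '.00'; // sample value, assumed for illustration
var re = new RegExp(escapeRegExp(stringToRemove) + '$');

var unchanged = 'Total: 1500'.replace(re, '');
var trimmed = 'Total: 15.00'.replace(re, '');

console.log(unchanged); // "Total: 1500" (no literal ".00" at the end)
console.log(trimmed);   // "Total: 15"
```

Because the escaped pattern is \.00$, only a literal ".00" at the end of the string is removed.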