Understanding Regular Expressions (Regex)
Lets look at how you can manipulate strings with regex.
Introduction
Regular expressions, commonly known as regex, are powerful tools we use for pattern matching within strings when we are trying to perform operations on a string.
Some of the useful ways to use it include: search, edit, and manipulate text based on specific patterns, making tasks like validation, searching, and text processing more efficient.
This article aims to explain regex categorically and in simple terms, providing examples and use cases for each category, complete with code examples and results in JavaScript. I created this as reference for me to look on when I need to remember it & I hope it helps you too.
1. Basic Characters and Metacharacters
1.1. Literal Characters
Description: Literal characters are the simplest form of regex. They match exactly what they represent.
Example:
const text = "The cat sat on the mat.";
const pattern = /cat/;
const match = text.match(pattern);
console.log(match ? match[0] : "No match");
Result:
cat
Use Case: Finding specific words in a text.
1.2. Metacharacters
Description: Metacharacters are characters with special meanings in regex.
Examples:
.
(dot): Matches any single character except newline.const text = "The cat sat on the mat."; const pattern = /c.t/; const match = text.match(pattern); console.log(match);
Result:
['cat']
\d
: Matches any digit (0-9).const text = "abc123"; const pattern = /\d/g; const match = text.match(pattern); console.log(match);
Result:
['1', '2', '3']
Use Case: General pattern matching and validation.
2. Character Classes
2.1. Predefined Character Classes
Description: Character classes represent sets of characters.
Examples:
\w
: Matches any word character (alphanumeric + underscore).const text = "hello_123"; const pattern = /\w/g; const match = text.match(pattern); console.log(match);
Result:
['h', 'e', 'l', 'l', 'o', '_', '1', '2', '3']
\s
: Matches any whitespace character.const text = "hello world"; const pattern = /\s/g; const match = text.match(pattern); console.log(match);
Result:
[' ']
Use Case: Validating and parsing text inputs.
2.2. Custom Character Classes
Description: Custom character classes are defined by enclosing characters in square brackets.
Examples:
[aeiou]
: Matches any vowel.const text = "hello world"; const pattern = /[aeiou]/g; const match = text.match(pattern); console.log(match);
Result:
['e', 'o', 'o']
[0-9]
: Matches any digit (same as\d
).const text = "abc123"; const pattern = /[0-9]/g; const match = text.match(pattern); console.log(match);
Result:
['1', '2', '3']
Use Case: Filtering specific sets of characters.
3. Quantifiers
3.1. Basic Quantifiers
Description: Quantifiers specify how many instances of a character or group must be present.
Examples:
*
: Matches 0 or more occurrences.const text = "aaaab"; const pattern = /a*/g; const match = text.match(pattern); console.log(match);
Result:
['aaaa', '']
+
: Matches 1 or more occurrences.const text = "aaaab"; const pattern = /a+/g; const match = text.match(pattern); console.log(match);
Result:
['aaaa']
?
: Matches 0 or 1 occurrence.const text = "abc"; const pattern = /a?/g; const match = text.match(pattern); console.log(match);
Result:
['a', '']
Use Case: Defining flexible patterns in text.
3.2. Range Quantifiers
Description: Range quantifiers specify a specific number of occurrences.
Examples:
{n}
: Matches exactly n occurrences.const text = "aaaab"; const pattern = /a{3}/g; const match = text.match(pattern); console.log(match);
Result:
['aaa']
{n,}
: Matches at least n occurrences.const text = "aaaab"; const pattern = /a{2,}/g; const match = text.match(pattern); console.log(match);
Result:
['aaaa']
{n,m}
: Matches between n and m occurrences.const text = "aaaab"; const pattern = /a{2,4}/g; const match = text.match(pattern); console.log(match);
Result:
['aaaa']
Use Case: Controlling the number of repetitions in patterns.
4. Anchors
4.1. Start and End Anchors
Description: Anchors are used to match positions within a string.
Examples:
^
: Matches the start of a string.const text = "The quick brown fox."; const pattern = /^The/; const match = text.match(pattern); console.log(match);
Result:
['The']
$
: Matches the end of a string.const text = "The quick brown fox."; const pattern = /fox\.$/; const match = text.match(pattern); console.log(match);
Result:
['fox.']
Use Case: Ensuring patterns appear at specific positions.
4.2. Word Boundaries
Description: Word boundaries match positions between a word character and a non-word character.
Examples:
\b
: Matches a word boundary.const text = "The cat sat."; const pattern = /\bcat\b/; const match = text.match(pattern); console.log(match);
Result:
['cat']
\B
: Matches a non-word boundary.const text = "The category."; const pattern = /\Bcat\B/; const match = text.match(pattern); console.log(match);
Result:
['cat']
Use Case: Precise word matching.
5. Groups and Alternations
5.1. Capturing Groups
Description: Groups allow parts of a regex to be treated as a single unit.
Example:
()
: Denotes a capturing group.
Result:const text = "The cat sat on the mat."; const pattern = /(cat)/; const match = text.match(pattern); console.log(match ? match[1] : "No match");
cat
Use Case: Extracting specific parts of a match.
5.2. Alternation
Description: Alternation matches one pattern or another.
Example:
|
: Denotes alternation (OR).
Result:const text = "The cat and the dog."; const pattern = /cat|dog/g; const match = text.match(pattern); console.log(match);
['cat', 'dog']
Use Case: Matching multiple patterns.
Conclusion
Regular expressions are versatile tools for text processing, providing powerful capabilities to match, search, and manipulate strings based on specific patterns.
By understanding and utilizing various regex categories, developers can efficiently handle complex text-based tasks. From simple character matching to advanced pattern definitions, regex offers a wide range of functionalities that are essential in modern programming. By mastering regex, you can significantly enhance your ability to work with text data, making your code more robust and efficient.
Happy Coding