Homelab, Linux, JS & ABAP (~˘▾˘)~
 

[JavaScript] Regular Expressions (regex)

These are my notes while doing the course JavaScript Algorithms and Data Structures on https://www.freecodecamp.org. I highly recommend it if you prefer to try things directly rather than watching videos.


Test Method

let myString = "Hello, World!";
let myRegex = /Hello/;
let result = myRegex.test(myString); //result true

Match Literal Strings

Search for multiple patterns using the alternation (OR) operator |.

let petString = "Max has a pet cat.";
let petRegex = /dog|cat|bird|fish/; 
let result = petRegex.test(petString); //result true

Ignore Case While Matching using the i flag.

let petString = "Max has a pet cat.";
let petRegex = /max/i;
let result = petRegex.test(petString); //result true

Extract Matches

"Hello, World!".match(/Hello/); // Returns ["Hello"]

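Note: The .match() syntax is the “opposite” of the .test() method: .test() is called on the regex and takes the string, while .match() is called on the string and takes the regex. .test() returns true or false; .match() returns the match (or null if there is none).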
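A small sketch of that difference, with variable names made up for illustration:

```javascript
const greeting = "Hello, World!";
const helloRegex = /Hello/;

// .test() lives on the regex and answers yes/no
const found = helloRegex.test(greeting); // true

// .match() lives on the string and returns the matched text in an array
const matched = greeting.match(helloRegex); // first element is "Hello"

// When nothing matches, .match() returns null instead of an array
const missing = greeting.match(/Goodbye/); // null

console.log(found, matched[0], missing);
```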


Find More Than the First Match

Using the g flag.

let testStr = "Repeat, Repeat, Repeat";
let repeatRegex = /Repeat/g;
testStr.match(repeatRegex); // Returns ["Repeat", "Repeat", "Repeat"]

Note: You can have multiple flags on your regex like /search/gi
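To see the two flags working together, here is a quick sketch (the sample string is my own):

```javascript
const mixedCase = "Repeat, repeat, REPEAT";

// g finds every occurrence, i ignores upper/lower case
const allRepeats = mixedCase.match(/repeat/gi); // ["Repeat", "repeat", "REPEAT"]

// Without g, only the first occurrence is returned
const firstOnly = mixedCase.match(/repeat/i); // first element is "Repeat"

console.log(allRepeats, firstOnly[0]);
```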


Match Anything with Wildcard Period

The wildcard character: .

let exampleStr = "Let's have fun with regular expressions!";
let unRegex = /.un/;
let result = unRegex.test(exampleStr); //result true, "fun" matches

Match Single Character with Multiple Possibilities

let bigStr = "big";
let bagStr = "bag";
let bugStr = "bug";
let bogStr = "bog";
let bgRegex = /b[aiu]g/;
bigStr.match(bgRegex); // Returns ["big"]
bagStr.match(bgRegex); // Returns ["bag"]
bugStr.match(bgRegex); // Returns ["bug"]
bogStr.match(bgRegex); // Returns null

Inside a character set, you can define a range of characters to match using a hyphen character (-). For example, to match lowercase letters a through e you would use [a-e], and to match any digit from 0 to 5 you would use [0-5].
To create a negated character set, you place a caret character (^) after the opening bracket and before the characters you do not want to match. For example, /[^aeiou]/gi matches every character that is not a vowel.
Outside of a character set, the caret is used to search for patterns at the beginning of strings: /^firstWord/
You can search the end of strings using the dollar sign character $ at the end of the regex: /lastWord$/
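Ranges, negated sets and the two anchors can be tried out like this. The sample strings are only placeholders I picked:

```javascript
// Range: every lowercase letter from a to e
const aToE = "abcdefg".match(/[a-e]/g); // ["a", "b", "c", "d", "e"]

// Negated set: every character that is not a vowel (spaces and digits included)
const nonVowels = "3 blind mice".match(/[^aeiou]/gi).join(""); // "3 blnd mc"

// Anchors: ^ ties the pattern to the start, $ to the end of the string
const rickyStr = "Ricky is first. Last is Ricky";
const startsWithRicky = /^Ricky/.test(rickyStr); // true
const startsWithFirst = /^first/.test(rickyStr); // false, "first" is in the middle
const endsWithRicky = /Ricky$/.test(rickyStr); // true

console.log(aToE, nonVowels, startsWithRicky, startsWithFirst, endsWithRicky);
```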


Match Characters that Occur Zero|One or More Times

Match a character (or group of characters) that appears:
one or more times in a row using the + character. For example, /a+/g would find a match in "aabc" and return ["aa"].
zero or more times in a row using the * character. For example, /go*/g would find a match in "gooooooal" and return ["goooooo"], but only return ["g"] in "guuuuuuual".
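The examples above as runnable code, plus one contrast of my own: with + instead of * the "guuuuuuual" case no longer matches at all.

```javascript
const plusMatch = "aabc".match(/a+/g); // ["aa"]

const starGoal = "gooooooal".match(/go*/g); // ["goooooo"]
const starGut = "guuuuuuual".match(/go*/g); // ["g"], zero o's is still a match

// + needs at least one "o", so there is nothing to find here
const plusGut = "guuuuuuual".match(/go+/g); // null

console.log(plusMatch, starGoal, starGut, plusGut);
```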


Find Characters with Lazy Matching

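Lazy matching finds the smallest possible part of the string that satisfies the regex pattern. You make a quantifier lazy by adding ? after it. For example, "titanic" matched against the adjusted regex of /t[a-z]*?i/ returns ["ti"], while the greedy /t[a-z]*i/ returns ["titani"].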
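Greedy and lazy side by side, as a short sketch:

```javascript
const ship = "titanic";

// Greedy: * grabs as much as possible, then backs off to the last "i"
const greedy = ship.match(/t[a-z]*i/)[0]; // "titani"

// Lazy: *? grabs as little as possible and stops at the first "i"
const lazy = ship.match(/t[a-z]*?i/)[0]; // "ti"

console.log(greedy, lazy);
```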


Match All Letters and Numbers

The shortcut \w is equal to [A-Za-z0-9_]. This character class matches upper and lowercase letters plus numbers. Note, this character class also includes the underscore character (_).

let quoteSample = "The five boxing wizards jump quickly.";
let alphabetRegexV2 = /\w/g; 
let result = quoteSample.match(alphabetRegexV2).length; // result 31

You can search for the opposite of the \w with \W. This shortcut is the same as [^A-Za-z0-9_].
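For the same quote, \W picks up exactly what \w leaves out (the spaces and the period). A quick check:

```javascript
const quote = "The five boxing wizards jump quickly.";

const wordChars = quote.match(/\w/g); // letters, digits and underscore
const nonWordChars = quote.match(/\W/g); // 5 spaces and 1 period

// Every character is either \w or \W, so the counts add up to the length
console.log(wordChars.length, nonWordChars.length, quote.length);
```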


Match All Numbers

The shortcut to look for digit characters is \d, with a lowercase d. This is equal to the character class [0-9], which looks for a single character of any number between zero and nine.
The shortcut to look for non-digit characters is \D. This is equal to the character class [^0-9].
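A small sketch of both shortcuts (the movie title is just a sample string):

```javascript
const movie = "2001: A Space Odyssey";

const digits = movie.match(/\d/g); // ["2", "0", "0", "1"]
const nonDigits = movie.match(/\D/g); // everything else, including ":" and spaces

console.log(digits.join(""), nonDigits.length);
```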


Exercise: Restrict Possible Usernames

You need to check all the usernames in a database. Here are some simple rules that users have to follow when creating their username.

  1. Usernames can only use alpha-numeric characters.
  2. The only numbers in the username have to be at the end. There can be zero or more of them at the end. Username cannot start with the number.
  3. Username letters can be lowercase and uppercase.
  4. Usernames have to be at least two characters long. A two-character username can only use alphabet letters as characters.
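
let username = "JackOfAllTrades";
// Either: two or more letters plus optional trailing digits,
// or: one letter followed by two or more digits
let userCheck = /^[a-z][a-z]+\d*$|^[a-z]\d{2,}$/i;
let result = userCheck.test(username); //result true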
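To check a pattern against the four rules, it helps to run it over a handful of usernames. The regex below is a compact variant I wrote for this sketch (one letter, then either more letters with optional digits, or at least two digits), and the sample usernames are made up:

```javascript
// Hypothetical compact variant of the username check
const userCheckCompact = /^[a-z]([a-z]+\d*|\d{2,})$/i;

const shouldPass = ["JackOfAllTrades", "AA", "Z97", "A1234", "Oceans11"];
const shouldFail = ["a", "A1", "007", "c57bT3", "J%4", "BadUs3rnam3"];

const passResults = shouldPass.map((name) => userCheckCompact.test(name));
const failResults = shouldFail.map((name) => userCheckCompact.test(name));

console.log(passResults, failResults);
```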

Match Whitespace

You can search for whitespace using \s. It also matches carriage return, tab, form feed, and new line characters, similar to the character class [ \r\t\f\n\v].

let whiteSpace = "Whitespace. Whitespace everywhere!";
let spaceRegex = /\s/g;
whiteSpace.match(spaceRegex); // Returns [" ", " "]

Search for non-whitespace using \S, which is similar to the character class [^ \r\t\f\n\v].
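A short sketch showing that \s catches tabs and newlines too, and that \S counts everything else:

```javascript
// Tab and newline count as whitespace as well
const invisible = "a\tb\nc".match(/\s/g); // ["\t", "\n"]

const spaceSample = "Whitespace. Whitespace everywhere!";
const nonSpaceCount = spaceSample.match(/\S/g).length; // 32
const spaceCount = spaceSample.match(/\s/g).length; // 2

console.log(invisible.length, nonSpaceCount, spaceCount);
```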


Quantity Specifiers

Quantity specifiers use curly brackets ({ and }). You put two numbers between the curly brackets: the lower and upper number of repetitions.
For example, to match only the letter a appearing between 3 and 5 times in the string "aaaah", your regex would be /a{3,5}h/.
To only specify the lower number of repetitions, keep the first number followed by a comma: /a{3,}h/
To specify an exact number of repetitions, just have that one number between the curly brackets: /ha{3}h/
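The three forms can be tried like this. One thing I noticed while testing: without anchors, /a{3,5}h/ still matches inside a longer run of a's, so the sketch includes an anchored version for comparison (that anchored variant is my own addition):

```javascript
const range = /a{3,5}h/;
const rangeOk = range.test("aaaah"); // true, 4 a's
const rangeTooFew = range.test("aah"); // false, only 2 a's
const rangeLongRun = range.test("aaaaaaah"); // true! the last 5 a's + h still match

// Anchoring the pattern makes the upper bound count for the whole string
const rangeAnchored = /^a{3,5}h$/.test("aaaaaaah"); // false

const atLeast = /a{3,}h/.test("aaaaaaah"); // true, 3 or more
const exactOk = /ha{3}h/.test("haaah"); // true, exactly 3
const exactTooMany = /ha{3}h/.test("haaaah"); // false, 4 a's between the h's

console.log(rangeOk, rangeTooFew, rangeLongRun, rangeAnchored, atLeast, exactOk, exactTooMany);
```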


Check for All or None

You can specify the possible existence of an element with a question mark ?

let american = "color";
let british = "colour";
let rainbowRegex= /colou?r/;
rainbowRegex.test(american); // Returns true
rainbowRegex.test(british); // Returns true

Positive and Negative Lookahead

Lookaheads are patterns that tell JavaScript to look ahead in your string to check for patterns further along.
A positive lookahead will look to make sure the element in the search pattern is there, but won’t actually match it. A positive lookahead is used as (?=...)
A negative lookahead will look to make sure the element in the search pattern is not there. A negative lookahead is used as (?!...)
A practical use of lookaheads is to check two or more patterns in one string. Here is a (naively) simple password checker that looks for between 3 and 6 characters and at least one number. Because the pattern is not anchored, the upper limit of 6 is not actually enforced:

let password = "abc123";
let checkPass = /(?=\w{3,6})(?=\D*\d)/;
checkPass.test(password); // Returns true
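Two things are worth trying here: that a lookahead checks without consuming, and what "naive" means for the checker above. The anchored variant at the end is my own sketch, not part of the course material:

```javascript
// Lookaheads check what follows, but the match itself stays just "q"
const positive = "qu".match(/q(?=u)/); // first element is "q"
const negative = "qt".match(/q(?!u)/); // first element is "q"
const negativeFail = "qu".match(/q(?!u)/); // null, because "u" does follow

// The naive checker accepts an 8-character password
const naiveCheck = /(?=\w{3,6})(?=\D*\d)/;
const naiveLong = naiveCheck.test("abcdefg1"); // true

// Assumption: anchoring with ^ and $ to enforce the 3-6 length limit
const strictCheck = /^(?=\w{3,6}$)(?=\D*\d)/;
const strictOk = strictCheck.test("abc123"); // true
const strictLong = strictCheck.test("abcdefg1"); // false, too long
const strictNoDigit = strictCheck.test("abcdef"); // false, no number

console.log(positive[0], negative[0], negativeFail, naiveLong, strictOk, strictLong, strictNoDigit);
```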

Reuse Patterns Using Capture Groups

You can search for repeated substrings using capture groups. Parentheses, ( and ), capture the part of the match they enclose. To specify where that captured substring has to appear again, you use a backslash (\) and then a number. This number starts at 1 and increases with each additional capture group you use.

let repeatStr = "regex regex";
let repeatRegex = /(\w+)\s\1/;
repeatRegex.test(repeatStr); // Returns true
repeatStr.match(repeatRegex); // Returns ["regex regex", "regex"]

Note: Using the .match() method on a string (without the g flag) returns an array with the string it matches, followed by its capture groups.


Use Capture Groups to Search and Replace

Search and replace text in a string using .replace(). The first parameter is the regex pattern you want to search for. The second parameter is the replacement string or a function that returns it. In the replacement string, you can access capture groups with dollar signs ($1, $2, ...).

let str = "one two three";
let fixRegex = /(\w+)\s(\w+)\s(\w+)/; 
let replaceText = "$3 $2 $1"; 
let result = str.replace(fixRegex, replaceText); //result "three two one"
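The second parameter can also be a function. It receives the full match first and then one argument per capture group, and whatever it returns is used as the replacement. A minimal sketch with made-up names:

```javascript
const words = "one two three";

// Same reordering as with "$3 $2 $1", but the function can also transform the groups
const reordered = words.replace(
  /(\w+)\s(\w+)\s(\w+)/,
  (match, first, second, third) => `${third.toUpperCase()} ${second} ${first}`
); // "THREE two one"

// With the g flag the function runs once per match
const capitalized = words.replace(/\w+/g, (word) => word[0].toUpperCase() + word.slice(1)); // "One Two Three"

console.log(reordered, capitalized);
```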

Exercise: Remove Whitespace from Start and End

/* my solution -> selecting the String */
let hello = "   Hello, World!  ";
let wsRegex = /(\s+)(\w+,\s\w+!)(\s+)/i; 
let result = hello.replace(wsRegex, "$2"); 

/* sample solution -> selecting the Whitespace */
let hello = "   Hello, World!  ";
let wsRegex = /^\s+|\s+$/g; 
let result = hello.replace(wsRegex, "");
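Comparing the two approaches on a second string shows why selecting the whitespace is the more general solution: my pattern depends on the "word, word!" shape of the sentence. The built-in trim() is included as a reference; the names below are only for this sketch.

```javascript
const selectString = /(\s+)(\w+,\s\w+!)(\s+)/; // tied to the "Hello, World!" shape
const selectWhitespace = /^\s+|\s+$/g; // works for any content

const padded = "   Hello, World!  ";
const other = "  foo bar  ";

const paddedA = padded.replace(selectString, "$2"); // "Hello, World!"
const paddedB = padded.replace(selectWhitespace, ""); // "Hello, World!"

const otherA = other.replace(selectString, "$2"); // no match, string stays unchanged
const otherB = other.replace(selectWhitespace, ""); // "foo bar"

console.log([paddedA, paddedB, otherA, otherB, other.trim()]);
```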
