Python Pattern Matching With Regular Expressions

Search for a value

Some data types, such as strings, lists, sets, and tuples, let you check whether they contain a value by using the in keyword

Example

temp_list = [1, 2, 3]  # Create a list with elements 1, 2, 3
temp_string = "Hello World!"  # Create a string variable with value "Hello World!"

if 1 in temp_list:  # Check if the number 1 exists in temp_list
    print("Found number 1")  # If True, print this message

if "Hello" in temp_string:  # Check if the substring "Hello" exists in temp_string
    print("Found Hello")  # If True, print this message

Result

Found number 1
Found Hello
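
The same in test also works on sets and tuples, which the example above does not cover. The following is a minimal sketch; the names temp_set and temp_tuple are made up for illustration and are not part of the original example.

```python
# Illustrative values only; these names are not from the original example.
temp_set = {1, 2, 3}
temp_tuple = ("a", "b", "c")

found_in_set = 2 in temp_set            # membership test on a set
found_in_tuple = "b" in temp_tuple      # membership test on a tuple
missing_from_tuple = "z" not in temp_tuple  # "not in" is the negated form

print(found_in_set, found_in_tuple, missing_from_tuple)
```

All three expressions evaluate to True here.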

Check the length

You can use the len function to check the length of a string

Example

mobile = "1112223333"  # Create a string variable representing a mobile number

if len(mobile) == 10:  # Check if the length of the mobile number is exactly 10
    print("Mobile number length is correct")  # If True, print this message

Result

Mobile number length is correct

Check if Numeric

You can either use the .isdecimal method or loop over the string's characters and check each one individually

Example

mobile = "1112223333"  # Create a string variable representing a mobile number

if len(mobile) == 10:  # Check if the mobile number has exactly 10 characters
    print("Mobile number length is valid")  # If True, print this message
    if mobile.isdecimal():  # Check if all characters in the string are decimal digits
        print("Mobile number pattern is valid")  # If True, print this message

Result

Mobile number length is valid
Mobile number pattern is valid
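
To see what .isdecimal rejects, here is a small sketch that runs it on a few sample strings. The sample values are assumptions chosen for illustration; only the first one comes from the example above.

```python
# Sample inputs; everything except the first value is made up for illustration.
samples = ["1112223333", "111-222-3333", "111 222 3333", ""]

# str.isdecimal() is True only when the string is non-empty
# and every character is a decimal digit.
results = {sample: sample.isdecimal() for sample in samples}

for sample, is_decimal in results.items():
    print(repr(sample), is_decimal)
```

Only the first sample passes: dashes, spaces, and the empty string all make .isdecimal return False.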

Or you can loop over each character and check whether it is a digit

Example

mobile = "1112223333"  # Create a string variable representing a mobile number
numbers = "1234567890"  # String containing all valid numeric digits

if len(mobile) == 10:  # Check if mobile number has exactly 10 characters
    print("Mobile number length is valid")  # Output message if length is valid
    for character in mobile:  # Loop through each character in the mobile number
        if character in numbers:  # Check if the character is a valid digit
            print(character + " is valid")  # Print a message for each valid character

Result

Mobile number length is valid
1 is valid
1 is valid
1 is valid
2 is valid
2 is valid
2 is valid
3 is valid
3 is valid
3 is valid
3 is valid
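
The loop above only reports the characters that pass. A possible variation, sketched below, collects the characters that fail so the whole number can be accepted or rejected. The input containing an "x" and the name invalid are assumptions made for this illustration.

```python
mobile = "11122x3333"   # deliberately contains one non-digit (illustrative input)
numbers = "1234567890"

# Collect every character that is not one of the allowed digits.
invalid = [character for character in mobile if character not in numbers]

if invalid:
    print("Invalid characters:", invalid)
else:
    print("Mobile number pattern is valid")
```

An empty invalid list means every character was a digit.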

Check by index

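You can also use indexing and slicing to check a specific character or substring. Note that a slice such as mobile[0:3] includes index 0 up to, but not including, index 3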

Example

mobile = "111-222-3333"  # Create a string variable representing a mobile number in the format XXX-XXX-XXXX

if len(mobile) == 12:  # Check if the total length is 12 characters (including dashes)
    if mobile[3] == "-" and mobile[7] == "-":  # Check if the 4th and 8th characters are dashes
        # Check if the number parts are all digits: first three, middle three, last four
        if mobile[0:3].isdecimal() and mobile[4:7].isdecimal() and mobile[8:12].isdecimal():
            print("Mobile number is valid")  # If all conditions are met, print this message

Result

Mobile number is valid
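
The index checks above can be wrapped in a function so they can be reused on several inputs. This is only a sketch: the function name is_valid_mobile is hypothetical, and it assumes the XXX-XXX-XXXX format used in the example.

```python
def is_valid_mobile(mobile):
    """Hypothetical helper: True if mobile looks like XXX-XXX-XXXX."""
    if len(mobile) != 12:
        return False
    if mobile[3] != "-" or mobile[7] != "-":
        return False
    # Slice ends are exclusive, so these cover indexes 0-2, 4-6, and 8-11.
    parts = (mobile[0:3], mobile[4:7], mobile[8:12])
    return all(part.isdecimal() for part in parts)


print(is_valid_mobile("111-222-3333"))
print(is_valid_mobile("111-222-333a"))
print(is_valid_mobile("1112223333"))
```

Only the first call returns True; the second has a letter in the last group and the third has the wrong length.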

Regex

Regex, or regular expression, is a language for finding a particular string based on a search pattern.

  • Characters
    • \d matches 0 to 9
      • \d\d\d\d with 1234567 returns 1234
      • \d+ with 1234567 returns 1234567
    • \w matches word character A to Z, a to z, 0 to 9, and _
      • \w\w with Hello! returns He, and ll
      • \w+ with Hello! returns Hello
    • \s matches white space character
    • . matches any character except line break
      • . with car returns c, a, and r
      • .* with car returns car
  • Character classes
    • [ ] for matching characters within the brackets
      • [abcd] matches a, b, c, or d
      • [a-d] matches a, b, c, or d (The - means to)
      • [^abcd] matches anything except a, b, c, or d (The ^ means negated character class)
      • [^a-d] matches anything except a, b, c, or d (The - means to, and ^ means negated character class)
  • Quantifiers
    • + one or more
      • [1-2] with 112233 returns 1, 1, 2, 2
      • [1-2]+ with 112233 returns 1122
    • * zero or more
      • 1*2* with 112233 returns 1122
    • {n} matches exactly n times, for example {2} matches 2 times
      • 1{4} with 111111 returns 1111
  • Boundaries
    • ^ start of string
    • $ end of string
  • Normal
    • 123456 with 123456789 returns 123456
    • abcdef with abcdefghijklmnopqrstuvwxyz returns abcdef
    • Escape special characters using \, for example \. matches a literal period
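
Several of the patterns listed above can be checked with re.findall, which returns every non-overlapping match as a list. The sketch below reuses the sample texts from the list where one was given; the texts for [^a-d] and \. are assumptions added for illustration.

```python
import re

# Each entry pairs a pattern with a sample text.
examples = [
    (r"\d\d\d\d", "1234567"),
    (r"\d+", "1234567"),
    (r"\w\w", "Hello!"),
    (r"\w+", "Hello!"),
    (r".", "car"),
    (r"[1-2]", "112233"),
    (r"[1-2]+", "112233"),
    (r"1{4}", "111111"),
    (r"[^a-d]", "abcdef"),  # sample text chosen for illustration
    (r"\.", "a.b"),         # escaped dot matches only a literal period
]

# Map each pattern to the list of matches found in its sample text.
results = {pattern: re.findall(pattern, text) for pattern, text in examples}

for pattern, matches in results.items():
    print(pattern, "->", matches)
```

The patterns are written as raw strings (the r prefix) so that Python passes the backslashes to the regex engine unchanged.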

Importing Regex (re) Module

To use Python's built-in regex module, named re, you first need to load it with the import statement

Example

import re  # Import Python's built-in regular expression (regex) module
print(dir(re))  # Print a list of all attributes, functions, and classes available in the 're' module

Result

['A', 'ASCII', 'DEBUG', 'DOTALL', 'I', 'IGNORECASE', 'L', 'LOCALE', 'M', 'MULTILINE', 'Match', 'Pattern', 'RegexFlag', 'S', 'Scanner', 'T', 'TEMPLATE', 'U', 'UNICODE', 'VERBOSE', 'X', '_MAXCACHE', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__version__', '_cache', '_compile', '_compile_repl', '_expand', '_locale', '_pickle', '_special_chars_map', '_subx', 'compile', 'copyreg', 'enum', 'error', 'escape', 'findall', 'finditer', 'fullmatch', 'functools', 'match', 'purge', 'search', 'split', 'sre_compile', 'sre_parse', 'sub', 'subn', 'template']

Regex (.search)

You can use the search function of the re module to look for a regex pattern anywhere in a string. It returns a match object if the pattern is found and None otherwise

Example

import re  # Import the regular expression module

mobile = "111-222-3333"  # Create a string variable representing a mobile number

# The r prefix makes this a raw string, so the backslashes reach the regex engine unchanged
if re.search(r"\d\d\d-\d\d\d-\d\d\d\d", mobile):  # Search for the pattern XXX-XXX-XXXX using regex
    print("Mobile number is valid")  # Print this message if the pattern is found

Result

Mobile number is valid
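
Because re.search looks for the pattern anywhere in the string, it also succeeds when the number is surrounded by other characters. If the whole string must match, re.fullmatch is one option. The sketch below compares the two; the candidate strings are made up for illustration, and the pattern uses the {n} quantifier as a shorter equivalent of the one above.

```python
import re

pattern = r"\d{3}-\d{3}-\d{4}"  # same format as above: XXX-XXX-XXXX

# Illustrative inputs: an exact number, one embedded in text, one with an extra digit.
candidates = ["111-222-3333", "call 111-222-3333 now", "111-222-33334"]

comparison = {}
for text in candidates:
    found = re.search(pattern, text) is not None      # pattern occurs somewhere
    exact = re.fullmatch(pattern, text) is not None   # entire string matches
    comparison[text] = (found, exact)
    print(f"{text!r}: search={found}, fullmatch={exact}")
```

All three candidates pass re.search, but only the first passes re.fullmatch.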