This example demonstrates how you can generate a regular expression that matches two or more unrelated bits of text in a file, regardless of where or how many times each bit of text occurs in the file. The bits of text are complex, meaning multiple RegexMagic patterns must be placed in sequence to match them. You can find this example as “Fields: alternation sequence” in the RegexMagic library.
For this example, we’ll create a regex that finds all numbers and all email addresses in a file, with the numbers between square brackets and the email addresses between angle brackets. The regex matches should include the delimiters. To do this, we create a regex that matches a number between square brackets or an email address between angle brackets. When you repeat this regex using the “find all” command in an application or programming language, you will get a list of all the numbers and email addresses with their delimiters.
My favorite number is [42]. You can email me at <joe@fortytwo.com>. Other nice numbers are [17], [382], and [794]. <joefortytwo@gmail.com> is my alternative email address.
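Mark the opening square bracket in the sample as a field. RegexMagic automatically detects the correct “literal text” pattern for this field.
Mark the number between the square brackets as a field. RegexMagic automatically detects the correct “integer” pattern for this field.
Mark the closing square bracket as a field. Again we get the “literal text” pattern we want.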
Mark the opening angle bracket as an alternative to the fields marked so far. RegexMagic adds a new field 1 with “kind of field” set to “alternation”. The first alternative is a new field 2, with “kind of field” set to “sequence”. The 3 fields we marked previously are placed under the sequence field, with their field numbers changed to 3, 4, and 5. The new field to match the angle bracket is field 6, which is added as the second alternative under field 1.
When we mark the email address as the next field, RegexMagic assumes that we always want the email address to follow immediately after the angle bracket. To accomplish this, RegexMagic adds a new field 6, with “kind of field” set to “sequence”. The field for the angle bracket becomes field 7 as the first field in the sequence. Field 8 is the new field for the email address.
Finally, mark a field for the closing angle bracket. Again, RegexMagic assumes that we always want the angle bracket to follow the email address, so field 9 becomes the third field under sequence field 6.
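The field tree built in the steps above is an alternation whose two alternatives are sequences. As a rough illustration of that structure, and not of how RegexMagic generates its regex internally, the Python sketch below assembles an equivalent pattern from its parts. The helper names `sequence` and `alternation` are hypothetical, and the email pattern is a simplified stand-in rather than RegexMagic's own email pattern.

```python
import re

# Fields 3 to 5: opening bracket, integer, closing bracket.
number_fields = [r"\[", r"[0-9]+", r"\]"]

# Fields 7 to 9: opening angle bracket, email address, closing angle bracket.
# Assumption: a deliberately simplified email pattern, for illustration only.
email_fields = ["<", r"[^\s<>@]+@[^\s<>@]+", ">"]


def sequence(fields):
    """A sequence field matches its child fields one after another."""
    return "".join(fields)


def alternation(alternatives):
    """An alternation field matches exactly one of its alternatives."""
    return "(?:" + "|".join(alternatives) + ")"


# Field 1 (alternation) over field 2 and field 6 (both sequences).
regex = alternation([sequence(number_fields), sequence(email_fields)])
print(regex)
```

Because each alternative is a complete sequence, a match must be either a whole bracketed number or a whole bracketed email address. Mixed forms such as `<42>` do not match.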
# 1. One of the fields 2 to 6
# 2. Fields 3 to 5 in sequence
# 3. Literal text
\[
# 4. Integer
[0-9]+
# 5. Literal text
\]
|
# 6. Fields 7 to 9 in sequence
# 7. Literal text
<
# 8. Email address
[!#$%&'*+./0-9=?_`a-z{|}~^-]+@[.0-9a-z-]+\.[a-z]{2,63}
# 9. Literal text
>
Required options: Case insensitive; Free-spacing.
Unused options: Dot doesn’t match line breaks; ^$ don’t match at line breaks; Numbered capture.
My favorite number is [42]. You can email me at <joe@fortytwo.com>. Other nice numbers are [17], [382], and [794]. <joefortytwo@gmail.com> is my alternative email address.
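To see what “find all” returns for the sample text, here is a minimal Python sketch. It assumes Python's `re` module as the target flavor, which is only one of many possible targets. The two required options map to `re.IGNORECASE` and `re.VERBOSE`. The names `SAMPLE`, `PATTERN`, and `matches` are chosen for this illustration.

```python
import re

SAMPLE = (
    "My favorite number is [42]. You can email me at <joe@fortytwo.com>. "
    "Other nice numbers are [17], [382], and [794]. "
    "<joefortytwo@gmail.com> is my alternative email address."
)

# The generated regex, kept in free-spacing form with its field comments.
PATTERN = re.compile(
    r"""
    # 1. One of the fields 2 to 6
    # 2. Fields 3 to 5 in sequence
    \[          # 3. Literal text
    [0-9]+      # 4. Integer
    \]          # 5. Literal text
    |
    # 6. Fields 7 to 9 in sequence
    <           # 7. Literal text
    # 8. Email address
    [!#$%&'*+./0-9=?_`a-z{|}~^-]+@[.0-9a-z-]+\.[a-z]{2,63}
    >           # 9. Literal text
    """,
    re.VERBOSE | re.IGNORECASE,  # Free-spacing; Case insensitive
)

# "Find all": collect every overall match, delimiters included.
matches = [m.group(0) for m in PATTERN.finditer(SAMPLE)]
for match in matches:
    print(match)
```

The matches come back in the order they occur in the text, so numbers and email addresses are interleaved. A caller that needs to tell them apart can check the first character of each match.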