
Regular Expression

This statement is used to match a regular expression against a field.

Regular expressions are based on the ICU implementation, which is described here.

You specify a source and a destination field. The same field may be specified for both if desired. The source and destination fields must both be metadata fields or named variables. If named variables are specified, the fields may contain any of the escape sequences described in Escape Sequences. When named variables are specified, the statement is executed only once, regardless of the execution mode.

The Regular expression field must resolve at runtime to a non-empty string.

The Replace template field contains the template used when replacing a matched pattern.

In both the Regular expression and Replace template fields, the following escape sequences will have their appropriate values substituted prior to compiling the regular expression:

\L
disable escape sequence replacement
\v#
insert the contents of a track variable
\<name>
insert the contents of a named variable

Regular expression templates only process the \\ and \$ escape sequences. To make templates easier to use, the statement will also escape the following regular expression sequences in the Replace template field:

\n
newline character
\r
carriage return
\t
tab character
\uhhhh
unicode character represented by four hexadecimal digits

All other standard escape sequences will be ignored and passed through to the regular expression parser.

Typically, all regular expression metacharacters which may be inserted by the substitution will be properly escaped so that the inserted text is treated as a sequence of literal characters. If, for any reason, you want to save actual regular expression sequences to a variable, you must tell the application that you do not want the inserted variable content to be escaped. Both the Regular expression and Replace template fields have an associated option called Do not escape inserted variable contents. The term variable applies to both track and named variables.

Assume you are matching a sequence of three digits and you want to specify that the regular expression is in Variable 1. If you assign Variable 1 via:

Set Variable 1 to (\d{3})

you will not get the desired result, because Yate will insert the date for the \d sequence.

Set Variable 1 to \L(\d{3})

will work as escaping will have been disabled.

At this point Variable 1 will contain the literal \d{3} sequence. If you specify the regular expression as \v1 you must enable the associated Do not escape inserted variable contents option in order to avoid escaping the already valid escape sequence after insertion.
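The effect of the Do not escape inserted variable contents option can be sketched outside the application. The following is a minimal illustration only: it uses Python's re module rather than ICU, re.escape stands in for whatever escaping the application performs, and build_pattern and the variables dictionary are hypothetical names that do not exist in Yate.

```python
import re

def build_pattern(template, variables, escape_inserted=True):
    """Replace each \\v# placeholder with the contents of a variable.

    escape_inserted=True mimics the default behaviour (inserted text is
    treated as literal characters); False mimics enabling the
    "Do not escape inserted variable contents" option. Hypothetical helper.
    """
    def substitute(match):
        value = variables[match.group(1)]
        return re.escape(value) if escape_inserted else value

    # A function replacement is inserted verbatim, with no further
    # backslash processing by re.sub.
    return re.sub(r"\\v(\d+)", substitute, template)

# Variable 1 holds the literal text (\d{3}), as after: Set Variable 1 to \L(\d{3})
variables = {"1": r"(\d{3})"}

escaped = build_pattern(r"\v1", variables, escape_inserted=True)
raw = build_pattern(r"\v1", variables, escape_inserted=False)

text = "Track 123"
print(re.search(escaped, text))       # None: the pattern only matches the literal text (\d{3})
print(re.search(raw, text).group(1))  # 123: the variable contents act as a regular expression
```

With escaping left on, the inserted text can only match itself character for character. With escaping turned off, the stored sequence is compiled as part of the regular expression.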

You specify whether you want to match all occurrences, only the first or only the last.

You can select case insensitivity.

There are four functions available:

Replace
The matches are replaced with the evaluated Replace template. The destination field will contain either the initial or modified source field.
Return Matches
The matches are returned in the destination. If more than one match was made, the returned matches will be separated by the default list delimiter (\~). This function is useful when you want to extract information from a field.
Return Ranges
Each match is returned as a range specified as location,length. Note that the locations are relative to the source string after variable escape sequences have been replaced. If more than one match was made, the returned ranges will be separated by the default list delimiter (\~).
Return Ranges+
This is the same as Return Ranges, except that capture group ranges are also returned. For every displayed range the format is:

    range of match{/range of capture group 1}{/range of capture group 2}...
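The three Return functions can be approximated as follows. This is a rough sketch under several assumptions: Python's re module is used in place of ICU, so offsets are counted in Python string positions; DELIMITER is a hypothetical stand-in for the default list delimiter; capture groups that did not participate in a match are simply skipped, which the description above does not specify; and the function names are illustrative only.

```python
import re

DELIMITER = ";"  # hypothetical stand-in for the default list delimiter (\~)

def format_range(start, end):
    """Format a span as location,length."""
    return f"{start},{end - start}"

def return_matches(pattern, source, flags=0):
    """Join the text of every match with the list delimiter."""
    return DELIMITER.join(m.group(0) for m in re.finditer(pattern, source, flags))

def return_ranges(pattern, source, with_groups=False, flags=0):
    """Join the range of every match; with_groups=True mimics Return Ranges+."""
    items = []
    for m in re.finditer(pattern, source, flags):
        parts = [format_range(m.start(), m.end())]
        if with_groups:
            for g in range(1, m.re.groups + 1):
                if m.start(g) != -1:  # assumption: skip groups that did not participate
                    parts.append(format_range(m.start(g), m.end(g)))
        items.append("/".join(parts))
    return DELIMITER.join(items)

source = "10:45 and 23:07"
pattern = r"(\d{2}):(\d{2})"

print(return_matches(pattern, source))                    # 10:45;23:07
print(return_ranges(pattern, source))                     # 0,5;10,5
print(return_ranges(pattern, source, with_groups=True))   # 0,5/0,2/3,2;10,5/10,2/13,2
```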

You can test the validity of the regular expression field via the Validate button. This simply tests if the regular expression can be compiled. Any referenced track or named variables are treated as being a single space.

You can test the regular expression against sample text and variable contents via the preview button. A panel will be displayed which allows you to test and set all values except for the source and destination of this statement. Note that the test values are retained as long as the action is open. More information on the regular expression test panel can be found here.

The Regular expression and Template fields have a context menu submenu to assist in the entry of relevant escape sequences. Note that only the most common sequences are displayed.

When the function is Replace, the action test state will be set to true if the destination is different from the source and false if they are the same. The other three functions set it to reflect whether matches or ranges were returned.
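The Replace function, the occurrence selection, case insensitivity and the resulting test state can be sketched together. This is an approximation under stated assumptions: it uses Python's re module, so the template follows Python's syntax (for example \1) rather than the ICU template format; "last" is taken to mean the final match found in a left-to-right scan; and regex_replace is a hypothetical name.

```python
import re

def regex_replace(pattern, template, source, occurrences="all", ignore_case=False):
    """Replace all, the first or the last match and report whether anything changed.

    Returns (destination, test_state), where test_state is True only when
    the destination differs from the source. Hypothetical helper.
    """
    flags = re.IGNORECASE if ignore_case else 0
    matches = list(re.finditer(pattern, source, flags))

    if occurrences == "first":
        matches = matches[:1]
    elif occurrences == "last":
        matches = matches[-1:]
    elif occurrences != "all":
        raise ValueError("occurrences must be 'all', 'first' or 'last'")

    # Rebuild the string, expanding the template for each selected match.
    pieces = []
    cursor = 0
    for m in matches:
        pieces.append(source[cursor:m.start()])
        pieces.append(m.expand(template))
        cursor = m.end()
    pieces.append(source[cursor:])

    destination = "".join(pieces)
    return destination, destination != source

print(regex_replace(r"feat\.", "featuring", "A Feat. B feat. C", "last", ignore_case=True))
# ('A Feat. B featuring C', True)
print(regex_replace(r"xyz", "q", "abc"))
# ('abc', False)
```

Note that a replacement which reproduces the matched text exactly would also leave the test state false, since only the final strings are compared.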

Note that the Replace statement can also replace via a regular expression. It has far fewer options, but it can process multiple fields at a time.



More information on regular expressions may be found at:

Regular Expression Metacharacters

Regular Expression Operators

Regular Expression Replace Template Format

Regular Expression Flag Options

Information on alternate means of parsing or scanning:

File to Tag From Content

Find and Remove

Replace

Scanner

List Statements