File to Tag Template Editor

The Settings-File to Tag template editor is used to create and modify file to tag templates.

Tabs and newline characters may be used to provide formatting to make it easier to read the template. They are ignored when deploying the template. Space characters are meaningful and are displayed as centered-dot characters to make it easier to see them. The Format button can be used to apply formatting to the current template. If for any reason you want to remove the formatting and display the template as a single line of text, hold down the Shift key and click on the Format button.

There are two types of items in the File to Tag templates: text and tokens. Text sequences represent patterns used to separate fields. Fields are represented as tokens.

Tokens are inserted via the Add Token button or via the context menu. Holding down the Control key while entering an alphabetic character will display an abbreviated menu containing only those menu items beginning with the specified character.

Most tokens correspond to Yate fields. The ❨Movement Number - Roman❩ token will attempt to extract a Roman numeral, convert it to an integer and save it to the Movement Number field.
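The Roman numeral conversion described above can be illustrated with a short Python sketch. This is not Yate's implementation; the function name roman_to_int is hypothetical and the sketch does not check that the numeral is well formed.

```python
# Hypothetical helper illustrating Roman numeral to integer conversion.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(text: str) -> int:
    """Convert a Roman numeral such as 'IV' or 'xii' to an integer."""
    total = 0
    previous = 0
    # Walk right to left: a smaller value before a larger one is subtracted.
    for char in reversed(text.strip().upper()):
        value = ROMAN_VALUES[char]  # raises KeyError for non-Roman characters
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total
```

For instance, a filename fragment of IV would yield a Movement Number of 4 under this sketch.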

You may have filenames which specify a disc-track sequence as ddtt or dtt where dd or d is a disc number and tt is a track number. The token named ❨{Disc}Track❩ can handle this sequence. If exactly three or four decimal digits are present, the extraction will be split and directed to the Disc and Track fields. If not, ❨{Disc}Track❩ is treated as ❨Track❩.
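As a rough illustration of the ddtt/dtt rule, the following Python sketch splits a three or four digit sequence into disc and track. The function name split_disc_track is hypothetical, and the fallback simply returns the text as the Track value; how Yate handles the fallback internally is an assumption here.

```python
import re


def split_disc_track(text: str) -> dict:
    """Split 'dtt' or 'ddtt' into Disc and Track; otherwise treat as Track."""
    digits = text.strip()
    # Exactly three or four decimal digits: last two are the track number.
    if re.fullmatch(r"[0-9]{3,4}", digits):
        return {"Disc": int(digits[:-2]), "Track": int(digits[-2:])}
    # Anything else is handled like a plain Track token (assumed behavior).
    return {"Track": digits}
```

Under this sketch, 105 becomes disc 1, track 5, and 0312 becomes disc 3, track 12, while 07 is left as a track value only.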

The font size can be changed by Font>Bigger (⌘+) and Font>Smaller (⌘-).

You can preview and optionally apply the template against the current main window file selection via the Preview button. When designing a template it may not be convenient to have all potential representations of filenames available in the main window. If the Shift key is pressed when Preview is selected, the preview will be performed against test data that you enter. Every text line in the test data which is an absolute path to an audio file will be processed. Note that the specified audio paths do not have to represent existing files.

Each template has a unique identifying name and a field describing how a file's name is decomposed into specific tag fields. The template field consists of text and tokens. The text supplies stop points for an extracted field. The bulk of the tokens describe the name of the field where the extracted data will be placed.

Most sequences consist of a token followed by stop text. eg.

❨Track❩-❨Title❩

will correctly extract information from the following filenames:

01-name
1 - name
02- name

As there is no terminating text specified after the ❨Title❩ token, all text after the - is saved in ❨Title❩.

Once the scanning process reaches the end of all available text, nothing else is saved. If a filename of 01- was processed by the ❨Track❩-❨Title❩ template, the Title field would not be modified.

When scanning for text, leading and trailing spaces are ignored. However, specified spaces are significant. If you wish to extract a field terminated by a dash (-) or by space dash space, do not specify the spaces in the text delimiter. eg. with a template of ❨Track❩-❨Title❩, the following filenames will all be decomposed as expected:

01-name
1 - name
02- name

with a template of ❨Track❩ - ❨Title❩, only the second filename (1 - name) will be decomposed as expected.
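The effect of including or omitting spaces in the delimiter can be sketched in Python. The function scan_two_fields is a hypothetical stand-in for a Track/Title template with a literal delimiter. Returning None when the delimiter is not found is a simplification of this sketch, not a statement about what Yate does in that case.

```python
def scan_two_fields(name: str, delimiter: str):
    """Split name at the first delimiter and trim spaces around each field."""
    head, found, tail = name.partition(delimiter)
    if not found:
        return None  # simplification: report "no match"
    # Leading and trailing spaces of an extracted field are ignored.
    return {"Track": head.strip(), "Title": tail.strip()}


names = ["01-name", "1 - name", "02- name"]
# Delimiter "-" matches all three names; " - " matches only "1 - name".
dash_only = [scan_two_fields(n, "-") for n in names]
spaced = [scan_two_fields(n, " - ") for n in names]
```

With the bare dash all three names yield a track and a title of name; with the spaced delimiter only the second name matches.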

You can specify that a section of the name be ignored by using the Ignore token. For example if your filenames have the format track-year-title and you only wish to extract the track and title, the following template will work: ❨Track❩-❨Ignore❩-❨Title❩. Note that when a token stream ends with text, Ignore is assumed for anything following the trailing text. So, ❨Track❩- will only extract the Track field. Ignore is also assumed when text appears at the start of the template. A template of -❨Title❩ will correctly extract the Title from a filename of 1-name.

Up until now the text delimiters represent a string. There are two special forms which can be used to skip characters from a supplied set. When a text pattern is specified as \*set, any character found in set will stop a field definition. The set consists of all characters after the \* sequence until the next token or formatting character. For example, if some of your filenames are track title and some are track-title, the template ❨Track❩\* -❨Title❩ will handle both. Note that there is a space between the * and -. This form will skip all occurrences of any of the characters in set. The previous template will also handle filenames formatted as track--title.

The \? form is almost identical to \* except that at most one non space character in set will be processed. The \? form would return an empty title with a template of ❨Track❩\? -❨Title❩ and a filename of track--title.
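The \* form can be approximated in Python with a regular expression character class. This sketch only models \* (skip every consecutive character found in the set); it does not attempt to model \?. The function name scan_with_set is hypothetical.

```python
import re


def scan_with_set(name: str, stop_set: str):
    """Stop the first field at any run of characters taken from stop_set."""
    # Build a character class such as [\ \-]+ from the set " -".
    pattern = "[" + re.escape(stop_set) + "]+"
    parts = re.split(pattern, name, maxsplit=1)
    if len(parts) != 2:
        return None  # no character from the set was found
    return {"Track": parts[0], "Title": parts[1]}
```

With a set consisting of a space and a dash, the names 03 title, 03-title and 03--title all produce a track of 03 and a title of title in this sketch.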

There is a third special escape sequence which is used to specify a centered-dot character. You cannot enter a centered-dot character unescaped as it is used to represent spaces for visibility reasons. The escape sequence is \. (backslash-period). As opposed to the \* and \? sequences, \. can appear anywhere in a text sequence.

If you are using File to Tag templates with a File to Tag from Content statement you can scan any text. As a convenience the escape sequences \n (newline) and \t (tab) are supported.

You can extract information from any folder in the full path to the file. The Folder Start token specifies the start of folder processing. Successive Folder Start tokens may be used to process folders higher up the path. Every phase of the File to Tag process works with a source string which is a portion of the full path to a file. Successive phases select path components from the right to the left.

eg. The full path to a file is: /users/me/artist-name/(2016) album-name/02-filename.mp3 and the template is:

❨Track❩\?- ❨Title❩❨Folder Start❩(❨Year❩) ❨Album❩❨Folder Start❩❨Artist❩

At the start of the template all tokens and text apply to the last path component, the filename, with the filename extension removed. This section can be empty. In this example the source string will be 02-filename
❨Track❩\?- ❨Title❩: The Track field will be set to 02 and the Title field will be set to filename.
❨Folder Start❩: /02-filename.mp3 is removed from the path and the source becomes (2016) album-name.
(❨Year❩) ❨Album❩: The Year field will be set to 2016 and the Album field will be set to album-name.
❨Folder Start❩: /(2016) album-name is removed from the path and artist-name becomes the source.
❨Artist❩: The Artist field is set to artist-name.
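
The right-to-left selection of path components in the walk-through above can be sketched in Python. The function names are hypothetical, and each phase is hard-coded with simple string operations rather than a real template engine, so this only mirrors the example path and its three phases.

```python
import re
from pathlib import PurePosixPath


def sources_right_to_left(full_path: str) -> list:
    """Return the filename stem followed by folder names, nearest first."""
    path = PurePosixPath(full_path)
    # The root "/" has an empty name and is skipped.
    return [path.stem] + [p.name for p in path.parents if p.name]


def extract_fields(full_path: str) -> dict:
    sources = sources_right_to_left(full_path)
    fields = {}
    # Phase 1: filename without extension, split at the first dash.
    track, dash, title = sources[0].partition("-")
    if dash:
        fields["Track"] = track.strip()
        fields["Title"] = title.strip()
    # Phase 2: first folder up, expected to look like "(year) album".
    if len(sources) > 1:
        match = re.fullmatch(r"\((.*?)\)(.*)", sources[1])
        if match:
            fields["Year"] = match.group(1).strip()
            fields["Album"] = match.group(2).strip()
    # Phase 3: second folder up, used whole as the artist.
    if len(sources) > 2:
        fields["Artist"] = sources[2]
    return fields
```

Calling extract_fields on the example path gives the same five field values as the walk-through.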

Note that the filename section of a template and any Folder Start section, except for the last, can be empty. A template must have at least one metadata token. There has to be the potential of extracting something.

If you specify too many Folder Start tokens, they will be ignored. ie. if there is no corresponding path component, the extraction is stopped.

At this point everything discussed leads to a straightforward left to right scan where no decisions are made. There are four types of qualifier tokens which may be specified: Required, Optional, Numeric and Not Empty. Qualifiers may appear anywhere in the template with the following rules:

All qualifiers are reset by a Folder Start token.
All qualifiers ending in Once are reset after processing a text pattern.
A qualifier cannot be used to separate text items.

Required

The Required token states that any subsequent text pattern which cannot be matched should cancel all processing and ensure that no fields are modified. The Not Required token resets the required state to not required. The Required Once token enables the required state until the next text item has been processed.

❨Required❩❨Disc❩.❨Track❩ ❨Not Required❩❨Title❩ [❨Year❩]

The above example will not extract anything if the Disc and Track components cannot be located. However, the Year component is not required. If Year cannot be located, Title will be set to everything after the Track sequence. Note that the template could have been expressed as follows with no difference. The required state is only evaluated when a text match fails.

❨Required❩❨Disc❩.❨Track❩ ❨Title❩❨Not Required❩ [❨Year❩]

The Once form is convenient when you only want to mark one text match as required. The following two templates are equivalent:

❨Required❩❨Track❩-❨Not Required❩❨Title❩ [❨Year❩]
❨Track❩❨Required Once❩-❨Title❩ [❨Year❩]
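
The Required behavior can be sketched in Python as a left to right scan over (field, stop text, required) steps. The scan function and the step representation are hypothetical; they are one way to model the rules described above, not Yate's internals.

```python
def scan(name: str, steps: list) -> dict:
    """Scan name left to right; each step is (field, stop_text, required)."""
    fields = {}
    rest = name
    for field, stop_text, required in steps:
        head, found, tail = rest.partition(stop_text)
        if not found:
            if required:
                return {}  # required text missing: modify nothing
            # Not required: the field takes everything that is left.
            fields[field] = rest.strip()
            break
        fields[field] = head.strip()
        rest = tail
    return fields


# Models: Required, Disc "." Track " " Not Required, Title " [" Year "]"
STEPS = [
    ("Disc", ".", True),
    ("Track", " ", True),
    ("Title", " [", False),
    ("Year", "]", False),
]
```

In this sketch, 1.05 Song [1999] fills all four fields, 1.05 Song sets Title to Song with no Year, and Song extracts nothing because the required period is missing.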

Optional

The Optional qualifiers are used to mark a text pattern as optional. The same three forms are available: Optional, Not Optional and Optional Once.

❨Track❩❨Optional Once❩-❨Title❩

The above example will correctly extract from each of the following filenames:

01-name
name

An additional Optional qualifier is available. While Optional and Optional Once make a preceding token optional, Optional Stop makes all subsequent tokens optional.

❨Title❩❨Optional Stop❩ - ❨Artist❩ - ❨Genre❩

The above example will correctly extract from each of the following filenames:

name
name - artistName
name - artistName - genre

The item name is matched because Optional Stop causes all analysis to end when matching of the space-dash-space delimiter fails. The Title field is assigned the remaining text, which is name.
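
The outcome of the Optional Stop example can be approximated in Python by splitting on the delimiter and assigning as many fields as there are parts. The function scan_optional_stop is hypothetical and only reproduces the results for this particular template.

```python
def scan_optional_stop(name: str,
                       fields=("Title", "Artist", "Genre"),
                       delimiter=" - ") -> dict:
    """Assign fields in order; stop quietly when the delimiter runs out."""
    # Split at most len(fields) - 1 times so the last field keeps the rest.
    parts = name.split(delimiter, len(fields) - 1)
    return {field: part.strip() for field, part in zip(fields, parts)}
```

A name without any delimiter yields only a Title, one delimiter yields Title and Artist, and two yield all three fields.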

Numeric

An attempt is always made to coerce the following fields to integer values: BPM, Disc, Disc Count, Episode, Movement Number, Season, Track and Track Count. However, the actual text being processed may not be completely numeric. For example, the filename 2 files - Sample.mp3 and the template ❨Track❩-❨Title❩ will extract a track number of 2 and a title of Sample. The 2 files sequence is converted to 2 when saved to the Track field.

You can use the Numeric qualifiers to validate that the text being extracted is completely numeric. The same three forms are available: Numeric, Not Numeric and Numeric Once. If the extracted text is matched and is not numeric, the result is equivalent to failing a Required condition. However, if Optional is also on, you can recover from a failed Numeric test. For example:

❨Track❩❨Optional Once❩❨Numeric Once❩-❨Title❩

The above template will successfully scan the following filenames:

01-name
name
text-name
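
The difference between integer coercion and the stricter Numeric check can be sketched in Python. Both function names are hypothetical, and the coercion rule used here (take the leading run of digits) is an assumption that happens to be consistent with the 2 files example; the actual rule may differ.

```python
import re


def coerce_to_int(text: str):
    """Assumed coercion: use the leading digits, or None if there are none."""
    match = re.match(r"\s*([0-9]+)", text)
    return int(match.group(1)) if match else None


def is_completely_numeric(text: str) -> bool:
    """Numeric-style validation: every character must be a decimal digit."""
    return re.fullmatch(r"[0-9]+", text.strip()) is not None
```

Under these assumptions, 2 files coerces to 2 but fails the completely numeric test, 01 passes both, and text passes neither.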

Not Empty

By default, the extraction process has no issues with empty values being saved. This default state is represented by the Empty qualifier. You can elect to disallow empty values by specifying the Not Empty qualifier or you can do it for a single metadata token by using the Not Empty Once token. For example:

❨Not Empty❩❨Track❩-❨Title❩
❨Track❩-❨Title❩

The first template above will not match the following filename, while the second will.

01-

You cannot use an Optional qualifier to override a failed Not Empty condition.

Normalizing Unicode Characters

The presence of some Unicode characters may make it difficult to match character sequences. For example there are a number of Unicode equivalents for a dash (-) character. The Multi Field Editor, the Re-encode action statement and the Yate Transformations submenu, in a text field's context menu, have the ability to fold Unicode character equivalents. The function currently normalizes dash, single and double quote characters. If you want to do this character folding in a File to Tag template, include a ❨Fold Characters❩ token at the very start of the template.