Duplicate Credit Names

In large collections it is easy to end up with different representations of the same credit name. These may include differences such as the following:

Differences in the number and location of spaces
W.C. Clark vs. W.C.Clark

Capitalization differences
Delbert McClinton vs. Delbert Mcclinton

Diacritic mark differences
Stéphane Grappelli vs. Stephane Grappelli

Differences in the use of leading articles
The Rolling Stones vs. Rolling Stones

Differences in suffixes
Count Basie vs. Count Basie Orchestra

This suite of actions is designed to help you identify duplicates, resolve them and apply the resulting changes.

The leading article detection extracts its list of valid articles from the Natural Sort Exception set which is defined in Preferences - Exceptions. The suffix identifiers are extracted from the Sort Form Ignore Suffix set which is defined in Preferences - Replacements. Note that in this set items with a To field of + are ignored. The + items represent suffixes which might be part of a name such as III or jr.

Using the above sets, the duplicate identification process reduces each credit name to a slug where everything is lowercase; diacritic marks are removed; all punctuation is removed; leading articles are removed; suffixes are removed; all spaces are removed. Credit names which reduce to the same slug are considered duplicates.
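The slug reduction described above can be sketched roughly as follows. This is an illustration only, not Yate's actual implementation: the function names credit_slug and find_duplicates are hypothetical, and the article and suffix tuples are stand-ins for the Natural Sort Exception and Sort Form Ignore Suffix sets configured in Preferences.

```python
import unicodedata
from collections import defaultdict

# Hypothetical stand-ins for the sets configured in Yate's Preferences.
LEADING_ARTICLES = ("the", "a", "an")
IGNORED_SUFFIXES = ("orchestra", "band")


def credit_slug(name, articles=LEADING_ARTICLES, suffixes=IGNORED_SUFFIXES):
    """Reduce a credit name to a comparison slug (illustrative sketch)."""
    # Lowercase, then decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", name.lower())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Drop punctuation: keep only letters, digits and whitespace.
    text = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    words = text.split()
    # Remove one leading article and one trailing suffix, never emptying the name.
    if len(words) > 1 and words[0] in articles:
        words = words[1:]
    if len(words) > 1 and words[-1] in suffixes:
        words = words[:-1]
    # Joining without a separator removes all remaining spaces.
    return "".join(words)


def find_duplicates(names):
    """Group names by slug and keep only slugs with more than one spelling."""
    groups = defaultdict(list)
    for name in names:
        groups[credit_slug(name)].append(name)
    return {slug: found for slug, found in groups.items() if len(set(found)) > 1}


print(credit_slug("W.C. Clark"))  # wcclark
print(find_duplicates(["The Rolling Stones", "Rolling Stones", "Count Basie"]))
```

The sketch assumes single-word articles and suffixes; multi-word entries and the + items in the suffix set would need additional handling.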

The workflow is divided into three steps and a configuration.


Duplicate Credit Names: Configuration

This action can be run at any time and is automatically run, if required information is missing, by the Step 1 and 3 actions. You can also force a full re-configuration by holding down the Option key when starting the Step 1 and Step 3 actions.

The configuration process accumulates the following information:

Fields
You can choose to analyze any combination of Album Artist, Artist, Composer, Conductor, Involved People, Lyricist, Musician Credits, Original Artist, Original Lyricist and Remixer. You can also select any custom field which is defined as being a credit. If you enable Automatically select all custom fields which are credits, custom fields (which are credits) created after the configuration process will automatically be added to the list of fields to be analyzed. Note that while this option is enabled, deselecting a custom field has no effect. Regardless of the checkbox setting, at least one field must be selected.

Suffixes
The processing of suffixes is optional. If you choose not to process them, Count Basie and The Count Basie Orchestra will be considered distinct.

Decomposition
When processing a field other than Involved People or Musician Credits, a test is made to determine if the field contains a multi value delimiter sequence (typically ;;;). If it does, each delimited item in the field is treated as a unique item.

If a field does not contain a multi value delimiter, you have the option of treating the field as a single item or of decomposing the field based on the Preferences - Lists - Advanced Settings - Non Standard Delimiters. This decomposition enables the duplicate test to perform a more detailed search. If you store your names as J.J. Cale & Eric Clapton as opposed to J.J. Cale;;;Eric Clapton, you should ensure that the non standard delimiters are configured and enable decomposition. If you do not, J.J. Cale & Eric Clapton would be considered a single name.

When decomposing, a field's list association will be used to determine if a name should not be decomposed. For example, if an Artist field contains Crosby, Stills, Nash & Young and an associated Artist list contains it as well, it will not be decomposed regardless of the presence of ',' and '&' delimiters.
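The decomposition rules above can be illustrated with a short sketch. It is not Yate's code: the decompose function, its parameters and the default delimiters are assumptions for illustration, and the list association is simplified to an exact match of the whole field value against a set of known names.

```python
import re


def decompose(value, delimiters=(",", "&"), known_names=frozenset(),
              multi_value_delimiter=";;;"):
    """Split a field value into individual names (illustrative sketch)."""
    # A multi value delimiter always wins: each delimited item is its own name.
    if multi_value_delimiter in value:
        return [item.strip() for item in value.split(multi_value_delimiter)
                if item.strip()]
    # Assumption: a value found in the associated list is kept whole.
    if value in known_names:
        return [value]
    # Otherwise split on any of the non standard delimiters.
    pattern = "|".join(re.escape(d) for d in delimiters)
    return [part.strip() for part in re.split(pattern, value) if part.strip()]


artist_list = {"Crosby, Stills, Nash & Young"}  # hypothetical associated list
print(decompose("J.J. Cale & Eric Clapton"))
print(decompose("Crosby, Stills, Nash & Young", known_names=artist_list))
```

Without the associated list, the second call would split the group name into four separate names, which is the situation the list association is meant to prevent.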

Write delimiter
If you are decomposing fields you have to specify which delimiter should replace the ;;; sequences in order to recompose the field if changes are made. Spaces are significant. This means that you can specify a write delimiter of comma followed by a space.

Last delimiter
If you are decomposing fields you can optionally specify a different last delimiter for a sequence of components to be used when recomposing after making changes. With a write delimiter of ', ' and a last delimiter of ' & ', artist 1;;;artist 2;;;artist 3 would be recreated as artist 1, artist 2 & artist 3.
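As a rough illustration of how the write delimiter and last delimiter combine, consider the sketch below. The recompose function is hypothetical and not part of Yate. Its treatment of a two-item sequence (joined with the last delimiter alone) is an assumption, as the text above only describes the three-item case.

```python
def recompose(components, write_delimiter=", ", last_delimiter=None):
    """Join decomposed names back into a single field value (illustrative sketch)."""
    if not components:
        return ""
    # With no last delimiter, or a single name, a plain join is enough.
    if last_delimiter is None or len(components) < 2:
        return write_delimiter.join(components)
    # Join all but the final name with the write delimiter, then append
    # the final name using the last delimiter. Spaces are significant.
    head = write_delimiter.join(components[:-1])
    return head + last_delimiter + components[-1]


names = "artist 1;;;artist 2;;;artist 3".split(";;;")
print(recompose(names, ", ", " & "))  # artist 1, artist 2 & artist 3
```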

Involved People and Musician Credits
When changes are applied to these fields, the resultant order cannot be guaranteed. You can optionally specify how you want the fields to be sorted: None, By credit, By People or By Both. You can also specify if you want to merge names with the same credit into a single line.

Duplicate Credit Names Step 1: Get Duplicate Credit List

This action may be run in immediate mode or via the Batch Processor. It accumulates a list of duplicates which is saved to a specified file. When not running via the Batch Processor you can choose to run Step 2 when done.


Duplicate Credit Names Step 2: Resolve Duplicates

This action iterates through the found set of duplicates in a specified file. For each duplicate you can choose a matched representation from a menu or you may edit and change the name. You can choose to resolve, skip or ignore any item. When you resolve an item it is removed from the duplicates list and the resolved information is retained. When you skip an item it will be repeated. When you ignore an item it is removed from consideration.

You can run this action at your leisure. The file is saved whenever you select Quit in the action. When there are no duplicates remaining you can choose to run Step 3.


Duplicate Credit Names Step 3: Apply Duplicate Resolutions

This action may be run in immediate mode or via the Batch Processor. It applies the resolved items wherever necessary to all active files. Note that changes are not automatically saved. When running through the Batch Processor and Verbose Log is enabled, changes will be logged.



Content List, Requirements & History


Content List:

Action : Duplicate Credit Names Step 1: Get Duplicate Credit List

Action : Duplicate Credit Names Step 2: Resolve Duplicates

Action : Duplicate Credit Names Step 3: Apply Duplicate Resolutions

Action : Duplicate Credit Names: Configuration

Action : Duplicate Credit Names Helper: Read File

Action : Duplicate Credit Names Helper: Correct V1 Settings

Requirements:

Yate v6.16

History:


Date        Version  Information
2021-07-04  v0.5     Release to Preview List.
2021-07-12  v0.9     Release to Preview List.
2021-07-20  v1.0     First general release.
2021-12-04  v1.1     Patched v1.0 settings naming issues.
2023-06-30  v1.2     Updated for Yate v6.16.
