
Substring

This function is used to extract a substring of a field or named variable. The extracted substring is saved to a track or named variable.

All supplied text fields may contain any of the escape sequences described in Escape Sequences. Escape Sequences are reevaluated for each processed file.

All indexes are zero-based, i.e. the first character in a string is at index 0.

An index field that evaluates to less than zero is treated as relative to the end of the field.

A length field that evaluates to a negative number is interpreted as "to the end of the field".

The statement operates in one of the following modes:

With Index & Length

This mode returns the characters of a string that fall within the adjusted values of index and length.

examples with a field of abcde
index = 1, length = 2 --> bc
index = 2, length = -1 --> cde
index = -3, length = 2 --> cd
index = 6, length = 1 --> an empty string
index = -7, length = 3 --> a
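
The index and length adjustments can be sketched in Python as follows. This only illustrates the documented rules and is not Yate's implementation: the function name substring_index_length is hypothetical, and clamping out-of-range positions to the ends of the field is inferred from the examples above.

```python
def substring_index_length(text: str, index: int, length: int) -> str:
    """Illustrative model of the With Index & Length mode."""
    n = len(text)
    # A negative index is relative to the end of the field.
    start = index + n if index < 0 else index
    # A negative length means "to the end of the field".
    end = n if length < 0 else start + length
    # Assumption: positions outside the field are clamped to its bounds.
    start = max(0, min(start, n))
    end = max(0, min(end, n))
    return text[start:end]


print(substring_index_length("abcde", -7, 3))  # a
```

The last example is the interesting one: the adjusted index is -2, so only one of the three requested characters falls inside the field.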

To Index

This mode only specifies an index. All characters up to but not including the character at the adjusted index are returned.

examples with a field of abcde
index = 3 --> abc
index = 1 --> a
index = 0 --> an empty string
index = -2 --> abc
index = 10 --> abcde
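
A minimal Python sketch of this mode, assuming the same index adjustment as described at the top of this topic. The name substring_to_index is hypothetical and the clamping of oversized indexes is inferred from the examples.

```python
def substring_to_index(text: str, index: int) -> str:
    """Illustrative model of the To Index mode."""
    n = len(text)
    # A negative index is relative to the end of the field.
    end = index + n if index < 0 else index
    # Assumption: an index beyond either end is clamped to the field.
    end = max(0, min(end, n))
    return text[:end]


print(substring_to_index("abcde", -2))  # abc
```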

From Index

This mode only specifies an index. All characters from the adjusted value of the index to the end of the string are returned.

examples with a field of abcde
index = 3 --> de
index = 6 --> an empty string
index = -2 --> de
index = -10 --> abcde
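
The same idea for this mode, again as a hedged Python sketch rather than Yate's own code. The name substring_from_index is hypothetical.

```python
def substring_from_index(text: str, index: int) -> str:
    """Illustrative model of the From Index mode."""
    n = len(text)
    # A negative index is relative to the end of the field.
    start = index + n if index < 0 else index
    # Assumption: an index beyond either end is clamped to the field.
    start = max(0, min(start, n))
    return text[start:]


print(substring_from_index("abcde", -10))  # abcde
```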

With Range

This mode expects the range field to contain index,length. If no comma is present at runtime, or if a comma is present but is not followed by an integer, the mode is equivalent to the From Index mode. This mode is an efficient means of handling ranges returned by statements such as the Regular Expression statement. Note that unlike most other Yate numeric statements, an empty range is not treated as 0. An empty range fails and returns an empty string.

examples with a field of abcde
range = 3 --> de
range = 3, --> de
range = 3,1 --> d
range = 2,10 --> cde
range = -1,3 --> e
range = -3,1 --> c
range = 10,2 --> an empty string
range = 1,-1 --> bcde
range = (empty) --> an empty string
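
The parsing rules for the range field can be modeled roughly as below. This is a Python sketch under stated assumptions: the name substring_with_range is hypothetical, and treating a non-numeric index the same way as an empty range is a guess, since this topic only describes the empty case.

```python
def substring_with_range(text: str, range_text: str) -> str:
    """Illustrative model of the With Range mode ("index,length")."""
    head, _, tail = range_text.partition(",")
    try:
        index = int(head)
    except ValueError:
        # An empty range fails. Assumption: so does a non-numeric index.
        return ""
    try:
        length = int(tail)
    except ValueError:
        # No comma, or no integer after it: behave like From Index.
        length = -1
    n = len(text)
    start = index + n if index < 0 else index
    end = n if length < 0 else start + length
    # Assumption: positions outside the field are clamped to its bounds.
    start = max(0, min(start, n))
    end = max(0, min(end, n))
    return text[start:end]


print(substring_with_range("abcde", "3,"))   # de
print(substring_with_range("abcde", "3,1"))  # d
```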

Unicode Character

This function returns the single Unicode character at the specified index. Composed Unicode characters are handled correctly. Named variable Character Length will contain the length of the character extracted. If there is no character at the specified index, or the index is invalid, an empty string is returned and Character Length will be 0. Note that negative indexes are not supported, i.e. all indexes are counted from the left.

Set named variable 'test' to "abc😀def😂xyz"
Unicode character at index '3' of named variable 'test' -> named variable 'char'

will return 😀 in named variable char and 2 in named variable Character Length.
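
The reported Character Length of 2 suggests that indexes count UTF-16 code units. Under that assumption, the lookup can be sketched in Python as below. The name unicode_char_at is hypothetical, only characters outside the Basic Multilingual Plane are treated as two units wide (combining sequences are ignored), and treating an index that falls in the middle of a character as invalid is also an assumption.

```python
def unicode_char_at(text: str, index: int) -> tuple[str, int]:
    """Return (character, length in UTF-16 units), or ("", 0) on failure."""
    offset = 0  # running position measured in UTF-16 code units
    for ch in text:
        # Code points above U+FFFF need a surrogate pair, i.e. two units.
        width = 2 if ord(ch) > 0xFFFF else 1
        if offset == index:
            return ch, width
        offset += width
    # Negative, out-of-range, or mid-character index.
    return "", 0


print(unicode_char_at("abc😀def😂xyz", 3))  # ('😀', 2)
```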

Integer Value of Unicode Character

This function returns the numeric value of the single Unicode character at the specified index. Composed Unicode characters are handled correctly. Named variable Character Length will contain the length of the character extracted. If there is no character at the specified index, or the index is invalid, -1 is returned and Character Length will be 0. Note that negative indexes are not supported, i.e. all indexes are counted from the left.

Set named variable 'test' to "abc😀def😂xyz"
Integer value of Unicode character at index '3' of named variable 'test' -> named variable 'value'

will return 128512 in named variable value and 2 in named variable Character Length. Note that the previous snippet is equivalent to:

Set named variable 'test' to "abc😀def😂xyz"
Substring from index '3' of named variable 'test' -> named variable 'str'
Format named variable 'str' as integer value of Unicode character, save to named variable 'value'
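
For readers who want to check the numbers, here is a rough Python model of this mode. It assumes indexes count UTF-16 code units, uses the hypothetical name unicode_value_at, and rejects an index that falls in the middle of a character, which this topic does not spell out.

```python
def unicode_value_at(text: str, index: int) -> tuple[int, int]:
    """Return (code point, length in UTF-16 units), or (-1, 0) on failure."""
    offset = 0  # running position measured in UTF-16 code units
    for ch in text:
        # Code points above U+FFFF occupy two UTF-16 units.
        width = 2 if ord(ch) > 0xFFFF else 1
        if offset == index:
            return ord(ch), width
        offset += width
    return -1, 0


print(unicode_value_at("abc😀def😂xyz", 3))  # (128512, 2)
```

128512 is U+1F600 in decimal, the code point of 😀.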

Index of Composed Unicode Character

This function returns the index of the next composed Unicode character, starting from the specified index. Named variable Character Length will contain the length of the located character. If no composed character is found, the returned index will be -1 and Character Length will be 0. If the specified index is in the middle of a composed character, the returned index can be lower than the index that was passed. Note that negative indexes are not supported, i.e. all indexes are counted from the left.

Set named variable 'test' to "abc😀def😂xyz"
Index of Composed Unicode character starting at index '0' of named variable 'test' -> named variable 'loc'

will return 3 in named variable loc and 2 in named variable Character Length.
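
The search can be approximated in Python as shown below. This is a simplified sketch, not Yate's implementation: the name index_of_composed is hypothetical, indexes are assumed to count UTF-16 code units, and only characters that need a surrogate pair are treated as composed, whereas the real statement may also recognize other composed sequences.

```python
def index_of_composed(text: str, index: int) -> tuple[int, int]:
    """Return (index, length) of the next two-unit character, or (-1, 0)."""
    if index < 0:
        # Negative indexes are not supported.
        return -1, 0
    offset = 0  # running position measured in UTF-16 code units
    for ch in text:
        width = 2 if ord(ch) > 0xFFFF else 1
        # Match a composed character that starts at or after the index,
        # or one that the index falls inside of.
        if width == 2 and offset + width > index:
            return offset, width
        offset += width
    return -1, 0


print(index_of_composed("abc😀def😂xyz", 0))  # (3, 2)
print(index_of_composed("abc😀def😂xyz", 4))  # (3, 2)
```

The second call starts in the middle of 😀, so the returned index is lower than the one passed in.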

Please Read

See the Unicode Strings topic for additional information that affects the use of this statement.