Variables in Regular Expressions
qbit, October 3, 2017, 15:32
Newbie
Posts: 4
Registered: October 3, 2017, 18:59
Variables in Regular Expressions

From my reading of the documentation, I believe that variables (both numbered and named) may be inserted into the Regular expression field in the Regular Expression action.

In this excerpt I'm extracting capture groups 2 to 4 but only the commented-out version works. Have I read the book wrong?
Image

2MR2, October 3, 2017, 16:14
Administrator
Posts: 2084
Registered: August 23, 2012, 19:27
Re: Variables in Regular Expressions

It seems you have not enabled the "Set State" option on the Regular Expression statement, so the state is not being modified. On various statements "Set State" is optional because earlier incarnations did not set the state; it remains optional in order to retain compatibility.

I ran the following:

Set named variable 'source' to "abc test def test"
Set named variable 'exp' to "test"
Set named variable 'rep' to "XXX"
Find first match for regular expression "\<exp>" in named variable 'source', replace with "\<rep>" to named variable 'result', case insensitive, set state
Dump named variables. Show Action Test State

and get the following output:

Action test state = true

Named Variables

exp = test
rep = XXX
result = abc·XXX·def·test
source = abc·test·def·test
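
For readers who know general-purpose regex libraries better than Yate, the run above can be approximated in Python. This is only a rough analogue under stated assumptions: re.escape stands in for the literal insertion of \<exp>, the match count stands in for the action test state, and none of the names below are Yate APIs.

```python
import re

# Values taken from the statements in the post.
source = "abc test def test"
exp = "test"
rep = "XXX"

# Literal insertion: escape the variable before using it as a pattern.
# count=1 mirrors "find first match"; IGNORECASE mirrors "case insensitive".
# A callable replacement inserts rep verbatim, with no backslash processing.
result, matches = re.subn(
    re.escape(exp), lambda m: rep, source, count=1, flags=re.IGNORECASE
)

# Stand-in for "set state": true when something matched.
state = matches > 0

print(state)   # True
print(result)  # abc XXX def test
```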

It might be a good idea to enable the "Set State" option on new statements. I'll add it to the next release.

2MR2, October 3, 2017, 16:38
Administrator
Posts: 2084
Registered: August 23, 2012, 19:27
Re: Variables in Regular Expressions

Never mind. I've tried replicating your code and it is not working. I'm looking at it.

2MR2, October 3, 2017, 17:11
Administrator
Posts: 2084
Registered: August 23, 2012, 19:27
Re: Variables in Regular Expressions

Okay, I see what's going on. I have to figure out a way around the issue. From the documentation:

All regular expression meta characters which may be inserted by the substitution will be properly escaped.

That means the substituted text is coerced to be a literal. However, in your case you want the substituted text to be treated as a regexp. It's failing because the literal pattern is not being matched.
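
The difference between the two kinds of insertion can be sketched in Python, with re.escape playing the role of the automatic escaping. The pattern is the one quoted later in this thread; the sample name is hypothetical and chosen only so that the pattern matches.

```python
import re

# Pattern from the thread; the sample name is a hypothetical value.
presto_pattern = r"^(\d{2}) - (.*?) - (.*?) - (.*?)$"
presto_name = "01 - Artist - Album - Title"

# Escaped insertion: every metacharacter becomes a literal character,
# so the search looks for the pattern's own text inside the name.
literal_match = re.search(re.escape(presto_pattern), presto_name)

# Unescaped insertion: the text is interpreted as a regular expression.
regex_match = re.search(presto_pattern, presto_name)

print(literal_match)         # None
print(regex_match.groups())  # ('01', 'Artist', 'Album', 'Title')
```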

I have to come up with a way of handling both cases. I guess most people have been using and wanting the literal insertion as it is a convenient means of escaping the regexp meta characters.

In order not to break anything, I'll have to introduce a new "don't escape" sequence that can be placed at the start of the expression to tell me not to regexp-escape the contents of the named variable.

I'll get back to you with my solution.

2MR2, October 3, 2017, 17:27
Administrator
Posts: 2084
Registered: August 23, 2012, 19:27
Re: Variables in Regular Expressions

The solution is easy if not terribly aesthetic. The current method of interpreting and escaping the content cannot change as far too many actions will break.

Currently the only Yate escape sequences which can be included in the template or replace fields are the \V and \<namvar> sequences. There may be multiple sequences in each field.

I'm proposing that the initial mode is 'do escaping'. If you include a \L (for literal) in the field, all subsequent \V or \<> sequences will be inserted as literal unescaped data. If a \E is found, escaping will be turned on again. Using these two new escape sequences you can control, on a per-replacement basis, whether escaped or unescaped text is inserted.
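
The proposed toggle can be modelled as a small template expander. The following Python sketch illustrates the proposal as described here and is not Yate's implementation: the function name expand and the variables dictionary are hypothetical, only \<name> sequences are handled, and \V is left out because its expansion is not described in this thread.

```python
import re

# Recognise \L, \E and \<name> inside a template.
TOKEN = re.compile(r"\\L|\\E|\\<([^>]+)>")

def expand(template, variables):
    """Expand \\<name> sequences, escaping them unless \\L is in effect."""
    escaping = True  # the initial mode is 'do escaping'
    out = []
    pos = 0
    for m in TOKEN.finditer(template):
        out.append(template[pos:m.start()])  # text between tokens, unchanged
        token = m.group(0)
        if token == r"\L":
            escaping = False   # later insertions are used as-is
        elif token == r"\E":
            escaping = True    # later insertions are escaped again
        else:
            value = variables[m.group(1)]
            out.append(re.escape(value) if escaping else value)
        pos = m.end()
    out.append(template[pos:])
    return "".join(out)

variables = {"a": "a.b"}
print(expand(r"\<a>", variables))            # a\.b
print(expand(r"\L\<a>", variables))          # a.b
print(expand(r"\<a>\L\<a>\E\<a>", variables))  # a\.ba.ba\.b
```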

In your code I changed the regexp line to be:

Find first match for regular expression "\L\<PrestoPattern>" in named variable 'PrestoName', replace with "\<loop>" to named variable 'prestoReturn', set state

... and all seems to work as expected. 'loop' did not have to be escaped as it only contains 2, 3 or 4, none of which is altered by escaping.
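
As a rough Python analogue of looping over capture groups 2 to 4: the sample name is hypothetical, and the way Yate interprets the replace field is not spelled out in this thread, so the \g<n> group reference below is Python's notation and an assumption about the intent.

```python
import re

# Pattern from the thread; the sample name is a hypothetical value.
presto_pattern = r"^(\d{2}) - (.*?) - (.*?) - (.*?)$"
presto_name = "01 - Artist - Album - Title"

presto_returns = []
for loop in (2, 3, 4):
    # Replace the whole match with the text of capture group number `loop`.
    presto_return, matches = re.subn(
        presto_pattern, f"\\g<{loop}>", presto_name, count=1
    )
    if matches:  # plays the role of the action test state
        presto_returns.append(presto_return)

print(presto_returns)  # ['Artist', 'Album', 'Title']
```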

Let me know if this makes sense to you and I'll get you something.

qbit, October 4, 2017, 06:33
Newbie
Posts: 4
Registered: October 3, 2017, 18:59
Re: Variables in Regular Expressions

This seems to work perfectly and your response time is unbelievable. I do see now why you may think it's not terribly aesthetic, however, as it's a special-case escape sequence just for this one Action field...

I guess it all hinges upon whether the \E sequence is necessary. My initial feeling is that an invocation of the Regular Expression Action will be one of two types: either a 'hard-core' regular expression with escape sequences, etc., as above in named variable 'PrestoPattern', or a more vanilla one like that in your first reply above. If that is indeed the case, then it may be more elegant to just supply a check box against the Regular expression input box: Literal either On or Off for the entire Action.

But I personally quite like the option of \E...

Users need to be aware that with this new capability, backslashes themselves must be escaped in variables that supply the regular expression, so an alternative for 'PrestoPattern' above is now "^(\\d{2}) - (.*?) - (.*?) - (.*?)$" (\d is shorthand for any digit).

Many thanks for this fast and productive response!

2MR2, October 4, 2017, 08:20
Administrator
Posts: 2084
Registered: August 23, 2012, 19:27
Re: Variables in Regular Expressions

I forgot that \e and \E are valid regexp escape sequences. So, I will be going with checkbox(es) as opposed to the \L and \E which I originally did.

Quoting qbit: Users need to be aware that with this new capability, backslashes themselves must be escaped in variables that supply the regular expression, so an alternative for 'PrestoPattern' above is now "^(\\d{2}) - (.*?) - (.*?) - (.*?)$" (\d is shorthand for any digit).

This should not be the case as that's why it didn't work before. However, you are correct in that the \d does not work without the extra escape. It should not be required. I will take a look this morning.

qbit, October 4, 2017, 08:32
Newbie
Posts: 4
Registered: October 3, 2017, 18:59
Re: Variables in Regular Expressions

Inserting "\d" in the regex variable substitutes the date... This behaviour must be overridden so the regular expression sees "\d". To do that, you need to escape the "\" -> "\\d".

qbit, October 4, 2017, 08:48
Newbie
Posts: 4
Registered: October 3, 2017, 18:59
Re: Variables in Regular Expressions

So, yes, not all backslashes need to be escaped: just those that introduce Yate escape sequences that you don't want expanded.
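
The interaction between the two layers of escaping can be illustrated with a toy expander in Python. This is a simplified model, not Yate's actual escape processing: it assumes only that "\\" yields a single backslash and that "\d" is replaced by a date, and the function name and the DATE placeholder are hypothetical.

```python
import re

def expand_yate_like(text, date_text):
    """Toy model: '\\\\' becomes one backslash, '\\d' becomes date_text."""
    def repl(m):
        ch = m.group(1)
        if ch == "\\":
            return "\\"       # escaped backslash survives as a single one
        if ch == "d":
            return date_text  # the date sequence is expanded
        return m.group(0)     # any other sequence is left untouched
    return re.sub(r"\\(.)", repl, text)

# Unescaped: the date replaces \d before the regex engine ever sees it.
print(expand_yate_like(r"^(\d{2}) - (.*?)$", "DATE"))   # ^(DATE{2}) - (.*?)$

# Escaped: the regex engine receives \d as intended.
print(expand_yate_like(r"^(\\d{2}) - (.*?)$", "DATE"))  # ^(\d{2}) - (.*?)$
```

Only the backslash in front of "d" needs doubling here; the rest of the pattern passes through unchanged in both cases.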
