Macro: Using captured strings (from a replace) in subsequent code?
Tony Balinski
ajbj at free.fr
Fri Dec 29 11:23:46 CET 2006
Quoting Randy Kramer <rhkramer at gmail.com>:
> Question:
>
> Within a macro, is there a (direct/simple) way to use the results of
> capturing parentheses (from a replace statement) in subsequent code?
>
> I was hoping to do something like a = \1, b = \3, and c = \5; and then use
> those variables in the subsequent code.
>
> Background:
>
> I'm using nedit version 5.5.
...
> \n-{3}\+{2} (.*)\s*(\n-{3}\+{2} (.*?)\s*\n){0,1}(-{3}\+{2} (.*?)\s*\n){0,1}
>
> I haven't written the final version of the replace string yet, but you
> can get the gist of it by looking at the desired record prefix, a sample
> of which is below.
Probably a good general way to do this is to precede your regex with .*? and
end it with .* (if your regex starts or ends at a line boundary, use ^ or $
there instead). The idea is to throw away the prefix and suffix text
from your input string. You may need (?n<regex>) constructs for multiline
input. Then make a substitution string which looks like this (in macro):
"\n\\1\n\\2\n\\3\n\\4\n\\5".
Now call replace_in_string() using this and your regex, and split() the result
using "\n". You then access each piece directly using its index (index 0
accesses an empty string, the one preceding the first \n in the replacement).
You might want to use another separator than \n in your replacement if you
need to grab substrings that include \n itself.
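
NEdit macro code cannot be run outside the editor, so here is a rough sketch of the same trick using Python's re module instead. The sample text and the key=/value= pattern are made up for illustration, and Python's regex dialect differs from NEdit's in places; only the shape of the technique carries over.

```python
import re

# Hypothetical input: one line with junk before and after the interesting part.
text = "junk before key=alpha; value=42 junk after"

# Leading .*? and trailing .* swallow the unwanted prefix and suffix,
# so the whole string is replaced by the captured pieces alone.
pattern = r".*?key=(\w+); value=(\d+).*"

# Same layout as the macro substitution string: a newline before each group.
marked = re.sub(pattern, "\n\\1\n\\2", text, count=1)

# Splitting on the separator gives one piece per group; pieces[0] is the
# empty string that precedes the first newline.
pieces = marked.split("\n")
a = pieces[1]
b = pieces[2]
```

In a macro, replace_in_string() and split() would play the roles of re.sub() and str.split() here.
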
Of course, if you want to do this repeatedly, you will have to reduce the
source data string each time, or make the lead-in subexpression (the .*?)
more complex to skip what's already been extracted.
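
Repeated extraction by shrinking the source string might look roughly like this in Python (an illustration only, not NEdit macro code; the function name extract_all and the sample pattern are hypothetical). Python's match object stands in for the replace-and-split step.

```python
import re

def extract_all(text, pattern):
    """Collect the captured groups of every match, shrinking the input each time."""
    regex = re.compile(pattern)
    records = []
    rest = text
    while True:
        m = regex.search(rest)
        # Stop when nothing matches, or when an empty match at the start
        # would keep the loop from making progress.
        if m is None or m.end() == 0:
            break
        records.append(m.groups())
        # Reduce the source data: drop everything up to the end of this match.
        rest = rest[m.end():]
    return records

# Hypothetical sample data.
found = extract_all("a=1 b=2 c=3", r"(\w)=(\d)")
```
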
Joerg's suggestion of having a subexpression array is an excellent one,
however. Doing things as described above is clumsy.
Regarding your regex, it looks to me as if you might be trying to do too
much in one go. Might it not be easier to extract the ---++ lines into a
string, then remove the ---++ prefixes, then deal with the lines left in the
extracted string, with a single step for each?
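
As a sketch of that step-by-step approach, here is how it might look in Python rather than NEdit macro language. The sample text is invented; only the "---++ " prefix comes from the regex under discussion.

```python
# Hypothetical sample: heading lines start with "---++ ", the rest is body text.
text = "---++ Title  \n---++ Subtitle\nbody text\n---++ Later heading\n"
prefix = "---++ "

# Step 1: extract the ---++ lines.
heading_lines = [line for line in text.split("\n") if line.startswith(prefix)]

# Step 2: remove the ---++ prefixes (and any trailing whitespace).
headings = [line[len(prefix):].rstrip() for line in heading_lines]

# Step 3: deal with what is left, one simple step at a time.
for number, heading in enumerate(headings, start=1):
    print(number, heading)
```

Each step is simple enough to check on its own, which is the point of splitting the work up.
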
I'd also rewrite your regex like this, making the last optional line depend on
the previous one, swallowing newlines at line end rather than line start, and
using non-capturing paren groups:
^---\+\+ (.*?)\s*\n(?:---\+\+ (.*?)\s*\n(?:---\+\+ (.*?)\s*\n)?)?
The pieces that interest you would then be \1, \2, \3.
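
To see how the nested optional groups behave, the regex can be tried with Python's re module. This is only an approximation of what NEdit would do: the sample text is made up, re.MULTILINE is needed so that ^ matches at line starts, and dialects differ in details such as whether \s can match a newline.

```python
import re

# The regex from above, unchanged; MULTILINE lets ^ match at each line start.
pattern = re.compile(
    r"^---\+\+ (.*?)\s*\n(?:---\+\+ (.*?)\s*\n(?:---\+\+ (.*?)\s*\n)?)?",
    re.MULTILINE,
)

# Hypothetical record with two ---++ lines followed by body text.
text = "---++ Title  \n---++ Subtitle\nbody text\n"

match = pattern.search(text)
# Groups 1 to 3 correspond to \1, \2, \3; a missing optional line gives None.
first, second, third = match.groups()
```

With only two ---++ lines present, the third group does not take part in the match, so a macro would see an empty piece for \3.
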
HTH,
Tony