WG: Problems with regex
Tony Balinski
ajbj at free.fr
Thu Mar 5 22:41:30 CET 2009
Quoting "David, Patrick (I/EG-711)" <patrick.david at audi.de>:
>
> Hello out there,
>
> I want to extend my macro library to make nedit able to deal with *.inp
> Files (this extension is used by the commercial Finite-Element-Method
> software Abaqus).
> First I want to teach nedit how to deal with comments in these files. A
> commented line starts with two asterisks (**). My macro line for
> automatic commenting is the following:
>
> replace_in_selection("^.*$", "** $", "regex")
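This deletes what was on the line already, because the "$" in the replace
string is not a reference to the matched text. Use "** &" as the replace
string to keep the line: "&" stands for everything the pattern matched.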
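If you want to check the difference outside NEdit, here is a rough sketch using Python's re module as a stand-in. It is not NEdit macro code: Python spells the whole match \g<0> where NEdit uses &, and the sample text and function names are made up for illustration.

```python
import re

# Hypothetical stand-in for the selected text: two Abaqus-style lines.
selection = "*NODE\n1, 0.0, 0.0"

def comment_out(text):
    # \g<0> is Python's spelling of "the whole match" (NEdit's "&").
    return re.sub(r"^.*$", r"** \g<0>", text, flags=re.MULTILINE)

def comment_out_broken(text):
    # A "$" in the replacement is just a literal character, not the match.
    return re.sub(r"^.*$", "** $", text, flags=re.MULTILINE)

commented = comment_out(selection)
clobbered = comment_out_broken(selection)
print(commented)
print(clobbered)
```

The first call keeps each line behind the "** " prefix; the second throws the original text away.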
> which seems to work. My problem: how can I program a macro to remove the
> first 2 asterisks in a line?
> My first shot
>
> replace_in_selection("(^[ \\t]*[\*\*] ?)(.*$)", "\\2", "regex")
>
> doesn't work. I don't have experiences with this function (what does
> "\\2" mean)?
Anything you'd enter into the Find/Replace dialog with a single backslash (\)
must be typed with a double backslash in a nedit macro string. That's why you
use "\\2" to pick up the text of group 2 in your replacement (the text that
matched the part of your pattern in the second set of parentheses). It's
up to you whether to use "\\t" or "\t" in a regex: the first is passed as \t
to the regex engine (which treats it as a tab character), while the second
puts an actual tab into the string. As for "[ \\t]", you can write "\\s"
instead, meaning "match a space, tab or other space-like character".
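Python string literals happen to have the same two layers of escaping, so the point can be tried there. This is only an analogy run through Python's re module, not NEdit's regex engine, and the variable names are mine.

```python
import re

# Backslash + "t" reach the regex engine, which reads them as a tab.
pattern_escaped = "[ \\t]*"
# The string literal already contains a real tab character.
pattern_literal = "[ \t]*"
# Shorthand class for whitespace.
pattern_class = "\\s*"

line = " \t ** text"
patterns = (pattern_escaped, pattern_literal, pattern_class)
matches = [re.match(p, line).group(0) for p in patterns]

# The two spellings differ by one character as strings...
print(len(pattern_escaped), len(pattern_literal))
# ...but all three patterns match the same leading whitespace.
print(matches)
```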
Now your "[\*\*]" means the same as "[**]", since the sequence \* doesn't
mean anything special in a macro string (unlike \t or \n). In a regex, the
square brackets form a character class: a single character must match one of
the characters in the class. Yours lists the same character twice, a *, so it
matches just one asterisk. What you want is to recognise two in sequence. You
could write [*][*], but that is a bit heavy-handed. Instead, match ** directly:
since * is a regex special character, each one needs escaping with a
backslash, and each backslash needs to be doubled in the macro string. This
makes your search pattern "(^[ \\t]*\\*\\* ?)(.*$)". But you can
do without the groups: try
replace_in_selection("^\\s*\\*\\* ?", "", "regex")
which doesn't bother with the "rest of line" group 2 and just replaces the
front of the line. But this loses any indentation in front of the **, so
you could capture those leading spaces in group 1 (and discard the actual **
sequence):
replace_in_selection("^(\\s*)\\*\\* ?", "\\1", "regex")
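To see what each pattern does to some sample lines, here is a small sketch in Python's re module, again only as a stand-in for NEdit. The sample text and function names are invented, and I use [ \t] rather than \s because Python's \s also matches newlines.

```python
import re

# Hypothetical sample: an indented comment, a flush comment, a plain line.
text = "  ** *NODE\n**1, 0.0\nplain * line"

# The original attempt: [**] is a class with one distinct character,
# so it consumes a single "*" and leaves the second one behind.
first_shot = re.sub(r"(^[ \t]*[**] ?)(.*$)", r"\2", "** text")

def uncomment_flush_left(s):
    # Drop indentation, "**" and one optional space.
    return re.sub(r"^[ \t]*\*\* ?", "", s, flags=re.MULTILINE)

def uncomment(s):
    # Keep the indentation (group 1); drop "**" and one optional space.
    return re.sub(r"^([ \t]*)\*\* ?", r"\1", s, flags=re.MULTILINE)

flush = uncomment_flush_left(text)
indented = uncomment(text)
print(first_shot)
print(flush)
print(indented)
```

Lines that don't start with ** (after optional indentation) are left alone by both versions.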