Issue 9390 - Option for verbose regular expressions
Status: NEW
Alias: None
Product: D
Classification: Unclassified
Component: phobos
Version: D2
Hardware: All All
Importance: P4 enhancement
Assignee: No Owner
Reported: 2013-01-24 18:14 UTC by bearophile_hugs
Modified: 2024-12-01 16:16 UTC
Description bearophile_hugs 2013-01-24 18:14:01 UTC
I'd really like an option to write "verbose" regular expressions in D, like in Python:

http://docs.python.org/2/library/re.html


> re.X
> re.VERBOSE
> 
>     This flag allows you to write regular expressions that look
>     nicer. Whitespace within the pattern is ignored, except when in a
>     character class or preceded by an unescaped backslash, and, when
>     a line contains a '#' neither in a character class or preceded by
>     an unescaped backslash, all characters from the leftmost such '#'
>     through the end of the line are ignored.
> 
>     That means that the two following regular expression objects that
>     match a decimal number are functionally equal:
> 
>     a = re.compile(r"""\d +  # the integral part
>                        \.    # the decimal point
>                        \d *  # some fractional digits""", re.X)
>     b = re.compile(r"\d+\.\d*")


Regex code is code like any other, so it benefits from comments and from nicer indentation and formatting.

Making regexes more readable makes them easier to debug and understand. In my Python code, every regex longer than half a line is written in verbose mode.
Comment 1 Dmitry Olshansky 2013-01-25 12:13:45 UTC
How about adding the common extension for comments inside a regular expression?

I can't recall the syntax offhand, but it's something like:
(?# some comment that is ignored)


Plus you can already use any of the following:

auto pattern = r"the first piece" // comment
r"the second piece" //comment 2
...
r" the last piece"; //last comment


Or if implicit concatenation feels too dirty:

auto pattern = r"the first piece"  // comment
~ r"the second piece" //comment 2
...
~ r" the last piece"; //last comment
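
For illustration, here is a minimal sketch of the concatenation approach applied to the decimal-number pattern from the description. It assumes only `regex` and `matchFirst` from std.regex; the helper name `decimalPattern` is hypothetical and not part of Phobos. The sketch uses explicit `~` throughout because, to my knowledge, implicit concatenation of adjacent string literals was deprecated in later D releases.

```d
import std.regex : regex, matchFirst;
import std.stdio : writeln;

// Hypothetical helper (not part of Phobos): builds the decimal-number
// pattern from the description out of commented pieces joined with ~.
string decimalPattern()
{
    return r"\d+"    // the integral part
         ~ r"\."     // the decimal point
         ~ r"\d*";   // some fractional digits
}

void main()
{
    // The concatenated pieces form the same string as the compact pattern.
    assert(decimalPattern() == r"\d+\.\d*");

    auto re = regex(decimalPattern());
    auto m = matchFirst("pi is roughly 3.14", re);
    assert(!m.empty);
    writeln(m.hit); // prints 3.14
}
```

Each piece keeps an ordinary D comment next to it, so the regex syntax itself is untouched; the cost is that whitespace inside a piece stays significant, unlike in Python's verbose mode.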

Either way, a free-form regex plus a top-level explanatory note is enough by my standards. The rationale is that if you have to explain every piece in isolation, then it's one of two cases: you are explaining the mechanics to people who don't know what a regex is (and that's wrong), or the regex pattern is too darn complex for its own good.

Since this is an enhancement request, I hereby propose two ways to resolve it: close it as won't fix, or add the aforementioned extension for comments (which at least is more or less common). I'm not going to add another option that messes with the syntax rules.
Comment 2 dlangBugzillaToGithub 2024-12-01 16:16:14 UTC
THIS ISSUE HAS BEEN MOVED TO GITHUB

https://github.com/dlang/phobos/issues/9596

DO NOT COMMENT HERE ANYMORE, NOBODY WILL SEE IT, THIS ISSUE HAS BEEN MOVED TO GITHUB