Wednesday, September 12, 2018

Python regex if|else not working as advertised?

I'm trying to learn how if|else (conditional) pattern matching works in Python's re module, so I built the following test from the example in the documentation. As far as I can tell it doesn't behave the way the documentation says, but I've learned to assume I missed a critical step somewhere.

In this test case the third item should fail because it's missing its closing '>'.

In [1]: import re, sys
In [2]: regex = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)')
In [3]: cases  = ['<user@host.com>', 'user@host.com', '<user@host.com', 'user@host.com>']

In [4]: [ re.search(regex, _) and ("match:", _) or ("fail:", _) for _ in cases ]
Out[4]:
[('match:', '<user@host.com>'),
 ('match:', 'user@host.com'),
 ('match:', '<user@host.com'),
 ('fail:', 'user@host.com>')]

In [5]: sys.version
Out[5]: '3.6.5 |Anaconda custom (64-bit)| (default, Apr 26 2018, 08:42:37) \n[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]'

Relevant:

(?(id/name)yes-pattern|no-pattern)

Will try to match with yes-pattern if the group with given id or name exists, and with no-pattern if it doesn’t. no-pattern is optional and can be omitted. For example, (<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$) is a poor email matching pattern, which will match with '<user@host.com>' as well as 'user@host.com', but not with '<user@host.com' nor 'user@host.com>'.
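One way to check the documented claim is to run the same pattern through both re.match and re.search and compare. The sketch below assumes the documentation's example is meant to be anchored at the start of the string, which is what re.match does; the helper name `matched` and the variable names are made up for illustration. With re.search the engine is free to start at a later position, so for '<user@host.com' it can skip the '<', leave group 1 unset, and satisfy the no-pattern `$` instead.

```python
import re

# Same pattern as in the documentation excerpt, written as a raw string.
pattern = re.compile(r'(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)')
cases = ['<user@host.com>', 'user@host.com', '<user@host.com', 'user@host.com>']


def matched(func, text):
    """Return True if the given matching function finds a match in text."""
    return func(text) is not None


# re.match only tries position 0, so an opening '<' must be consumed by group 1.
match_results = [matched(pattern.match, c) for c in cases]

# re.search may start anywhere, so it can begin after the '<'.
search_results = [matched(pattern.search, c) for c in cases]

# Inspect where the search match for the third case actually starts.
third = pattern.search('<user@host.com')

print(match_results)
print(search_results)
print(third.start(), third.group(0), third.group(1))
```

If this reading is right, the difference between the two result lists is confined to the third case, where the search match begins at index 1 and group 1 did not participate.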

So my question is: what step did I miss? I've tried this on different Python versions and on different hosts and operating systems, with the same result.
