Grouping

Created: November-22, 2018

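Grouping is done with parentheses. Calling group() returns the matched text as a string: the entire match by default, or a specific parenthesized subgroup when an argument is given.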

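import re
match = re.search(r'\d+', 'abc123')  # assumed input; the original does not show how match was created

match.group() # With no argument, group() returns the entire match
# Out: '123'
match.group(0) # Specifying 0 gives the same result as specifying no argument
# Out: '123'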

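Arguments can also be passed to group() to fetch a particular subgroup.

From the documentation:

If there is a single argument, the result is a single string; if there are multiple arguments, the result is a tuple with one item per argument.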

Calling groups(), on the other hand, returns a single tuple containing all of the subgroups.

import re

sentence = "This is a phone number 672-123-456-9910"
pattern = r".*(phone).*?([\d-]+)"

match = re.match(pattern, sentence)

match.groups()       # A tuple of all the parenthesized subgroups
# Out: ('phone', '672-123-456-9910')

match.group()        # The entire match as a string
# Out: 'This is a phone number 672-123-456-9910'

match.group(0)       # The entire match as a string
# Out: 'This is a phone number 672-123-456-9910'

match.group(1)       # The first parenthesized subgroup
# Out: 'phone'

match.group(2)       # The second parenthesized subgroup
# Out: '672-123-456-9910'

match.group(1, 2)    # Multiple arguments give us a tuple
# Out: ('phone', '672-123-456-9910')
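One caveat with the calls above: re.match returns None when the pattern does not match, so calling group() on the result would raise AttributeError. The following is a minimal sketch of guarding against that; the helper name extract_phone and the second test sentence are hypothetical and not part of the original example.

```python
import re

# Same pattern as in the example above, compiled once for reuse.
PATTERN = re.compile(r".*(phone).*?([\d-]+)")

def extract_phone(sentence):
    """Return the second subgroup, or None when the pattern does not match."""
    match = PATTERN.match(sentence)
    if match is None:
        return None
    # group(1, 2) returns a tuple, so it can be unpacked directly.
    label, number = match.group(1, 2)
    return number

print(extract_phone("This is a phone number 672-123-456-9910"))  # 672-123-456-9910
print(extract_phone("No contact details here"))                  # None
```

Unpacking the tuple from group(1, 2) keeps the subgroup names readable without indexing into groups() by position.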

Named groups

match = re.search(r'My name is (?P<name>[A-Za-z ]+)', 'My name is John Smith')
match.group('name')
# Out: 'John Smith'

match.group(1)
# Out: 'John Smith'

The (?P<name>...) syntax creates a capture group that can be referenced by name as well as by index.
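When a pattern has several named groups, the match object's groupdict() method collects them into a dictionary. A short sketch, assuming a variation of the pattern above with two groups; the group names first and last are chosen here purely for illustration.

```python
import re

# Two named groups; the names "first" and "last" are illustrative.
match = re.search(
    r"My name is (?P<first>[A-Za-z]+) (?P<last>[A-Za-z]+)",
    "My name is John Smith",
)

by_name = match.groupdict()   # mapping of group name to matched text
by_index = match.groups()     # the same groups in positional order

print(by_name)    # {'first': 'John', 'last': 'Smith'}
print(by_index)   # ('John', 'Smith')
```

Named groups still count as numbered groups, so match.group('last') and match.group(2) return the same string here.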

Non-capturing groups

Using (?:...) creates a group without capturing it. This means you can use it for grouping without polluting your group numbering.

re.match(r'(\d+)(\+(\d+))?', '11+22').groups()
# Out: ('11', '+22', '22')

re.match(r'(\d+)(?:\+(\d+))?', '11+22').groups()
# Out: ('11', '22')

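This pattern matches 11+22 or 11 in full, but not 11+ as a whole; for 11+, re.match matches only the leading 11 and leaves the + unconsumed. That is because the + sign and the second term are grouped together and are optional only as a unit. The + sign itself, on the other hand, is not captured.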
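The difference between matching a prefix and matching the whole string can be checked directly with re.fullmatch. This is a small sketch using the non-capturing pattern from above; the results dictionary is introduced here only to make the outcome easy to inspect.

```python
import re

pattern = re.compile(r"(\d+)(?:\+(\d+))?")

results = {}
for text in ("11+22", "11", "11+"):
    whole = pattern.fullmatch(text)   # must consume the entire string
    prefix = pattern.match(text)      # anchored only at the start
    results[text] = (whole is not None, prefix.group())

print(results["11+22"])  # (True, '11+22')
print(results["11"])     # (True, '11')
print(results["11+"])    # (False, '11')

# An optional group that did not take part in the match is reported as None.
print(pattern.match("11").groups())  # ('11', None)
```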