{"id":396,"date":"2024-01-28T12:20:05","date_gmt":"2024-01-28T12:20:05","guid":{"rendered":"https:\/\/learnpython.elegantwallp.com\/?p=396"},"modified":"2024-01-28T12:20:06","modified_gmt":"2024-01-28T12:20:06","slug":"python-regex-split","status":"publish","type":"post","link":"https:\/\/learnpython.elegantwallp.com\/2024\/01\/28\/python-regex-split\/","title":{"rendered":"Python Regex split()"},"content":{"rendered":"\n<p><strong>Summary<\/strong>: in this tutorial, you\u2019ll learn how to use the Python regex&nbsp;<code>split()<\/code>&nbsp;function to split a string at the occurrences of matches of a regular expression.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Introduction to the Python regex split() function<\/h2>\n\n\n\n<p>The built-in\u00a0<code>re<\/code>\u00a0module provides you with the\u00a0<code>split()<\/code>\u00a0function that splits a string by the matches of a\u00a0regular expression.<\/p>\n\n\n\n<p>The\u00a0<code>split()<\/code>\u00a0function has the following syntax:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>split(pattern, string, maxsplit=0, flags=0)<\/code><\/code><\/pre>\n\n\n\n<p>In this syntax:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>pattern<\/code>&nbsp;is a regular expression whose matches will be used as separators for splitting.<\/li>\n\n\n\n<li><code>string<\/code>&nbsp;is the input string to split.<\/li>\n\n\n\n<li><code>maxsplit<\/code>&nbsp;is the maximum number of splits to perform. The default of zero means there is no limit. If&nbsp;<code>maxsplit<\/code>&nbsp;is one, the resulting list will have at most two elements. If&nbsp;<code>maxsplit<\/code>&nbsp;is two, the resulting list will have at most three elements, and so on.<\/li>\n\n\n\n<li><code>flags<\/code>\u00a0is optional and defaults to zero. It accepts one or more\u00a0regex flags. 
The\u00a0<code>flags<\/code>\u00a0parameter changes how the regex engine matches the pattern.<\/li>\n<\/ul>\n\n\n\n<p>The&nbsp;<code>split()<\/code>&nbsp;function returns a list of the substrings that remain after splitting the string at the matches of the pattern.<\/p>\n\n\n\n<p>If the\u00a0<code>pattern<\/code>\u00a0contains one or more\u00a0capturing groups, the\u00a0<code>split()<\/code>\u00a0function also returns the text of all groups as elements of the resulting list.<\/p>\n\n\n\n<p>If the&nbsp;<code>pattern<\/code>&nbsp;matches at the start of the string, the&nbsp;<code>split()<\/code>&nbsp;function returns a list whose first element is an empty string, whether or not the pattern contains a capturing group. The same logic applies to the end of the string.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Python regex split() function examples<\/h2>\n\n\n\n<p>Let\u2019s take some examples of using the regex&nbsp;<code>split()<\/code>&nbsp;function.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1) Using the split() function to split words in a sentence<\/h3>\n\n\n\n<p>The following example uses the\u00a0<code>split()<\/code>\u00a0function to split the words in a sentence:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>import re\n\ns = 'A! B. C D'\npattern = r'\\W+'  # one or more non-word characters\nl = re.split(pattern, s)\nprint(l)<\/code><\/code><\/pre>\n\n\n\n<p>In this example,\u00a0<code>\\W<\/code>\u00a0is the inverse of the word\u00a0character set\u00a0<code>\\w<\/code>, so\u00a0<code>\\W+<\/code>\u00a0matches one or more characters that are not word characters.<\/p>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>&#91;'A', 'B', 'C', 'D']<\/code><\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">2) Using the split() function with the maxsplit argument<\/h3>\n\n\n\n<p>The following example uses the\u00a0<code>split()<\/code>\u00a0function with\u00a0<code>maxsplit<\/code>\u00a0set to 2, so the string is split at non-word characters at most twice:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>import re\n\ns = 'A! B. C D'\npattern = r'\\W+'\n# pass maxsplit by keyword to stop after two splits\nl = re.split(pattern, s, maxsplit=2)\nprint(l)<\/code><\/code><\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>&#91;'A', 'B', 'C D']<\/code><\/code><\/pre>\n\n\n\n<p>Because the string is split twice, the resulting list contains three elements. Notice that the&nbsp;<code>split()<\/code>&nbsp;function returns the remainder of the string as the final element in the resulting list.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">3) Using the split() function with a capturing group<\/h3>\n\n\n\n<p>The following example uses the\u00a0<code>split()<\/code>\u00a0function with the\u00a0<code>(\\W+)<\/code>\u00a0pattern, which wraps\u00a0<code>\\W+<\/code>\u00a0in a capturing group:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>import re\n\ns = 'A! B. C D'\npattern = r'(\\W+)'  # the group keeps the separators in the result\nl = re.split(pattern, s, maxsplit=2)\nprint(l)<\/code><\/code><\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>&#91;'A', '! ', 'B', '. 
', 'C D']<\/code><\/code><\/pre>\n\n\n\n<p>In this example, the&nbsp;<code>split()<\/code>&nbsp;function also returns the text of the group in the resulting list.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">4) Using the split() function with a separator at the start or end of the string<\/h3>\n\n\n\n<p>The following example uses the\u00a0<code>split()<\/code>\u00a0function where the separator matches at the start of the string:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>import re\n\ns = '...A! B. C D'\npattern = r'\\W+'\nl = re.split(pattern, s)\nprint(l)<\/code><\/code><\/pre>\n\n\n\n<p>In this case, the\u00a0<code>split()<\/code>\u00a0function returns a list whose first element is an empty string:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>&#91;'', 'A', 'B', 'C', 'D']<\/code><\/code><\/pre>\n\n\n\n<p>Similarly, if the separator matches at the end of the string, the last element of the resulting list is an empty string:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>import re\n\ns = 'A! B. C D...'\npattern = r'\\W+'\nl = re.split(pattern, s)\nprint(l)<\/code><\/code><\/pre>\n\n\n\n<p>Output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>&#91;'A', 'B', 'C', 'D', '']<\/code><\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Summary: in this tutorial, you\u2019ll learn how to use the Python regex&nbsp;split()&nbsp;function to split a string at the occurrences of matches of a regular expression. Introduction to the Python regex split() function The built-in\u00a0re\u00a0module provides you with the\u00a0split()\u00a0function that splits a string by the matches of a\u00a0regular expression. 
The\u00a0split()\u00a0function has the following syntax: In this [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[42],"tags":[],"class_list":["post-396","post","type-post","status-publish","format-standard","hentry","category-2-python-regex"],"_links":{"self":[{"href":"https:\/\/learnpython.elegantwallp.com\/wp-json\/wp\/v2\/posts\/396","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/learnpython.elegantwallp.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/learnpython.elegantwallp.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/learnpython.elegantwallp.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/learnpython.elegantwallp.com\/wp-json\/wp\/v2\/comments?post=396"}],"version-history":[{"count":1,"href":"https:\/\/learnpython.elegantwallp.com\/wp-json\/wp\/v2\/posts\/396\/revisions"}],"predecessor-version":[{"id":397,"href":"https:\/\/learnpython.elegantwallp.com\/wp-json\/wp\/v2\/posts\/396\/revisions\/397"}],"wp:attachment":[{"href":"https:\/\/learnpython.elegantwallp.com\/wp-json\/wp\/v2\/media?parent=396"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/learnpython.elegantwallp.com\/wp-json\/wp\/v2\/categories?post=396"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/learnpython.elegantwallp.com\/wp-json\/wp\/v2\/tags?post=396"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}