{"id":19132,"date":"2021-08-04T06:50:59","date_gmt":"2021-08-03T20:50:59","guid":{"rendered":"https:\/\/database.guide\/?p=19132"},"modified":"2021-08-06T13:39:58","modified_gmt":"2021-08-06T03:39:58","slug":"regexp_instr-function-in-oracle","status":"publish","type":"post","link":"https:\/\/database.guide\/regexp_instr-function-in-oracle\/","title":{"rendered":"REGEXP_INSTR() Function in Oracle"},"content":{"rendered":"\n<p>In Oracle, the <code>REGEXP_INSTR()<\/code> function searches a string for a regular expression pattern. It returns an integer indicating the beginning or ending position of the matched substring (whichever one you specify).<\/p>\n\n\n\n<p>It extends the functionality of the <code><a href=\"https:\/\/database.guide\/instr-function-in-oracle\/\" data-type=\"post\" data-id=\"19210\">INSTR()<\/a><\/code> function by allowing us to use regular expression patterns. <\/p>\n\n\n\n<!--more-->\n\n\n\n<h2 class=\"wp-block-heading\">Syntax<\/h2>\n\n\n\n<p>The syntax goes like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>REGEXP_INSTR ( source_char, pattern\n               &#91;, position\n                  &#91;, occurrence\n                     &#91;, return_opt\n                        &#91;, match_param\n                           &#91;, subexpr ]\n                        ]\n                     ]\n                  ]\n               ]\n             )<\/code><\/pre>\n\n\n\n<p>Where:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><code><em>source_char<\/em><\/code> is a character expression that serves as the search value.<\/li><li><em><code>pattern<\/code><\/em>&nbsp;is the regular expression.<\/li><li><em><code>position<\/code><\/em>&nbsp;is a positive integer that specifies where to begin the search. The default is <code>1<\/code>, meaning, start the search at the first character.<\/li><li><em><code>occurrence<\/code><\/em>&nbsp;is a positive integer that specifies which occurrence to search for. The default is <code>1<\/code>, which means searches for the first occurrence.<\/li><li><code><em>return_opt<\/em><\/code> specifies whether Oracle should return the beginning or ending position of the matched substring. Use <code>0<\/code> for the beginning, and <code>1<\/code> for the ending. The default value is <code>0<\/code>. <\/li><li><em><code>match_param<\/code><\/em>&nbsp;lets you change the default matching behaviour of the function. For example, it allows you to specify case-sensitivity, how multiple lines and spaces are dealt with, etc. This argument works the same as when used with the <code>REGEXP_COUNT()<\/code> function. See <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.oracle.com\/en\/database\/oracle\/oracle-database\/19\/sqlrf\/REGEXP_COUNT.html\" data-type=\"URL\" data-id=\"https:\/\/docs.oracle.com\/en\/database\/oracle\/oracle-database\/19\/sqlrf\/REGEXP_COUNT.html\" target=\"_blank\">Oracle&#8217;s documentation<\/a> for that function for more information.<\/li><li>For a&nbsp;<em><code>pattern<\/code><\/em>&nbsp;with subexpressions,&nbsp;<em><code>subexpr<\/code><\/em>&nbsp;is a nonnegative integer from 0 to 9 indicating which subexpression in&nbsp;<em><code>pattern<\/code><\/em>&nbsp;is to be returned by the function.&nbsp;This argument works the same as when used with the <code>REGEXP_INSTR()<\/code> function. See <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.oracle.com\/en\/database\/oracle\/oracle-database\/19\/sqlrf\/REGEXP_INSTR.html\" data-type=\"URL\" data-id=\"https:\/\/docs.oracle.com\/en\/database\/oracle\/oracle-database\/19\/sqlrf\/REGEXP_INSTR.html\" target=\"_blank\">Oracle&#8217;s documentation<\/a> for that function more information.<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Example<\/h2>\n\n\n\n<p>Here&#8217;s a basic example of using <code>REGEXP_INSTR()<\/code> in Oracle:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \n    REGEXP_INSTR('<meta charset=\"utf-8\">My dogs are fluffy', 'd.g')\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">4<\/pre>\n\n\n\n<p>In this case there&#8217;s a match, and the beginning position of the substring is returned.<\/p>\n\n\n\n<p>Regular expressions can be very powerful, and this example uses a very simple example. In order to use <code>REGEXP_INSTR()<\/code> effectively, you&#8217;ll need to know the correct pattern to use for the desired outcome. <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">No Match<\/h2>\n\n\n\n<p>Here&#8217;s an example where there&#8217;s no match:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT REGEXP_INSTR('My dogs like dregs', 't.g')\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">0<\/pre>\n\n\n\n<p>There&#8217;s no match, so <code>0<\/code> is returned.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Multiple Matches<\/h2>\n\n\n\n<p>Here&#8217;s an example with multiple matches:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \n    REGEXP_INSTR('My dogs have dags', 'd.g')\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">4<\/pre>\n\n\n\n<p>It returned the position of the first occurrence.<\/p>\n\n\n\n<p>However, you can specify which occurrence to replace:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \nREGEXP_INSTR('My dogs have dags', 'd.g', 1, 2)\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">14<\/pre>\n\n\n\n<p>Note that I added two arguments here;  <code>1<\/code> and <code>2<\/code>. The <code>1<\/code> specifies whereabouts in the string to start the search (in this case, at the first character). The <code>2<\/code> is what specifies which occurrence to search for. In this case, the second occurrence is searched for.<\/p>\n\n\n\n<p>Here&#8217;s what happens if I start the search after the first occurrence:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \nREGEXP_INSTR('My dogs have dags', 'd.g', 8, 2)\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">0<\/pre>\n\n\n\n<p>In this case there&#8217;s no match, because there&#8217;s only one more occurrence after the starting position.<\/p>\n\n\n\n<p>If I change the last argument to <code>1<\/code>, then we get a match (because it&#8217;s the first occurrence after the specified starting position):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \nREGEXP_INSTR('My dogs have dags', 'd.g', 8, 1)\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">14<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Return the End Position<\/h2>\n\n\n\n<p>You can pass a fifth argument of either <code>0<\/code> or <code>1<\/code> to specify whether the function should return the beginning or end position of the substring. <\/p>\n\n\n\n<p>The default value is <code>0<\/code> (for the beginning position). Here&#8217;s what happens if we specify <code>1<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \n    REGEXP_INSTR('My dogs are fluffy', 'd.g', 1, 1, 1)\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">7<\/pre>\n\n\n\n<p>Just to be clear, here it is again when compared with <code>0<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \n    REGEXP_INSTR('My dogs are fluffy', 'd.g', 1, 1, 0) AS \"Start\",<meta charset=\"utf-8\">\n    REGEXP_INSTR('My dogs are fluffy', 'd.g', 1, 1, 1)<meta charset=\"utf-8\"> AS \"End\"\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">   Start    End \n________ ______ \n       4      7<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Case Sensitivity<\/h2>\n\n\n\n<p>The <code>REGEXP_INSTR()<\/code> function follows Oracle&#8217;s collation determination and derivation rules, which define the collation to use when matching the string with the pattern. <\/p>\n\n\n\n<p>However, you can explicitly specify case-sensitivity with the optional sixth argument. When you do this, it overrides any case-sensitivity or accent-sensitivity of the determined collation.<\/p>\n\n\n\n<p>You can specify <code>i<\/code> for case-insensitive matching and <code>c<\/code> for case-sensitive matching.<\/p>\n\n\n\n<p>Here&#8217;s an example:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT \n    REGEXP_INSTR('My Cats', 'c.t', 1, 1, 0) AS \"Default\",\n    REGEXP_INSTR('My Cats', 'c.t', 1,<meta charset=\"utf-8\"> 1, 0, 'i') AS \"Case Insensitive\",\n    REGEXP_INSTR('My Cats', 'c.t', 1,<meta charset=\"utf-8\"> 1, 0, 'c') AS \"Case Sensitive\"\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">   Default    Case Insensitive    Case Sensitive \n__________ ___________________ _________________ \n         0                   4                 0<\/pre>\n\n\n\n<p>My collation appears to be case-sensitive, based on these results. The other two strings were forced to a case-insensitive and case-sensitive matching respectively.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Subexpressions<\/h2>\n\n\n\n<p>Here&#8217;s an example of using the sixth argument to return a specific subexpression pattern:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT REGEXP_INSTR(\n    'catdogcow', \n    '(c.t)(d.g)(c.w)', \n    1, 1, 0, 'i', 1\n    )\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">1<\/pre>\n\n\n\n<p>In this case I returned the first subexpression.<\/p>\n\n\n\n<p>Here&#8217;s what happens if I specify the third subexpression:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT REGEXP_INSTR(\n    'catdogcow', \n    '(c.t)(d.g)(c.w)', \n    1, 1, 0, 'i', 3\n    )\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">7<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Null Arguments<\/h2>\n\n\n\n<p>With the exception of the 6th argument, providing <code>null<\/code> for an argument results in <code>null<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SET NULL 'null';\nSELECT \n    REGEXP_INSTR(null, 'c.t', 1, 1, 0, 'i', 1) AS \"1\",\n    <meta charset=\"utf-8\">REGEXP_INSTR('Cat', null, 1, 1,<meta charset=\"utf-8\"> 0, 'i', 1) AS \"2\",\n    <meta charset=\"utf-8\">REGEXP_INSTR('Cat', 'c.t', null, 1,<meta charset=\"utf-8\"> 0, 'i', 1) AS \"3\",\n    <meta charset=\"utf-8\">REGEXP_INSTR('Cat', 'c.t', 1, null,<meta charset=\"utf-8\"> 0, 'i', 1) AS \"4\",\n    <meta charset=\"utf-8\">REGEXP_INSTR('Cat', 'c.t', 1, 1, null, 'i', 1) AS \"5\",<meta charset=\"utf-8\">\n    <meta charset=\"utf-8\">REGEXP_INSTR('Cat', 'c.t', 1, 1,<meta charset=\"utf-8\"> 0, null, 1) AS \"6\",\n    <meta charset=\"utf-8\">REGEXP_INSTR('Cat', 'c.t', 1, 1,<meta charset=\"utf-8\"> 0, 'i', null) AS \"7\"\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">      1       2       3       4       5    6       7 \n_______ _______ _______ _______ _______ ____ _______ \n   null    null    null    null    null    0    null <\/pre>\n\n\n\n<p>By default, SQLcl and SQL*Plus return a blank space whenever <code>null<\/code> occurs as a result of a SQL <code><a href=\"https:\/\/database.guide\/sql-select-for-beginners\/\" data-type=\"post\" data-id=\"11928\">SELECT<\/a><\/code> statement. <\/p>\n\n\n\n<p>However, you can use <code><a href=\"https:\/\/database.guide\/set-null-specify-a-string-to-return-whenever-a-null-value-occurs-in-sqlcl-sqlplus\/\">SET NULL<\/a><\/code> to specify a different string to be returned. Here I specified that the string <code>null<\/code> should be returned.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Wrong Number of Arguments<\/h2>\n\n\n\n<p>Passing no arguments to the function, or too few, results in an error:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT REGEXP_INSTR()\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Error starting at line : 1 in command -\nSELECT REGEXP_INSTR()\nFROM DUAL\nError at Command Line : 1 Column : 8\nError report -\nSQL Error: ORA-00938: not enough arguments for function\n00938. 00000 -  \"not enough arguments for function\"\n*Cause:    \n*Action:<\/pre>\n\n\n\n<p>The same applies when we pass too many arguments:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>SELECT REGEXP_INSTR('Cat', 'c.t', 1, 1, 1, 'i', 1, 'oops')\nFROM DUAL;<\/code><\/pre>\n\n\n\n<p>Result:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">Error starting at line : 1 in command -\nSELECT REGEXP_INSTR('Cat', 'c.t', 1, 1, 1, 'i', 1, 'oops')\nFROM DUAL\nError at Command Line : 1 Column : 8\nError report -\nSQL Error: ORA-00939: too many arguments for function\n00939. 00000 -  \"too many arguments for function\"\n*Cause:    \n*Action:<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">More Information<\/h2>\n\n\n\n<p>The <code>REGEXP_INSTR()<\/code> function (as well as Oracle&#8217;s other implementation of regular expressions) conforms with the IEEE Portable Operating System Interface (POSIX) regular expression standard and to the Unicode Regular Expression Guidelines of the Unicode Consortium.<\/p>\n\n\n\n<p>See the <a href=\"https:\/\/docs.oracle.com\/en\/database\/oracle\/oracle-database\/19\/sqlrf\/REGEXP_INSTR.html\" data-type=\"URL\" data-id=\"https:\/\/docs.oracle.com\/en\/database\/oracle\/oracle-database\/19\/sqlrf\/REGEXP_INSTR.html\" target=\"_blank\" rel=\"noreferrer noopener\">Oracle documentation<\/a> for more information and examples of the <code>REGEXP_INSTR()<\/code> function.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In Oracle, the REGEXP_INSTR() function searches a string for a regular expression pattern. It returns an integer indicating the beginning or ending position of the matched substring (whichever one you specify). It extends the functionality of the INSTR() function by allowing us to use regular expression patterns.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[41],"tags":[93,86,67,20],"class_list":["post-19132","post","type-post","status-publish","format-standard","hentry","category-oracle","tag-functions","tag-regex","tag-string-functions","tag-what-is"],"_links":{"self":[{"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/posts\/19132","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/comments?post=19132"}],"version-history":[{"count":9,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/posts\/19132\/revisions"}],"predecessor-version":[{"id":19312,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/posts\/19132\/revisions\/19312"}],"wp:attachment":[{"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/media?parent=19132"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/categories?post=19132"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/database.guide\/wp-json\/wp\/v2\/tags?post=19132"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}