How to Get Substring from a String in Oracle SQL?

In Oracle SQL, you get a substring from a string using the SUBSTR function. The SUBSTR function is one of the most versatile and commonly used string manipulation functions in Oracle Database. In this tutorial, we'll explore how to use the SUBSTR function effectively, understand its variants, and see practical examples of its application.

Introduction to SUBSTR

The SUBSTR function extracts a portion of a string starting at a specified position and continuing for a specified number of characters. It's an essential tool for string manipulation in SQL queries, PL/SQL code, and data transformations.

Basic Syntax

SUBSTR(string, position [, substring_length])

Parameters:

  • string: The source string from which to extract a substring
  • position: The starting position for extraction (can be positive or negative)
  • substring_length: Optional. The number of characters to extract

How the SUBSTR Function Position Works

The position parameter in SUBSTR has some nuances worth understanding:

  • If position > 0: Counts from the beginning of the string (1 is the first character)
  • If position = 0: Treated as position 1 (first character)
  • If position < 0: Counts backward from the end of the string (-1 is the last character)

Basic Examples of Extracting Substrings in Oracle SQL

Let's start with some simple examples to demonstrate how SUBSTR works:

-- Extract 4 characters starting from position 3
SELECT SUBSTR('ORACLE DATABASE', 3, 4) AS result FROM DUAL;
-- Result: ACLE

-- Extract from position 8 to the end (substring_length omitted)
SELECT SUBSTR('ORACLE DATABASE', 8) AS result FROM DUAL;
-- Result: DATABASE

-- Position 0 is treated as position 1
SELECT SUBSTR('ORACLE DATABASE', 0, 6) AS result FROM DUAL;
-- Result: ORACLE

-- Negative position counts from the end
SELECT SUBSTR('ORACLE DATABASE', -8, 4) AS result FROM DUAL;
-- Result: DATA

Oracle SQL SUBSTR Variants

Oracle provides several variants of the SUBSTR function to handle different character encodings:

  • SUBSTR: Uses characters as defined by the input character set
  • SUBSTRB: Uses bytes instead of characters
  • SUBSTRC: Uses Unicode complete characters
  • SUBSTR2: Uses UCS2 code points
  • SUBSTR4: Uses UCS4 code points

When to Use Different Variants

  • Use SUBSTR for most general string operations
  • Use SUBSTRB when working with byte-level operations or when dealing with multi-byte character sets where byte positions are important
  • Use SUBSTRC, SUBSTR2, or SUBSTR4 for specific Unicode handling requirements

Working with Different Data Types

The SUBSTR function works with various character data types:

-- With VARCHAR2
SELECT SUBSTR('Oracle Database 23', 8, 8) FROM DUAL;
-- Result: Database

-- With CHAR (returns VARCHAR2)
DECLARE
  v_char CHAR(20) := 'Oracle Database 23';
BEGIN
  DBMS_OUTPUT.PUT_LINE(SUBSTR(v_char, 8, 8));
END;
-- Result: Database

-- With CLOB
DECLARE
  v_clob CLOB := 'Oracle Database 23 is the latest version';
  v_result VARCHAR2(100);
BEGIN
  v_result := SUBSTR(v_clob, 1, 15);
  DBMS_OUTPUT.PUT_LINE(v_result);
END;
-- Result: Oracle Database

Practical Applications

1. Extracting Parts of Names

-- Extracting first name and last name from a full name
SELECT 
  SUBSTR(full_name, 1, INSTR(full_name, ' ')-1) AS first_name,
  SUBSTR(full_name, INSTR(full_name, ' ')+1) AS last_name
FROM 
  (SELECT 'John Smith' AS full_name FROM DUAL);

-- Result:
-- FIRST_NAME  LAST_NAME
-- ----------  ---------
-- John        Smith

2. Masking Sensitive Data

-- Mask credit card numbers, showing only last 4 digits
SELECT 
  'XXXX-XXXX-XXXX-' || SUBSTR(credit_card_number, -4) AS masked_cc_number
FROM 
  (SELECT '1234-5678-9012-3456' AS credit_card_number FROM DUAL);

-- Result: XXXX-XXXX-XXXX-3456

3. Extracting Domain from Email

-- Extract domain from email addresses
SELECT 
  email,
  SUBSTR(email, INSTR(email, '@')+1) AS domain
FROM 
  (SELECT 'user@example.com' AS email FROM DUAL);

-- Result:
-- EMAIL             DOMAIN
-- ----------------- ---------
-- user@example.com  example.com

4. Working with Date Strings

-- Extract year, month, and day from a date string in format 'YYYY-MM-DD'
SELECT 
  date_string,
  SUBSTR(date_string, 1, 4) AS year,
  SUBSTR(date_string, 6, 2) AS month,
  SUBSTR(date_string, 9, 2) AS day
FROM 
  (SELECT '2023-11-15' AS date_string FROM DUAL);

-- Result:
-- DATE_STRING  YEAR  MONTH  DAY
-- -----------  ----  -----  ---
-- 2023-11-15   2023  11     15

Multi-byte Character Handling

When working with multi-byte character sets, SUBSTR and SUBSTRB can behave differently:

-- Assuming a database with a multi-byte character set like AL32UTF8
-- Chinese characters (each typically 3 bytes in UTF-8)
SELECT 
  '你好世界' AS original,
  SUBSTR('你好世界', 2, 2) AS substr_result,
  SUBSTRB('你好世界', 4, 6) AS substrb_result
FROM DUAL;

-- Result:
-- ORIGINAL  SUBSTR_RESULT  SUBSTRB_RESULT
-- --------  -------------  --------------
-- 你好世界    好世            好世

In this example:

  • SUBSTR treats each Chinese character as one character
  • SUBSTRB works with bytes, so position 4 starts at the second character (since each character is 3 bytes)

Performance Considerations

  • SUBSTR is highly optimized in Oracle and generally performs well
  • For very large strings (especially CLOBs), consider the performance impact
  • When working with CLOBs, use DBMS_LOB.SUBSTR for better performance with large objects
-- Using DBMS_LOB.SUBSTR for CLOB data
DECLARE
  v_clob CLOB;
  v_result VARCHAR2(100);
BEGIN
  -- Assume v_clob contains a large amount of text
  v_clob := 'This is a very long text...'; -- Simplified for example
  
  -- Using DBMS_LOB.SUBSTR for better performance with large CLOBs
  v_result := DBMS_LOB.SUBSTR(v_clob, 20, 1);
  DBMS_OUTPUT.PUT_LINE(v_result);
END;

Common Mistakes and Solutions

1. Incorrect Position Parameter

-- Trying to extract the last 4 characters using a positive position
-- Incorrect approach:
SELECT SUBSTR('ORACLE', 4, 4) FROM DUAL; -- Returns 'CLE'

-- Correct approach:
SELECT SUBSTR('ORACLE', -4) FROM DUAL; -- Returns 'ACLE'

2. Not Handling NULL Values

-- SUBSTR returns NULL if the input string is NULL
SELECT SUBSTR(NULL, 1, 5) FROM DUAL; -- Returns NULL

-- Using NVL to handle NULL values
SELECT SUBSTR(NVL(column_name, 'N/A'), 1, 5) FROM your_table;

3. Forgetting That Position 0 Equals Position 1

-- Position 0 is treated as position 1
SELECT SUBSTR('ORACLE', 0, 3) FROM DUAL; -- Returns 'ORA', not 'RAC'

Advanced Examples of Oracle SQL SUBSTR Function

1. Combining SUBSTR with Other String Functions

-- Extract the second word from a string
SELECT 
  SUBSTR(
    'Oracle Database 23',
    INSTR('Oracle Database 23', ' ') + 1,
    INSTR('Oracle Database 23', ' ', 1, 2) - INSTR('Oracle Database 23', ' ') - 1
  ) AS second_word
FROM DUAL;

-- Result: Database

2. Using SUBSTR in UPDATE Statements

-- Update email domains from .com to .org
UPDATE employees
SET email = SUBSTR(email, 1, INSTR(email, '@')) || 'example.org'
WHERE SUBSTR(email, INSTR(email, '@') + 1) = 'example.com';

3. Using SUBSTR in WHERE Clauses

-- Find all products with codes starting with 'AB'
SELECT product_name, product_code
FROM products
WHERE SUBSTR(product_code, 1, 2) = 'AB';

4. Dynamic Substring Extraction

-- Extract a substring based on dynamic start and end markers
DECLARE
  v_text VARCHAR2(100) := 'The [important] information is here';
  v_start_marker VARCHAR2(10) := '[';
  v_end_marker VARCHAR2(10) := ']';
  v_result VARCHAR2(100);
BEGIN
  v_result := SUBSTR(
                v_text,
                INSTR(v_text, v_start_marker) + 1,
                INSTR(v_text, v_end_marker) - INSTR(v_text, v_start_marker) - 1
              );
  DBMS_OUTPUT.PUT_LINE(v_result);
END;
-- Result: important

See also: Oracle SQL query to calculate business days

Conclusion

The SUBSTR function is a powerful tool for string manipulation in Oracle Database. By understanding its behavior with different position values, character sets, and data types, you can effectively extract and manipulate string data in your applications.

Remember these key points:

  • Position can be positive (counting from start), zero (treated as 1), or negative (counting from end)
  • If substring_length is omitted, all characters to the end are returned
  • Different variants (SUBSTRB, SUBSTRC, etc.) handle different character encoding scenarios
  • The function works with various character data types including VARCHAR2, CHAR, NVARCHAR2, and CLOB

With these techniques, you can handle a wide range of string manipulation tasks in your Oracle Database applications.

Reference: Oracle SQL SUBSTR

Vinish Kapoor
Vinish Kapoor

Vinish Kapoor is a seasoned software development professional and a fervent enthusiast of artificial intelligence (AI). His impressive career spans over 25+ years, marked by a relentless pursuit of innovation and excellence in the field of information technology. As an Oracle ACE, Vinish has distinguished himself as a leading expert in Oracle technologies, a title awarded to individuals who have demonstrated their deep commitment, leadership, and expertise in the Oracle community.

guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments