---
title: STRING
summary: The STRING data type stores a string of Unicode characters.
toc: true
docs_area: reference.sql
---
The `STRING` [data type]({% link {{ page.version.version }}/data-types.md %}) stores a string of Unicode characters.
{{site.data.alerts.callout_info}}
`STRING` is not a data type supported by PostgreSQL. For PostgreSQL compatibility, CockroachDB supports additional [aliases](#aliases) and [`STRING`-related types](#related-types).
{{site.data.alerts.end}}
## Aliases
CockroachDB supports the following alias for `STRING`:
- `TEXT`
## Related types
For PostgreSQL compatibility, CockroachDB supports the following `STRING`-related types and their aliases:
- `VARCHAR` (and alias `CHARACTER VARYING`)
- `CHAR` (and alias `CHARACTER`)
- `NAME`
These types are functionally identical to `STRING`.
CockroachDB also supports the single-byte `"char"` special character type. As in PostgreSQL, this special type is intended for internal use in [system catalogs]({% link {{ page.version.version }}/system-catalogs.md %}), and has a storage size of 1 byte. CockroachDB truncates all values of type `"char"` to a single character.
## Length
To limit the length of a string column, use `STRING(n)`, where `n` is the maximum number of Unicode code points (normally thought of as "characters") allowed. This applies to all related types as well (e.g., to limit the length of a `VARCHAR` type, use `VARCHAR(n)`). To reduce performance issues caused by storing very large string values in indexes, Cockroach Labs recommends setting length limits on string-typed columns.
{{site.data.alerts.callout_danger}}
{% include {{page.version.version}}/sql/add-size-limits-to-indexed-columns.md %}
{{site.data.alerts.end}}
When inserting a `STRING` value or a `STRING`-related-type value:
- If the value is cast with a length limit (e.g., `CAST('hello world' AS STRING(5))`), CockroachDB truncates to the limit. This applies to `STRING(n)` and all related types.
- If the value exceeds the column's length limit, CockroachDB returns an error. This applies to `STRING(n)` and all related types.
- For `STRING(n)` and `VARCHAR(n)`/`CHARACTER VARYING(n)` types, if the value is under the column's length limit, CockroachDB does **not** add space padding to the end of the value.
- For `CHAR(n)`/`CHARACTER(n)` types, if the value is under the column's length limit, CockroachDB adds space padding from the end of the value to the length limit.
Type | Length
--------------------------------------------------|------------------------------
`CHARACTER`, `CHARACTER(n)`, `CHAR`, `CHAR(n)` | Fixed-length
`CHARACTER VARYING(n)`, `VARCHAR(n)`, `STRING(n)` | Variable-length, with a limit
`TEXT`, `VARCHAR`, `CHARACTER VARYING`, `STRING` | Variable-length, with no limit
`"char"` (special type) | 1 byte
## Syntax
A value of type `STRING` can be expressed using a variety of formats.
See [string literals]({% link {{ page.version.version }}/sql-constants.md %}#string-literals) for more details.
When printing out a `STRING` value in the [SQL shell]({% link {{ page.version.version }}/cockroach-sql.md %}), the shell uses the simple
SQL string literal format if the value doesn't contain special character,
or the escaped format otherwise.
### Collations
`STRING` values accept [collations]({% link {{ page.version.version }}/collate.md %}), which lets you sort strings according to language- and country-specific rules.
## Size
The size of a `STRING` value is variable, but it's recommended to keep values under 64 kilobytes to ensure performance. Above that threshold, [write amplification]({% link {{ page.version.version }}/architecture/storage-layer.md %}#write-amplification) and other considerations may cause significant performance degradation.
## Examples
{% include_cached copy-clipboard.html %}
~~~ sql
> CREATE TABLE strings (a STRING PRIMARY KEY, b STRING(4), c TEXT);
~~~
{% include_cached copy-clipboard.html %}
~~~ sql
> SHOW COLUMNS FROM strings;
~~~
~~~
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
--------------+-----------+-------------+----------------+-----------------------+-----------+------------
a | STRING | false | NULL | | {primary} | false
b | STRING(4) | true | NULL | | {primary} | false
c | STRING | true | NULL | | {primary} | false
(3 rows)
~~~
{% include_cached copy-clipboard.html %}
~~~ sql
> INSERT INTO strings VALUES ('a1b2c3d4', 'e5f6', 'g7h8i9');
~~~
{% include_cached copy-clipboard.html %}
~~~ sql
> SELECT * FROM strings;
~~~
~~~
a | b | c
+----------+------+--------+
a1b2c3d4 | e5f6 | g7h8i9
(1 row)
~~~
{% include_cached copy-clipboard.html %}
~~~ sql
> CREATE TABLE aliases (a STRING PRIMARY KEY, b VARCHAR, c CHAR);
~~~
{% include_cached copy-clipboard.html %}
~~~ sql
> SHOW COLUMNS FROM aliases;
~~~
~~~
column_name | data_type | is_nullable | column_default | generation_expression | indices | is_hidden
+-------------+-----------+-------------+----------------+-----------------------+-------------+-----------+
a | STRING | false | NULL | | {primary} | false
b | VARCHAR | true | NULL | | {primary} | false
c | CHAR | true | NULL | | {primary} | false
(3 rows)
~~~
## Supported casting and conversion
`STRING` values can be [cast]({% link {{ page.version.version }}/data-types.md %}#data-type-conversions-and-casts) to any of the following data types:
Type | Details
-----|--------
`ARRAY` | Requires supported [`ARRAY`]({% link {{ page.version.version }}/array.md %}) string format, e.g., `'{1,2,3}'`.
Note that string literals can be implicitly cast to any supported `ARRAY` data type except [`BYTES`]({% link {{ page.version.version }}/bytes.md %}), [`ENUM`]({% link {{ page.version.version }}/enum.md %}), [`JSONB`]({% link {{ page.version.version }}/jsonb.md %}), [`SERIAL`](serial.html), and the [spatial data types](architecture/glossary.html#data-types) `Box2D`, `GEOGRAPHY`, and `GEOMETRY`.
`BIT` | Requires supported [`BIT`]({% link {{ page.version.version }}/bit.md %}) string format, e.g., `'101001'` or `'xAB'`.
`BOOL` | Requires supported [`BOOL`]({% link {{ page.version.version }}/bool.md %}) string format, e.g., `'true'`.
`BYTES` | For more details, [see here]({% link {{ page.version.version }}/bytes.md %}#supported-conversions).
`CITEXT` | Preserves the original letter case, but value comparisons are treated case-insensitively. Refer to [`CITEXT`]({% link {{ page.version.version }}/citext.md %}).
`DATE` | Requires supported [`DATE`]({% link {{ page.version.version }}/date.md %}) string format, e.g., `'2016-01-25'`.
`DECIMAL` | Requires supported [`DECIMAL`]({% link {{ page.version.version }}/decimal.md %}) string format, e.g., `'1.1'`.
`FLOAT` | Requires supported [`FLOAT`]({% link {{ page.version.version }}/float.md %}) string format, e.g., `'1.1'`.
`INET` | Requires supported [`INET`]({% link {{ page.version.version }}/inet.md %}) string format, e.g, `'192.168.0.1'`.
`INT` | Requires supported [`INT`]({% link {{ page.version.version }}/int.md %}) string format, e.g., `'10'`.
`INTERVAL` | Requires supported [`INTERVAL`]({% link {{ page.version.version }}/interval.md %}) string format, e.g., `'1h2m3s4ms5us6ns'`.
`JSONPATH` | Requires a valid [`JSONPath`]({% link {{ page.version.version }}/jsonpath.md %}) expression string, e.g., `'$'` or `'$.players[*] ? (@.stats.ppg > 30)'`.
`LTREE` | Requires supported [`LTREE`]({% link {{ page.version.version }}/ltree.md %}) string format, e.g., `'Animals.Mammals.Carnivora'`.
`TIME` | Requires supported [`TIME`]({% link {{ page.version.version }}/time.md %}) string format, e.g., `'01:22:12'` (microsecond precision).
`TIMESTAMP` | Requires supported [`TIMESTAMP`]({% link {{ page.version.version }}/timestamp.md %}) string format, e.g., `'2016-01-25 10:10:10.555555'`.
`TSQUERY` | Requires supported [`TSQUERY`]({% link {{ page.version.version }}/tsquery.md %}) string format, e.g., `'Requires & supported & TSQUERY & string & format'`.
Note that casting a string to a `TSQUERY` will not normalize the tokens into lexemes. To do so, [use `to_tsquery()`, `plainto_tsquery()`, or `phraseto_tsquery()`](#convert-string-to-tsquery).
`TSVECTOR` | Requires supported [`TSVECTOR`]({% link {{ page.version.version }}/tsvector.md %}) string format, e.g., `'Requires supported TSVECTOR string format.'`.
Note that casting a string to a `TSVECTOR` will not normalize the tokens into lexemes. To do so, [use `to_tsvector()`](#convert-string-to-tsvector).
`UUID` | Requires supported [`UUID`]({% link {{ page.version.version }}/uuid.md %}) string format, e.g., `'63616665-6630-3064-6465-616462656562'`.
### `STRING` vs. `BYTES`
While both `STRING` and `BYTES` can appear to have similar behavior in many situations, one should understand their nuance before casting one into the other.
`STRING` treats all of its data as characters, or more specifically, Unicode code points. `BYTES` treats all of its data as a byte string. This difference in implementation can lead to dramatically different behavior. For example, let's take a complex Unicode character such as ☃ ([the snowman emoji](https://emojipedia.org/snowman/)):
{% include_cached copy-clipboard.html %}
~~~ sql
> SELECT length('☃'::string);
~~~
~~~
length
+--------+
1
~~~
~~~ sql
> SELECT length('☃'::bytes);
~~~
~~~
length
+--------+
3
~~~
In this case, [`LENGTH(string)`]({% link {{ page.version.version }}/functions-and-operators.md %}#string-and-byte-functions) measures the number of Unicode code points present in the string, whereas [`LENGTH(bytes)`]({% link {{ page.version.version }}/functions-and-operators.md %}#string-and-byte-functions) measures the number of bytes required to store that value. Each character (or Unicode code point) can be encoded using multiple bytes, hence the difference in output between the two.
#### Translate literals to `STRING` vs. `BYTES`
A literal entered through a SQL client will be translated into a different value based on the type:
+ `BYTES` gives a special meaning to the pair `\x` at the beginning, and translates the rest by substituting pairs of hexadecimal digits to a single byte. For example, `\xff` is equivalent to a single byte with the value of 255. For more information, see [SQL Constants: String literals with character escapes]({% link {{ page.version.version }}/sql-constants.md %}#string-literals-with-character-escapes).
+ `STRING` does not give a special meaning to `\x`, so all characters are treated as distinct Unicode code points. For example, `\xff` is treated as a `STRING` with length 4 (`\`, `x`, `f`, and `f`).
### Cast hexadecimal digits to `BIT`
You can cast a `STRING` value of hexadecimal digits prefixed by `x` or `X` to a `BIT` value.
For example:
{% include_cached copy-clipboard.html %}
~~~ sql
> SELECT 'XAB'::BIT(8)
~~~
~~~
bit
------------
10101011
(1 row)
~~~
### Concatenate `STRING` values with values of other types
`STRING` values can be concatenated with any non-`ARRAY`, non-`NULL` type, resulting in a `STRING` value.
For example:
{% include_cached copy-clipboard.html %}
~~~ sql
> SELECT 1 || 'item';
~~~
~~~
?column?
------------
1item
(1 row)
~~~
{% include_cached copy-clipboard.html %}
~~~ sql
> SELECT true || 'item';
~~~
~~~
?column?
------------
titem
(1 row)
~~~
Concatenating a `STRING` value with a [`NULL` value]({% link {{ page.version.version }}/null-handling.md %}) results in a `NULL` value.
For example:
{% include_cached copy-clipboard.html %}
~~~ sql
> SELECT NULL || 'item';
~~~
~~~
?column?
------------
NULL
(1 row)
~~~
### Convert `STRING` to `TIMESTAMP`
You can use the [`parse_timestamp()` function]({% link {{ page.version.version }}/functions-and-operators.md %}) to parse strings in `TIMESTAMP` format.
{% include_cached copy-clipboard.html %}
~~~ sql
SELECT parse_timestamp ('2022-05-28T10:53:25.160Z');
~~~
~~~
parse_timestamp
--------------------------
2022-05-28 10:53:25.16
(1 row)
~~~
### Convert `STRING` to `TSVECTOR`
You can use the [`to_tsvector()` function]({% link {{ page.version.version }}/functions-and-operators.md %}#full-text-search-functions) to parse strings in [`TSVECTOR`]({% link {{ page.version.version }}/tsvector.md %}) format. This will normalize the tokens into lexemes, and will add an integer position to each lexeme.
{% include_cached copy-clipboard.html %}
~~~ sql
SELECT to_tsvector('How do trees get on the internet?');
~~~
~~~
to_tsvector
---------------------------------
'get':4 'internet':7 'tree':3
~~~
For more information on usage, see [Full-Text Search]({% link {{ page.version.version }}/full-text-search.md %}).
### Convert `STRING` to `TSQUERY`
You can use the [`to_tsquery()`, `plainto_tsquery()`, and `phraseto_tsquery()` functions]({% link {{ page.version.version }}/functions-and-operators.md %}#full-text-search-functions) to parse strings in [`TSQUERY`]({% link {{ page.version.version }}/tsquery.md %}) format. This will normalize the tokens into lexemes.
When using `to_tsquery()`, the string input must be formatted as a [`TSQUERY`]({% link {{ page.version.version }}/tsquery.md %}#syntax), with operators separating tokens.
{% include_cached copy-clipboard.html %}
~~~ sql
SELECT to_tsquery('How & do & trees & get & on & the & internet?');
~~~
~~~
to_tsquery
-------------------------------
'tree' & 'get' & 'internet'
~~~
For more information on usage, see [Full-Text Search]({% link {{ page.version.version }}/full-text-search.md %}).
## See also
- [Data Types]({% link {{ page.version.version }}/data-types.md %})
- [String literal syntax]({% link {{ page.version.version }}/sql-constants.md %}#string-literals)