--- layout: page title: PostgreSQL Types --- ## Overview The following are notes by Emil Lenngren on PostgreSQL wire representation of types: ~~~ bool: text: t or f binary: a byte: 1 or 0 bytea: text: either \x followed by hex-characters (lowercase by default), or plain characters, where non-printable characters (between 0x20 and 0x7e, inclusive) are written as \nnn (octal) and \ is written as \\ binary: the bytes as they are char: This type holds a single char/byte. (Not to be confused with bpchar (blank-padded char) which is PostgreSQL's alias to the SQL standard's char). The char may be the null-character text: the char as a byte, encoding seems to be ignored binary: the char as a byte name: A null-padded string of NAMEDATALEN (currently 64) bytes (the last byte must be a null-character). Used in pg catalog. text: the name as a string binary: the name as a string int2/int4/int8: text: text representation in base 10 binary: binary version of the integer int2vector: non-null elements, 0-indexed, 1-dim text: 1 2 3 4 binary: same as int2[] oidvector: non-null elements, 0-indexed, 1-dim text: 1 2 3 4 binary: same as oid[] regproc: internally just an OID (UInt32) text: -, name of procedure, or numeric if not found binary: only the OID in binary regprocedure/regoper/regoperator/regclass/regconfig/regdictionary: similar to regproc text: text: the string as it is binary: the string as it is oid: A 32-bit unsigned integer used for internal object identification. text: the text-representation of this integer in base 10 binary: the UInt32 tid: tuple id Internally a tuple of a BlockNumber (UInt32) and an OffsetNumber (UInt16) text: (blockNumber,offsetNumber) binary: the block number in binary followed by offset number in binary xid: transaction id Internally just a TransactionId (UInt32) text: the number binary: the number in binary cid: command id Internally just a CommandId (UInt32) text: the number binary: the number in binary json: json text: the json an text binary: the json as text jsonb: json internally stored in an efficient binary format text: the json as text binary: An Int32 (version number, currently 1), followed by data (currently just json as text) xml: Xml. It is probably most efficient to use the text format, especially when receiving from client. text: the xml as text (when sent from the server: encoding removed, when receiving: assuming database encoding) binary: the xml as text (when sent from the server: in the client's specified encoding, when receiving: figures out itself) pg_node_tree: used as type for the column typdefaultbin in pg_type does not accept input text: text binary: text smgr: storage manager can only have the value "magnetic disk" text: magnetic disk binary: not available point: A tuple of two float8 text: (x,y) The floats are interpreted with the C strtod function. The floats are written with the snprintf function, with %.*g format. NaN/-Inf/+Inf can be written, but not interpretability depends on platform. The extra_float_digits setting is honored. For linux, NaN, [+-]Infinity, [+-]Inf works, but not on Windows. Windows also have other output syntax for these special numbers. (1.#QNAN for example) binary: the two floats lseg: A tuple of two points text: [(x1,y1),(x2,y2)] see point for details binary: the four floats in the order x1, y1, x2, y2 path: A boolean whether the path is opened or closed + a vector of points. text: [(x1,y1),...] for open path and ((x1,y1),...) for closed paths. See point for details. binary: first a byte indicating open (0) or close (1), then the number of points (Int32), then a vector of points box: A tuple of two points. The coordinates will be reordered so that the first is the upper right and the second is the lower left. text: (x1,y1),(x2,y2) see point for details binary: the four floats in the order x1, y1, x2, y2 (doesn't really matter since they will be reordered) polygon: Same as path but with two differences: is always closed and internally stores the bounding box. text: same as closed path binary: the number of points (Int32), then a vector of points line (version 9.4): Ax + By + C = 0. Stored with three float8. Constraint: A and B must not both be zero (only checked on text input, not binary). text: {A,B,C} see point for details about the string representation of floats. Can also use the same input format as a path with two different points, representing the line between those. binary: the three floats circle: <(x,y),r> (center point and radius), stored with three float8. text: <(x,y),r> see point for details about the string representation of floats. binary: the three floats x, y, r in that order float4/float8: text: (leading/trailing whitespace is skipped) interpreted with the C strtod function, but since it has problems with NaN, [+-]Infinity, [+-]Inf, those strings are identified (case-insensitively) separately. when outputting: NaN, [+-]Infinity is treated separately, otherwise the string is printed with snprintf %.*g and the extra_float_digits setting is honored. binary: the float abstime: A unix timestamp stored as a 32-bit signed integer with seconds-precision (seconds since 1970-01-01 00:00:00), in UTC Has three special values: Invalid (2^31-1), infinity (2^31-3), -infinity (-2^31) text: same format as timestamptz, or "invalid", "infinity", "-infinity" binary: Int32 reltime: A time interval with seconds-precision (stored as an 32-bit signed integer) text: same as interval binary: Int32 tinterval: Consists of a status (Int32) and two abstimes. Status is valid (1) iff both abstimes are valid, else 0. Note that the docs incorrectly states that ' is used as quote instead of " text: ["" ""] binary: Int32 (status), Int32 (abstime 1), Int32 (abstime 2) unknown: text: text binary: text money: A 64-bit signed integer. For example, $123.45 is stored as the integer 12345. Number of fraction digits is locale-dependent. text: a locale-depedent string binary: the raw 64-bit integer macaddr: 6 bytes text: the 6 bytes in hex (always two characters per byte) separated by : binary: the 6 bytes appearing in the same order as when written in text inet/cidr: Struct of Family (byte: ipv4=2, ipv6=3), Netmask (byte with number of bits in the netmask), Ipaddr bytes (16) Text: The IP-address in text format and /netmask. /netmask is omitted in inet if the netmask is the whole address. Binary: family byte, netmask byte, byte (cidr=1, inet=0), number of bytes in address, bytes of the address aclitem: Access list item used in pg_class Text: Something like postgres=arwdDxt/postgres Binary: not available bpchar: Blank-padded char. The type modifier is used to blank-pad the input. text: text binary: text varchar: Variable-length char. The type modifier is used to check the input's length. text: text binary: text date: A signed 32-bit integer of a date. 0 = 2000-01-01. Infinity: INT_MAX, -Infinity: INT_MIN Text: Date only using the specified date style Binary: Int32 time: A signed 64-bit integer representing microseconds from 00:00:00.000000. (Legacy uses 64-bit float). Negative values are not allowed. Max value is 24:00:00.000000. text: hh:mm:ss or hh:mm:ss.ffffff where the fraction part is between 1 and 6 digits (trailing zeros are not written) binary: the 64-bit integer timetz: A struct of Time: A signed 64-bit integer representing microseconds from 00:00:00.000000. (Legacy uses 64-bit float). Negative values are not allowed. Max value is 24:00:00.000000. Zone: A signed 32-bit integer representing the zone (in seconds). Note that the sign is inverted. So GMT+1h is stored as -1h. text: hh:mm:ss or hh:mm:ss.ffffff where the fraction part is between 1 and 6 digits (trailing zeros are not written) binary: the 64-bit integer followed by the 32-bit integer timestamp: A signed 64-bit integer representing microseconds from 2000-01-01 00:00:00.000000 Infinity is LONG_MAX and -Infinity is LONG_MIN (Infinity would be 294277-01-09 04:00:54.775807) Earliest possible timestamp is 4714-11-24 00:00:00 BC. Even earlier would be possible, but due to internal calculations those are forbidden. text: dependent on date style binary: the 64-bit integer timestamptz: A signed 64-bit integer representing microseconds from 2000-01-01 00:00:00.000000 UTC. (Time zone is not stored). Infinity is LONG_MAX and -Infinity is LONG_MIN text: first converted to the time zone in the db settings, then printed according to the date style binary: the 64-bit integer interval: A struct of Time (Int64): all time units other than days, months and years (microseconds) Day (Int32): days, after time for alignment Month (Int32): months and years, after time for alignment text: Style dependent, but for example: "-11 mons +15435 days -11111111:53:00" binary: all fields in the struct bit/varbit: First a signed 32-bit integer containing the number of bits (negative length not allowed). Then all the bits in big end first. So a varbit of length 1 has the first (and only) byte set to either 0x80 or 0x00. Last byte is assumed (and is automatically zero-padded in recv) to be zero-padded. text: when sending from backend: all the bits, written with 1s and 0s. when receiving from client: (optionally b or B followed by) all the bits as 1s and 0s, or a x or X followed by hexadecimal digits (upper- or lowercase), big endian first. binary: the 32-bit length followed by the bytes containing the bits numeric: A variable-length numeric value, can be negative. text: NaN or first - if it is negative, then the digits with . as decimal separator binary: first a header of 4 16-bit signed integers: number of digits in the digits array that follows (can be 0, but not negative), weight of the first digit (10000^weight), can be both negative, positive or 0, sign: negative=0x4000, positive=0x0000, NaN=0xC000 dscale: number of digits (in base 10) to print after the decimal separator then the array of digits: The digits are stored in base 10000, where each digit is a 16-bit integer. Trailing zeros are not stored in this array, to save space. The digits are stored such that, if written as base 10000, the decimal separator can be inserted between two digits in base 10000, i.e. when this is to be printed in base 10, only the first digit in base 10000 can (possibly) be printed with less than 4 characters. Note that this does not apply for the digits after the decimal separator; the digits should be printed out in chunks of 4 characters and then truncated with the given dscale. refcursor: uses the same routines as text record: Describes a tuple. Is also the "base class" for composite types (i.e. it uses the same i/o functions). text: ( followed by a list of comma-separated text-encoded values followed by ). Empty element means null. Quoted with " and " if necessary. " is escaped with "" and \ is escaped with \\ (this differs from arrays where " is escaped with \"). Must be quoted if it is an empty string or contains one of "\,() or a space. binary: First a 32-bit integer with the number of columns, then for each column: An OID indicating the type of the column The length of the column (32-bit integer), or -1 if null The column data encoded as binary cstring: text/binary: all characters are sent without the trailing null-character void: Used for example as return value in SELECT * FROM func_returning_void() text: an empty string binary: zero bytes uuid: A 16-byte uuid. text: group of 8, 4, 4, 4, 12 hexadecimal lower-case characters, separated by -. The first byte is written first. It is allowed to surround it with {}. binary: the 16 bytes txid_snapshot: (txid is a UInt64) A struct of UInt32 nxip (size of the xip array) txid xmin (no values in xip is smaller than this) txid xmax (no values in xip is larger than or equal this) txid[] xip (is ordered in ascending order) text: xmin:xmax:1,2,3,4 binary: all fields in the structure tsvector: Used for text searching. Example of tsvector: 'a':1,6,10 'on':5 'and':8 'ate':9A 'cat':3 'fat':2,11 'mat':7 'rat':12 'sat':4 Max length for each lexeme string is 2046 bytes (excluding the trailing null-char) The words are sorted when parsed, and only written once. Positions are also sorted and only written once. For some reason, the unique check does not seem to be made for binary input, only text input... text: As seen above. ' is escaped with '' and \ is escaped with \\. binary: UInt32 number of lexemes for each lexeme: lexeme text in client encoding, null-terminated UInt16 number of positions for each position: UInt16 WordEntryPos, where the most significant 2 bits is weight, and the 14 least significant bits is pos (can't be 0). Weights 3,2,1,0 represent A,B,C,D tsquery: A tree with operands and operators (&, |, !). Operands are strings, with optional weight (bitmask of ABCD) and prefix search (yes/no, written with *). text: the tree written in infix notation. Example: ( 'abc':*B | 'def' ) & !'ghi' binary: the tree written in prefix notation: First the number of tokens (a token is an operand or an operator). For each token: UInt8 type (1 = val, 2 = oper) followed by For val: UInt8 weight + UInt8 prefix (1 = yes / 0 = no) + null-terminated string, For oper: UInt8 oper (1 = not, 2 = and, 3 = or, 4 = phrase). In case of phrase oper code, an additional UInt16 field is sent (distance value of operator). Default is 1 for <->, otherwise the n value in ''. enum: Simple text gtsvector: GiST for tsvector. Probably internal type. int4range/numrange/tsrange/tstzrange/daterange/int8range and user-defined range types: /* A range's flags byte contains these bits: */ #define RANGE_EMPTY 0x01 /* range is empty */ #define RANGE_LB_INC 0x02 /* lower bound is inclusive */ #define RANGE_UB_INC 0x04 /* upper bound is inclusive */ #define RANGE_LB_INF 0x08 /* lower bound is -infinity */ #define RANGE_UB_INF 0x10 /* upper bound is +infinity */ #define RANGE_LB_NULL 0x20 /* lower bound is null (NOT USED) */ #define RANGE_UB_NULL 0x40 /* upper bound is null (NOT USED) */ #define RANGE_CONTAIN_EMPTY 0x80/* marks a GiST internal-page entry whose * subtree contains some empty ranges */ A range has no lower bound if any of RANGE_EMPTY, RANGE_LB_INF (or RANGE_LB_NULL, not used anymore) is set. The same applies for upper bounds. text: A range with RANGE_EMPTY is just written as the string "empty". Inclusive bounds are written with [ and ], else ( and ) is used. The two values are comma-separated. Missing bounds are written as an empty string (without quotes). Each value is quoted with " if necessary. Quotes are necessary if the string is either the empty string or contains "\,()[] or spaces. " is escaped with "" and \ is escaped with \\. Example: [18,21] binary: First the flag byte. Then, if has lower bound: 32-bit length + binary-encoded data. Then, if has upper bound: 32-bit length + binary-encoded data. hstore: Key/value-store. Both keys and values are strings. text: Comma-space separated string, where each item is written as "key"=>"value" or "key"=>NULL. " and \ are escaped as \" and \\. Example: "a"=>"b", "c"=>NULL, "d"=>"q" binary: Int32 count for each item: Int32 keylen string of the key (not null-terminated) Int32 length of item (or -1 if null) the item as a string ghstore: internal type for indexing hstore domain types: mapped types used in information_schema: cardinal_number: int4 (must be nonnegative or null) character_data: varchar sql_identifier: varchar time_stamp: timestamptz yes_or_no: varchar(3) (must be "YES" or "NO" or null) intnotnull: when an int4 is cast to this type, it is checked that the int4 is not null, but it still returns an int4 and not intnotnull... ~~~