This is a Python 3 library that parses and serialises HTTP Structured Fields, as specified in RFC 9651.
Textual HTTP headers can be parsed by calling parse; the return value is a data structure that represents the field value.
>>> from http_sf import parse, ser
>>> parse(b"foo; a=1, bar; b=2", tltype="dictionary")
{'foo': (True, {'a': 1}), 'bar': (True, {'b': 2})}
parse() takes a bytes-like object as the first argument. If you want to parse a string, please .encode() it first.
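Since parse() expects bytes, a small wrapper can accept either form. The sketch below is not part of http_sf: to_field_bytes is a hypothetical helper name, and ASCII is an assumed encoding, chosen because Structured Field values are ASCII on the wire.

```python
def to_field_bytes(value):
    """Return a field value as bytes, ready to hand to a bytes-only parser.

    Hypothetical helper, not part of http_sf. ASCII is an assumption:
    non-ASCII text raises UnicodeEncodeError instead of being passed along.
    """
    if isinstance(value, str):
        return value.encode("ascii")
    # bytes, bytearray and memoryview all convert cleanly.
    return bytes(value)
```

With the library installed, the result would be passed along as parse(to_field_bytes("foo; a=1"), tltype="item").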
Because the library needs to know which kind of field it is, you need to hint this when calling parse. There are two ways to do this:
- Using a tltype parameter, whose value should be one of 'dictionary', 'list', or 'item'.
- Using a name parameter to indicate a field name that has a registered type, per the retrofit draft.
Note that if you use name, a KeyError will be raised if the type associated with the name isn't known, unless you also pass a tltype as a fallback.
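The name and tltype hints described above can be combined in a thin wrapper. This is only a sketch: parse_by_name and fallback_tltype are hypothetical names, and the parser is passed in as parse_fn (http_sf.parse in real use) so that the wrapper can be exercised with a stand-in.

```python
def parse_by_name(parse_fn, value, name, fallback_tltype=None):
    """Parse value using the type registered for the field name.

    parse_fn is the parser to call; it must accept the name and tltype
    keyword arguments described above. Returns (result, None) on success,
    or (None, message) when the name has no registered type and no
    fallback was supplied (signalled by KeyError).
    """
    kwargs = {"name": name}
    if fallback_tltype is not None:
        kwargs["tltype"] = fallback_tltype
    try:
        return parse_fn(value, **kwargs), None
    except KeyError:
        return None, f"no registered type for field {name!r}"
```

Returning a message rather than re-raising is a design choice for this sketch; callers that prefer exceptions can simply call parse directly.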
When parsing fails, a StructuredFieldError (a subclass of ValueError) is raised. This exception has attributes that can be used to debug the error:
- position: The character offset in the input bytes where the error was detected.
- offending_char: The character at the position where the error was detected.
- context: If the error occurred within a Dictionary or Parameter value, the key name.
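These attributes can be folded into a single diagnostic line. The helper below is a sketch, not part of http_sf: describe_error is a hypothetical name, and it assumes that an attribute which does not apply is either missing or None.

```python
def describe_error(exc):
    """Build a one-line diagnostic from a parse error.

    Reads position, offending_char and context with getattr, so an
    attribute that is missing or None is simply skipped (an assumption
    about how the exception reports "not applicable").
    """
    parts = [str(exc)]
    position = getattr(exc, "position", None)
    if position is not None:
        parts.append(f"at offset {position}")
    offending = getattr(exc, "offending_char", None)
    if offending is not None:
        parts.append(f"near {offending!r}")
    context = getattr(exc, "context", None)
    if context is not None:
        parts.append(f"in key {context!r}")
    return " ".join(parts)
```

Because it only uses getattr, the helper also accepts a plain ValueError, in which case it returns just the message.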
>>> from http_sf import parse, StructuredFieldError
>>> try:
...     parse(b"foo; bar", tltype="item")
... except StructuredFieldError as e:
...     print(f"Error at {e.position}: {e}")
...
Error at 8: Parameter value definition expected
By default, duplicate keys in Dictionaries and Parameters are overwritten by the last value, as per the specification. If you wish to detect when this happens, you can set a callback:
>>> from http_sf import parse
>>> def complain(key, context):
...     print(f"Duplicate key: {key} in {context}")
...
>>> parse(b"a=1, a=2", tltype="dictionary", on_duplicate_key=complain)
Duplicate key: a in dictionary
{'a': (2, {})}
In the returned data, Dictionaries are represented as Python dictionaries; Lists are represented as Python lists, and Items are the bare type.
Bare types are represented using the following Python types:
- Integers: int
- Decimals: float
- Strings: str
- Tokens: http_sf.Token (a UserString)
- Byte Sequences: bytes
- Booleans: bool
- Dates: datetime.datetime
- Display Strings: http_sf.DisplayString (a UserString)
Inner Lists are represented as lists as well.
Structured Types that can have parameters (including Dictionary and List members as well as singular Items and Inner Lists) are represented as a tuple of (value, parameters) where parameters is a dictionary.
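Because every Dictionary member arrives as a (value, parameters) tuple, ordinary tuple unpacking is enough to walk a parsed field. The sketch below works on a literal in the shape shown in the first example of this document; flatten_dictionary is a hypothetical helper name, not a library function.

```python
def flatten_dictionary(parsed):
    """Turn a parsed Dictionary into (key, value, parameters) rows.

    Assumes each member is a (value, parameters) tuple, as described
    above. The parameters dict is copied so callers can modify the rows
    without touching the parsed structure.
    """
    rows = []
    for key, (value, params) in parsed.items():
        rows.append((key, value, dict(params)))
    return rows


# Same shape as the result of parsing b"foo; a=1, bar; b=2" as a dictionary.
example = {"foo": (True, {"a": 1}), "bar": (True, {"b": 2})}
rows = flatten_dictionary(example)
```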
So, a single item that's a Token with one parameter whose value is an integer will be represented like this:
>>> parse(b"foo; a=1", tltype="item")
(Token("foo"), {'a': 1})
Note that even if there aren't parameters, a tuple will still be returned, as in some items on this List:
>>> parse(b"a, b; q=5, c", tltype="list")
[(Token("a"), {}), (Token("b"), {'q': 5}), (Token("c"), {})]
To serialise that data structure back to a textual Structured Field, use ser:
>>> field = parse(b"a, b; q=5, c", tltype="list")
>>> ser(field)
'a, b;q=5, c'
When using ser, the parameters of an Item or Inner List can be omitted if there are none; for example:
>>> structure = [5, 6, (7, {"with": "param"})]
>>> ser(structure)
'5, 6, 7;with="param"'
Note that ser produces a string, not a bytes-like object.
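Since ser accepts members without parameters while parse always returns tuples, a hand-built structure and a parsed one can differ in shape even when they describe the same field. The sketch below pads bare members into the (value, parameters) form so the two can be compared. with_params and normalise_list are hypothetical names, the check relies on parameterised members being tuples (Inner Lists are lists, so they are not mistaken for them), and members inside Inner Lists are left untouched.

```python
def with_params(member):
    """Return member in the (value, parameters) shape that parse produces."""
    if isinstance(member, tuple):
        # Already a (value, parameters) pair.
        return member
    # Bare value or Inner List without parameters: add an empty dict.
    return (member, {})


def normalise_list(members):
    """Apply with_params to every top-level member of a List."""
    return [with_params(m) for m in members]


structure = [5, 6, (7, {"with": "param"})]
normalised = normalise_list(structure)
```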
You can validate and examine the data model of a field value by calling the library on the command line, using -d, -l and -i to denote dictionaries, lists or items respectively; e.g.,
> python3 -m http_sf -i "foo;bar=baz"
[
  {
    "__type": "token",
    "value": "foo"
  },
  {
    "bar": {
      "__type": "token",
      "value": "baz"
    }
  }
]
or:
> python3 -m http_sf -i "foo;&bar=baz"
FAIL: Key does not begin with lcalpha or * at: &bar=baz
Alternatively, you can pass the field name with the -n option, provided that it is a compatible retrofit field:
> python3 -m http_sf -n "Cache-Control" "max-age=40, must-revalidate"
{
  "max-age": [
    40,
    {}
  ],
  "must-revalidate": [
    true,
    {}
  ]
}
Note that if successful, the output is in the JSON format used by the test suite.
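The command-line output can be read back with the standard json module. The sketch below handles only the "token" marker that appears in the output above and leaves any other "__type" object as a plain dict. TokenStandIn, decode_cli_json and _revive are hypothetical names, with TokenStandIn standing in for http_sf.Token so the example has no dependency beyond the standard library.

```python
import json


class TokenStandIn(str):
    """Hypothetical stand-in for http_sf.Token (not the library's class)."""


def _revive(obj):
    # Called by json.loads for every decoded JSON object, innermost first.
    if obj.get("__type") == "token":
        return TokenStandIn(obj["value"])
    # Anything else (parameters, dictionaries, unknown markers) is kept as-is.
    return obj


def decode_cli_json(text):
    """Decode CLI output, turning token markers into TokenStandIn values."""
    return json.loads(text, object_hook=_revive)


item = decode_cli_json(
    '[{"__type": "token", "value": "foo"},'
    ' {"bar": {"__type": "token", "value": "baz"}}]'
)
```

Field keys must begin with a lowercase letter or *, as the failure message above indicates, so a "__type" key cannot collide with a real Dictionary or Parameter key.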