Standardizing conversions to string #14090
Description
Problem
There are many places in Nushell where values are turned into strings for one reason or another. Over time, this has led to many different implementations and behaviors. Most recently, this issue was brought up in #14086. Going forward, we should standardize the process(es) for turning values into strings and document them somewhere.
My current hypothesis (I need to go through the codebase and confirm) is that there are two or three main use cases for turning a value into a string:
1. The value needs to be displayed in a human-readable format (e.g., in errors, for `table`, or for `format`). This should probably respect the current locale and the formats specified in the config.
2. The value needs to be converted to string data for some other programmatic purpose (e.g., `into string`). This should probably have the same output regardless of config settings or environment variables.
3. The value needs to be sent to an external source, e.g., as an external command argument, as an OS environment variable, or as input to an external command. This is similar to 2), except that I think certain conversions should be disallowed as they are most likely an error, and having different behavior for other types makes more sense for this use case. To differentiate it from `into string`, `to text` can take up this role.
I think we can create a separate issue for 1) if necessary. I would rather start with 2) and 3), as I think 2) currently has the least uniform behavior and 3) is related to 2).
Rough Proposal
For 2), my current proposal is that we, for the most part, follow `to nuon`. The difference is that all values are first converted to strings and then output as such (e.g., binary values are not formatted as binary literals), i.e., there is no guarantee that type information is preserved. In particular, types should be converted as follows:
| Type | Conversion |
|---|---|
| bool | Use Rust's built-in `Display` implementation |
| int | Display |
| float | Display |
| filesize | Automatically detect unit precision. Leaning towards metric units so that decimal places can be used without losing precision. E.g., `(1MB + 1B) \| into string == '1.000001MB'` |
| duration | Use ISO 8601 durations but also add a negative sign at the start if necessary. |
| date | Use `.to_rfc3339()` from chrono. This follows RFC 3339 but also allows out of range years as specified by ISO 8601 (otherwise some valid date values would not be representable as strings). To be precise, automatic subsecond precision is used and `+00:00` is used instead of `Z`. |
| string | Simply used as is. |
| glob | Used as is. |
| binary | Used as is if valid UTF-8. |
| cellpath | Prefixed with `$.` and with necessary quoting/escaping for each path member. |
| closure | Error/disallowed |
| range | `<start>..<end>` or `<start>..<next>..<end>` |
| nothing | `null` |
| list | `[ val1, val2 ]`, adding quotes either where necessary or always (for strings, globs, and binary). |
| record | `{ foo: bar, baz: val }`, adding quotes where necessary for columns/keys. |
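The filesize and duration rows are the least obvious ones in the table, so here is a rough sketch of how they could work. It assumes filesizes are stored as an `i64` byte count and durations as an `i64` nanosecond count. The function names and the exact unit spellings (`kB`, `MB`, ...) are hypothetical, and this is not how Nushell currently does it.

```rust
/// Hypothetical helper: format a byte count using metric (power-of-ten) units,
/// keeping every decimal digit so that no precision is lost.
fn filesize_to_string(bytes: i64) -> String {
    // (unit name, bytes per unit, decimal digits the remainder can need)
    const UNITS: [(&str, u64, usize); 5] = [
        ("PB", 1_000_000_000_000_000, 15),
        ("TB", 1_000_000_000_000, 12),
        ("GB", 1_000_000_000, 9),
        ("MB", 1_000_000, 6),
        ("kB", 1_000, 3),
    ];
    let sign = if bytes < 0 { "-" } else { "" };
    let abs = bytes.unsigned_abs(); // avoids overflow on i64::MIN
    for &(name, size, digits) in UNITS.iter() {
        if abs >= size {
            let (whole, rem) = (abs / size, abs % size);
            if rem == 0 {
                return format!("{sign}{whole}{name}");
            }
            // Zero-pad the remainder to the unit's width, then drop trailing zeros.
            let frac = format!("{rem:0digits$}");
            return format!("{sign}{whole}.{}{name}", frac.trim_end_matches('0'));
        }
    }
    format!("{sign}{abs}B")
}

/// Hypothetical helper: ISO 8601 duration with a leading `-` for negative values.
fn duration_to_iso8601(nanos: i64) -> String {
    let sign = if nanos < 0 { "-" } else { "" };
    let total = nanos.unsigned_abs();
    let (secs, sub) = (total / 1_000_000_000, total % 1_000_000_000);
    let days = secs / 86_400;
    let hours = secs % 86_400 / 3_600;
    let mins = secs % 3_600 / 60;
    let s = secs % 60;

    // Build the time part (everything after `T`), skipping zero components.
    let mut time = String::new();
    if hours > 0 {
        time += &format!("{hours}H");
    }
    if mins > 0 {
        time += &format!("{mins}M");
    }
    if sub > 0 {
        // Automatic subsecond precision: pad to 9 digits, trim trailing zeros.
        let frac = format!("{sub:09}");
        time += &format!("{s}.{}S", frac.trim_end_matches('0'));
    } else if s > 0 {
        time += &format!("{s}S");
    }

    let mut out = format!("{sign}P");
    if days > 0 {
        out += &format!("{days}D");
    }
    if !time.is_empty() {
        out += "T";
        out += &time;
    } else if days == 0 {
        out += "T0S"; // a zero duration still needs one component
    }
    out
}
```

With this sketch, `filesize_to_string(1_000_001)` gives `1.000001MB` (matching the example in the table) and `duration_to_iso8601(-1_500_000_000)` gives `-PT1.5S`. Power-of-ten units are what make the trailing-zero trimming exact. With binary units the fractional part would generally not terminate after a few decimal digits.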
Then, I propose 3) should be based off of 2) but with the following differences:
- Non-UTF-8 binary values are allowed.
- Cellpaths are disallowed? These are Nushell-specific data types, so it probably does not make sense to convert them to strings for the purpose of sending them to an external source.
- Similarly, ranges should probably be disallowed? Our range format is most likely not going to be interpreted properly by an external source.
- Records are disallowed too, for the same reason as the above two, especially since records have escaping concerns and allow arbitrary nesting. Instead, users should use `to json` or some other `to` command.
- Lists that contain none of the disallowed types above and also do not contain other lists are allowed. Each value is converted to a string based on 2), and a new line is added after each element. For conversion to OS environment variables, the OS-specific path separator is used instead (I believe `;` for Windows and `:` for Unix?).
- Filesizes and durations, I think, can be converted the same as in 2)...
- Nothing/`null` is a whole separate issue that might be worth discussing.
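To make the rules for 3) a bit more concrete, here is a minimal sketch of the list and disallowed-type handling. The `Value` enum is a heavily reduced, hypothetical stand-in for Nushell's real value type (cellpaths, closures, and nothing are left out), and `Target`, `path_list_separator`, and `to_external_bytes` are made-up names. It also treats ranges as disallowed, which the list above only suggests. The output is raw bytes rather than a `String` so that non-UTF-8 binary can pass through.

```rust
/// Hypothetical, reduced stand-in for Nushell's value type.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    String(String),
    Binary(Vec<u8>),
    Range { start: i64, end: i64 },
    List(Vec<Value>),
    Record(Vec<(String, Value)>),
}

/// Where the converted bytes are headed.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Target {
    /// Argument or input of an external command: newline after each list element.
    Command,
    /// OS environment variable: list elements joined with the path separator.
    EnvVar,
}

/// `;` on Windows, `:` elsewhere (an assumption, as in the proposal).
fn path_list_separator() -> u8 {
    if cfg!(windows) { b';' } else { b':' }
}

/// Convert a single non-list value; disallowed types become errors.
fn scalar_to_bytes(value: &Value) -> Result<Vec<u8>, String> {
    match value {
        Value::Int(i) => Ok(i.to_string().into_bytes()),
        Value::String(s) => Ok(s.as_bytes().to_vec()),
        Value::Binary(b) => Ok(b.clone()), // non-UTF-8 bytes pass through unchanged
        Value::Range { .. } => Err("ranges cannot be sent to an external source".to_string()),
        Value::Record(_) => {
            Err("records are not allowed; use `to json` or another `to` command".to_string())
        }
        Value::List(_) => Err("nested lists are not allowed".to_string()),
    }
}

/// Convert a value for an external source, flattening one level of list.
fn to_external_bytes(value: &Value, target: Target) -> Result<Vec<u8>, String> {
    let items = match value {
        Value::List(items) => items,
        scalar => return scalar_to_bytes(scalar),
    };
    let mut out = Vec::new();
    for (i, item) in items.iter().enumerate() {
        let bytes = scalar_to_bytes(item)?;
        match target {
            Target::Command => {
                out.extend(bytes);
                out.push(b'\n');
            }
            Target::EnvVar => {
                if i > 0 {
                    out.push(path_list_separator());
                }
                out.extend(bytes);
            }
        }
    }
    Ok(out)
}
```

In this sketch, a list of `"a"` and `2` becomes `a\n2\n` for `Target::Command` and `a:2` (or `a;2` on Windows) for `Target::EnvVar`, while a record, a range, or a list inside a list returns an error instead of being silently stringified.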
If you made it this far, thanks for reading lol. Would love to hear everybody's thoughts.