Conversation
039d298 to
b257c8f
Compare
b257c8f to
a481519
Compare
|
GNU testsuite comparison: |
|
Could you please add an integration test in tests/by-util/test_csplit.rs ? |
|
GNU testsuite comparison: |
tertsdiepraam
left a comment
There was a problem hiding this comment.
Nice work! This looks pretty good, but unfortunately duplicates a lot of the printf style formatting that is also present in other utils. If we keep separate implementations, we end up with incompatibilities and code that is hard to maintain.
The GNU docs say:
Use format as the output file name suffix. When this option is specified, the suffix string must include exactly one printf(3)-style conversion specification, possibly including format specification flags, a field width, a precision specification, or all of these kinds of modifiers. The format letter must convert a binary unsigned integer argument to readable form. The format letters ‘d’ and ‘i’ are aliases for ‘u’, and the ‘u’, ‘o’, ‘x’, and ‘X’ conversions are allowed. The entire format is given (with the current output file number) to sprintf(3) to form the file name suffixes for each of the individual output files in turn. If this option is used, the --digits option is ignored.
Source: https://www.gnu.org/software/coreutils/manual/html_node/csplit-invocation.html
So instead of reimplementing it, we should probably use our printf code. In particular, you could look how the printing code in seq is implemented. I think that is very similar, apart from the fact that we're printing integers instead of floats here.
See the seq code here:
coreutils/src/uu/seq/src/seq.rs
Lines 111 to 117 in 8e83b34
And the corresponding code from our formatting code:
Sorry, I wish we could have alerted you sooner about that functionality! Nevertheless, the tests you've made will be very useful to see if the printf version actually works!
|
edit: fixed bad copy-paste. c printf gives indeed Thanks for the insights ! I tested the existing However #include <stdio.h>
int main()
{
printf("before %#15.6o after", 42);
return 0;
}
gives:
before 000052 aftercargo run printf "before %#15.6o after", 42
before 0o52 after
givescurrent formatting
I may have missed something, but from what I see, the uucore/src/lib/features/format would need deeper changes to correct both printf and csplit implementation. |
|
Nice find! You're right about those incompatibilities, but we should fix those in By the way, I get this from both GNU Maybe we could open separate issues for each of these problems. |
|
@spineki, I'm gonna close this since I think we agree this should be handled by the |
|
Where were we with fixing the We've had:
What remains to be done? Edit: Looks like this is still broken: Expected: uutils: |
Summary
This pull request adds support for the
precisionsyntax in Csplit filename-suffix patterns (-bflag).Example: to create files where the suffix is 10-elements long, padded to the left (-), with 3-digit precision (.3), in lower hex "x"
Additionally, it allows user to use
#,0flags at the same time.Csplit allows split files to be named with fixed precision, independent from outer width.
Initial Issue
The initial example in the linked issue is now fixed. You can check the expected behaviour with the following
Remarks
C and Rust format syntaxes differ when dealing with zero precision, octal prefix or precision in case of alternative mode (C doesn't take the extra
0xorointo account while Rust does).To avoid redundant code, I split the extraction in two formatting steps. First, creating the inner formatted number, with a length of
precision, then, adding additional padding, filling and direction for a total length ofwidth.There are a lot of corner cases, don't hesitate to point something I may have missed !
Related Issue
#5709
Tests
I added additional tests to check patterns involving multiple flags or single-dot-precision syntax.
I modified some tests because octal notation is not formatted the same way in Rust or in C. (
0in front ofnin C,oin front ofnin Rust)