Conversation
|
Thank you @nojb for pushing this forward 👏. I'm excited. |
| and returns them in a new string. | ||
| Return [None] if the end of file is reached before [len] | ||
| characters have been read. *) | ||
|
|
There was a problem hiding this comment.
really_X is a terrible naming scheme IMO. Can we come up with a better naming scheme? How about input_n and input_string_n?
There was a problem hiding this comment.
I'm not married to the existing name, but input_n is not any better IMO.
There was a problem hiding this comment.
Let me make the case for it: it mirrors C's printf/snprintf scheme ie. it succinctly signifies that there is something different about the function, and it can easily be remembered that the difference concerns the number of input elements.
There was a problem hiding this comment.
please don't emulate anything from C's IO, it's terrible :p
I think Lwt has input_exactly or something to this taste, which is maybe more clear. I think input_n doesn't mean anything, and is worse than really_input which at least is familiar.
There was a problem hiding this comment.
input_exactly sounds a lot better than really_input. really_input sounds really... bad.
stdlib/in_channel.mli
Outdated
| Return [None] if the line read is not a valid representation of an integer. | ||
| *) | ||
|
|
||
| val read_line : unit -> string option |
There was a problem hiding this comment.
It is confusing to have both read_line : unit -> string option and read_lines : string -> binary:bool -> string list. Not sure what is the best way to handle this? 1) rename one of them, 2) omit one of them, 3) ...
There was a problem hiding this comment.
I agree. There's nothing to signify that read_lines concerns a file -- you have to go in and read the documentation. read_file_lines and read_file_all or file_read_lines and file_read_all would make more sense to me.
There was a problem hiding this comment.
Can we make read_* take t rather than unit? It's slightly more verbose when reading from stdin, but also explicit about where the data are coming from.
Or they could be named read_float_stdin, etc.
There was a problem hiding this comment.
@hcarty The functions taking t as argument are already there with prefix input_*. Another possibility is to put all the functions specialised to stdin, stdout and stderr in their own module, say Stdio.
stdlib/out_channel.mli
Outdated
| (** TBA *) | ||
|
|
||
| val write_all : filename:string -> string -> unit | ||
| (** TBA *) |
There was a problem hiding this comment.
write_all inconsistently has the filename arg, which I guess makes sense since you want to distinguish the arguments, but I think if it's here, it should go in the analogous functions.
I personally think these 2 should still be file_write_lines and file_write_all. A little longer, but you're saving on opening the file, and it helps clarify their purpose.
If you keep the argument name, it could be shortened to file for convenience -- it's obvious that it's a name from the type.
There was a problem hiding this comment.
I'd rather have the label remain filename because it's explicit. It can't be confused with file_content or anything else.
|
Given the amount of bikeshedding that went on on #640 it'd be nice to keep the discussion focused. The proposed API seems already pretty nice, so imho the discussion shouldn't go too far into proposing extensions to it. |
stdlib/in_channel.mli
Outdated
| This function has no effect under operating systems that | ||
| do not distinguish between text mode and binary mode. *) | ||
|
|
||
| (** {2 Input functions on filenames} *) |
stdlib/in_channel.mli
Outdated
| type 'a open_function = | ||
| ?create:bool -> (* default: false *) | ||
| ?binary:bool -> (* default: false *) | ||
| 'a |
There was a problem hiding this comment.
I understand the desire to have only the flags that matter for input channels, but I don't see this way of doing as a very forward looking way w.r.t. functional abstraction.
I'd rather have an open_flags type that can be extended later if needed. This allows clients to abstract an open-like function simply under the signature string -> In_channel.open_flags list -> In_channel.t. I suspect extending open_flags will break less code than extending open_function if that was to occur at some point in the future.
Another argument against this design is that this is the kind of information you may want to have in configuration files and/or read from user interfaces, have under other forms in your code to be mapped on that representation or simply have higher-level data structures that hold that information. In that case having a datatype for specifying it leads to much more convenient code to write for the client -- this especially applies for Out_channel.
Also understanding the TBA of with_file is that it will do the exception safety open/close dance, a bare open' : (string -> t) open_function should be provided.
Besides if you can create a file by reading you likely want to add a mode argument.
There was a problem hiding this comment.
To give a concrete example here is a function that uses its own defaults for (Unix) open flags but still allows the client to override them. It would be quite annoying to expose and implement this function with the type 'a open_function way of doing this PR suggests.
There was a problem hiding this comment.
Makes sense, I will switch back to having an open_flag type.
There was a problem hiding this comment.
I added a flag type back, with different flags for input and output channels. I left in the "non-blocking" flag even if it does not fit very well with an "option"-based API... I'm not sure what to do about that.
stdlib/in_channel.mli
Outdated
| val fold_lines : ('a -> string -> 'a) -> 'a -> t -> 'a | ||
| (** TBA *) | ||
|
|
||
| val input_all : t -> string |
There was a problem hiding this comment.
I would rather call this input_to_string.
stdlib/in_channel.mli
Outdated
| (** TBA *) | ||
|
|
||
| val read_all : string -> binary:bool -> string | ||
| (** TBA *) |
There was a problem hiding this comment.
What about read_iter_lines ? What about read_fold_lines ? etc. etc. I think these function are useless, writing with_file file input_lines isn't egregious (see @gasche's recent comment about "data structures that have delicate creation ceremonies"). Also the names clash with the convention for stdin input functions (but more on these below).
There was a problem hiding this comment.
They have been removed.
stdlib/in_channel.mli
Outdated
| val read_all : string -> binary:bool -> string | ||
| (** TBA *) | ||
|
|
||
| (** {2 Input functions on standard input} *) |
There was a problem hiding this comment.
These function feel ad-hoc. They bundle multiple effects and only work only on stdin I would move on to drop them.
There was a problem hiding this comment.
They have been removed.
stdlib/out_channel.mli
Outdated
| ?fail_if_exists:bool -> (* default: false *) | ||
| ?truncate:bool -> (* default: true *) | ||
| ?perm:int -> (* default: 0o644 *) | ||
| 'a |
There was a problem hiding this comment.
See discussion in In_channel. Flags are better to specify these things.
stdlib/out_channel.mli
Outdated
| 'a | ||
|
|
||
| val create : (string -> t) open_function | ||
| (** [create mode filename] opens the named file for writing, |
There was a problem hiding this comment.
create opens which may create a file or not... I'd rather have open'. We are working with system calls here not simply only creating the channel abstraction, I prefer if we make it clear which system call is being used here.
There was a problem hiding this comment.
I switched to open_ (could use open' as well, I just had forgotten about it).
stdlib/out_channel.mli
Outdated
| val write_lines : string -> string list -> unit | ||
| (** TBA *) | ||
|
|
||
| val write_all : filename:string -> string -> unit |
There was a problem hiding this comment.
filename doesn't make any sense here, it's not a filename it's a file path. Again this is simply with_file file output_lines, no need to provide this (especially, note that you didn't provide any of the many possible options for output file creation in these signature...).
There was a problem hiding this comment.
Agree, will remove these.
stdlib/out_channel.mli
Outdated
| val write_all : filename:string -> string -> unit | ||
| (** TBA *) | ||
|
|
||
| (** {2 Output functions on standard output} *) |
There was a problem hiding this comment.
I would elect to remove all of these stdin/stderr specific functions. I can't count the number of times I used them and then had to change my code because I wanted to perform my ouput on another channel (e.g. switching fromstdin to stderr). If you don't provide them people are more likely to write their code on a generic channel value which is good for them®.
There was a problem hiding this comment.
I tend to agree with this argument.
@alainfrisch opined that specifying the channel argument could be heavy, but am not really convinced to be honest.
I will remove all the std{in,out,err}-specific functions from this PR.
| This function has no effect under operating systems that | ||
| do not distinguish between text mode and binary mode. *) | ||
|
|
||
| val output_lines : t -> string list -> unit |
There was a problem hiding this comment.
output_line : t -> string -> unit symmetric to input_line would be nice to have.
There was a problem hiding this comment.
Thanks, I added this one as well.
|
Thanks for unlocking the situation regarding
|
The point of this PR is twofold:
With this in mind, I wonder what would be the point of duplicating old functions in the new modules? I don't think there are plans to deprecate the old functions in the near future. If you want to keep using them, there isn't anything keeping you from doing so. But for those that want to adopt the "new" functions we should offer a better API whenever possible.
The implementation of this function has already been discussed in #640. So I think that I will keep everything in one PR for now, but indeed if that poses problems we may end up splitting it eventually. |
Planning to keep supporting the current functions is another equally sensible strategy. Thanks for the clarification.
Ok, I am looking forward to reading it. |
|
Code that uses exceptions should still have a reasonable path forward, imho. Adding option variants, like |
stdlib/in_channel.mli
Outdated
| val create : t open_function | ||
| (** TBA *) | ||
|
|
||
| val with_file : (string -> (t -> 'a) -> 'a) open_function |
There was a problem hiding this comment.
I add it here, so I don't forget to re-check later.
If we accept the with_file functions (and I really welcome their introduction) we should mention that the user has to be careful and avoid capturing the environment, for example leaking no-longer open channels.
7198fe7 to
3e0ab8b
Compare
|
I fleshed out the |
| | Binary | ||
| | Non_blocking | ||
|
|
||
| val open_ : ?flags:open_flag list -> string -> t |
There was a problem hiding this comment.
I won't fight about this but I would rather use ' here. _ is meant to be a space. These final spaces always leave me wanting for more.
stdlib/in_channel.mli
Outdated
| val stdin : t | ||
| (** Standard output. *) | ||
|
|
||
| (** {2 General input functions} *) |
There was a problem hiding this comment.
Headings start at 1 in modules.
There was a problem hiding this comment.
Also I think it would be nice to uniformize the docs strings to the format "[func a0 a1] description" of the resulting value or the effect (for unit).
There was a problem hiding this comment.
Fixed, but the docstrings could use some more work I think.
| | Exclusive -> Open_excl :: flags, perm | ||
| | Non_blocking -> Open_nonblock :: flags, perm | ||
| in | ||
| let flags, perm = List.fold_left aux ([Open_rdonly], 0) flags0 in |
There was a problem hiding this comment.
Is 0 really the right default ?
stdlib/in_channel.mli
Outdated
|
|
||
| val open_ : ?flags:open_flag list -> string -> t | ||
| (** Open the named file for reading, and return a new input channel on that | ||
| file, positioned at the beginning of the file. *) |
There was a problem hiding this comment.
Create's default is not documented.
| | Binary -> Open_binary :: flags, perm | ||
| | Non_blocking -> Open_nonblock :: flags, perm | ||
| in | ||
| let flags, perm = List.fold_left aux ([Open_wronly], 0o666) flags0 in |
There was a problem hiding this comment.
Is 0o666 really the right default ? I would rather go for 0o644.
There was a problem hiding this comment.
On the other hand 0o666 is the one used by Stdlib.open_out, so there is a slight potential for confusion here.
There was a problem hiding this comment.
The permissions given to open are modified by the user's umask. 0o666 is the traditional "friendly" set of permissions. Most users have a 0o022 umask that will turn write permissions off for group and for others, resulting in the expected 0o644 permissions. But some users want their files to be group-writable by default, and have 0o002 umask, and there is no reason to annoy them by forcing the permissions to 0o644.
There was a problem hiding this comment.
Makes sense, thanks for the explanation. Will set the default back to 0o666.
| val open_ : ?flags:open_flag list -> string -> t | ||
| (** Open the named file for writing, and return a new output channel on that | ||
| file, positioned at the beginning of the file. The file is truncated to zero | ||
| length if it already exists. It is created if it does not already exists. *) |
There was a problem hiding this comment.
The default create permissions are not documented.
stdlib/out_channel.mli
Outdated
| do not distinguish between text mode and binary mode. *) | ||
|
|
||
| val output_line : t -> string -> unit | ||
| (** Output a line terminated by a newline character. *) |
There was a problem hiding this comment.
A discussion about which newline character that is may be useful here.
There was a problem hiding this comment.
Added a line about that.
stdlib/in_channel.mli
Outdated
|
|
||
| val input_line : t -> string option | ||
| (** Read characters from the given input channel, until a | ||
| newline character is encountered. Return the string of |
There was a problem hiding this comment.
A discussion about which newline we are talking about would be nice here.
There was a problem hiding this comment.
Added a line about that.
gadmm
left a comment
There was a problem hiding this comment.
The with_ functions should use the non-raising variants of close. Also, I see that you did not include close_noerr. I think it should be provided. Can we have a complete list of Pervasives functions which you are choosing not to include?
|
Hi @nojb, can we please have an update about this PR? A lot of time has been spent collectively, and it would be nice if somebody was able to continue the work. |
3e0ab8b to
87eaf41
Compare
Hi @gadmm, sorry for the delay in getting back -- I've been submerged with other things. I re-opened the PR and rebased it on the current What is left is to "reach a consensus" :) For reference, these are the main changes so far:
|
c92c793 to
3ee82be
Compare
stdlib/out_channel.mli
Outdated
| (** [with_file ?flags filename f] opens the file named [filename] for writing | ||
| according to [flags], invokes [f] to process the contents of that file then, | ||
| once [f] has returned or triggered an exception, closes the file before | ||
| proceeding. *) |
There was a problem hiding this comment.
I suggest to add All errors arising during closing are ignored. (idem for in in_channel.mli).
stdlib/in_channel.mli
Outdated
| | Non_blocking | ||
|
|
||
| val open_ : ?flags:open_flag list -> string -> t | ||
| (** [open_ ?flags fn] opens the file named [fn] for reading, and return a new |
There was a problem hiding this comment.
After "for reading" I would add "according to [flags] (defaults to [[]])." One thing that could be useful to document here aswell is the cloexec behaviour on these handles.
stdlib/in_channel.mli
Outdated
|
|
||
| val with_file : ?flags:open_flag list -> string -> (t -> 'a) -> 'a | ||
| (** [with_file ?flags filename f] opens the file named [filename] for reading | ||
| according to [flags], invokes [f] to process the contents of that file then, |
There was a problem hiding this comment.
After "according to [flags]" I would add "(defaults to [[]])".
stdlib/in_channel.mli
Outdated
| nothing when applied to an already closed channel. *) | ||
|
|
||
| val close_noerr : t -> unit | ||
| (** [close_noerr ic] is like [close ic] but ignores all errors. *) |
There was a problem hiding this comment.
maybe add "and never raises [Sys_error]" to make things clear.
stdlib/in_channel.mli
Outdated
| [input] must be called again to read the remaining characters, | ||
| if desired. (See also {!Stdlib.really_input} for reading | ||
| exactly [len] characters.) | ||
| Exception [Invalid_argument "input"] is raised if [pos] and [len] |
There was a problem hiding this comment.
I think it's sufficient to say that [Invalid_argument] is raised. The error message should not be relied upon (lots of valuable programmer time would actually be saved if these error messages were not so simplistic but indicated the offending ranges explicitely).
stdlib/in_channel.mli
Outdated
| (** [really_input ic buf pos len] reads [len] characters from channel [ic], | ||
| storing them in byte sequence [buf], starting at character number [pos]. | ||
| Return [None] if the end of file is reached before [len] characters have | ||
| been read. Raises [Invalid_argument "really_input"] if [pos] and [len] do |
There was a problem hiding this comment.
Same comment on [Invalid_argument].
stdlib/out_channel.mli
Outdated
| closing. *) | ||
|
|
||
| val close_noerr : t -> unit | ||
| (** [close_noerr oc] is like [close oc] but ignores all errors. *) |
There was a problem hiding this comment.
maybe add "and never raises [Sys_error]" to make things clear.
stdlib/out_channel.mli
Outdated
| [buf]. *) | ||
|
|
||
| val output_substring : t -> string -> int -> int -> unit | ||
| (** [output_substring oc s pos len] is like [output] but takes an argument of |
There was a problem hiding this comment.
is like [output] -> is like {!output}.
stdlib/out_channel.mli
Outdated
| machines for a given version of OCaml. *) | ||
|
|
||
| val seek : t -> int64 -> unit | ||
| (** [seek chan pos] sets the current writing position to [pos] for channel |
There was a problem hiding this comment.
Same comments as In_channel.seek
stdlib/out_channel.mli
Outdated
| results). *) | ||
|
|
||
| val length : t -> int64 | ||
| (** Return the size (number of characters) of the regular file on which the |
There was a problem hiding this comment.
[length oc] is the byte size of the...
| some translations may take place during output. For instance, under | ||
| Windows, end-of-lines will be translated from [\n] to [\r\n]. This function | ||
| has no effect under operating systems that do not distinguish between text | ||
| mode and binary mode. *) |
There was a problem hiding this comment.
See In_channel.set_binary_mode comment.
3ee82be to
428303c
Compare
|
Following discussion in the developer's meeting and in the spirit of trying to exercise a bit of PR discipline, I'm shutting down this one, mainly because of lack of time/energy to push it over the finish line. Perhaps someone else will decide to pick it up and take it foward. |
This PR is to make more concrete the discussion in #640 and try to converge.
In_channelandOut_channel.End_of_filein the current stdlib return anoptionin the proposed versionopen_flagwe use optional arguments in{In,Out}_channel.createstd{out,err}are inOut_channel, those onstdinare inIn_channel.seek,posandlengthwork withint64(no moreLargeFilemodule)In_channel.{with_file, read_all, read_lines, input_lines, fold_lines}are newOut_channel.{with_file, write_all, write_lines}are newOnly the interface is being included for now, as the implementation is copy/paste from the current stdlib for the most part.
Discuss!
@alainfrisch @c-cube