List and array comprehensions #46
*(force-pushed from `cf5d271` to `ed21e46`)*
I implemented immutable arrays on top of this patch, and I'm fully convinced that the
goldfirere left a comment:
- I really like the user-facing design overview. Please include the BNF of the new syntax there, as it makes a nice specification.
- Why the inability to mix lists and arrays in different clauses? (Don't just answer here, but add a sentence or a half to the description.)
- About what to do about other extensions: I think we should evaluate this thunk lazily. That is, I like the design you're using here, and I think we should continue to use this design pattern for future extensions. But there isn't all that much advantage to consistency with other extensions (an advantage, yes, but not a huge one) -- especially if those other extensions are already working. If, in time, we decide that we like your approach here so much, we can go back and fix the others. In effect, this is your option (1).
- Remove Turpitude. :)
- Is there a nice abstraction here over a growable collection? I think so. And I think such an abstraction is nice, in general.
- I'm a bit horrified about overflow checks. That is, it takes some careful thinking to determine that these checks are sound and complete. Is there a better way? It's a bit shocking that OCaml doesn't provide primitives that allow you to check a processor's overflow flag. ... Well, it seems other languages also have this flaw. Why, oh why? I feel like there has to be good local expertise on this point.
- Why `append !array !array`? Seems that it has an unnecessary memcpy.
- The resize condition should be `!index >= !array_size`, no?
- Regardless of general abstraction above, there should be a resizeable array type.
- What if a user doesn't want to trim the final array? Maybe it's short-lived. (Or maybe array trimming just updates the length and doesn't realloc/copy; sadly, the docs don't say.)
- For list comprehensions: can't we have a mode-directed desugaring, allowing for the possibility of a local output mode? Maybe we don't need this in v1, but I see no reason we can't do this.
- Presumably immutable arrays (something you know something about) are allowed on the stack? Then your "only global arrays" comment has a short lifetime. In any case, the whole analysis should be the same as the lists case.
- I would explicitly label the "Related work" section as an appendix to the design document. Reading and understanding it is not necessary in order to comprehend comprehensions.
- The actual appendix about individual files is more detailed than necessary; this was probably not worth the time you invested in writing it.
- As discussed, there is more flexibility around locality of list comprehensions than the analysis here allows for. Perhaps revise the text to talk about potential future expansion.
lambda/transl_array_comprehension.ml (outdated)

```ocaml
  ?variable_name:string -> loc:scoped_location -> lambda list -> lambda
end = struct
  let raise_overflow_exn ~loc =
    (* CR aspectorzabusky: Is this idiomatic? Should the argument to [string]
```
@mshinwell may want to answer this question and the next.
If you have the location information, it's always best to put it on the terms, so this seems fine.
@mshinwell Can you also look at the question a few lines down about Translprim.event_after?
```ocaml
   so bindings are just like iterators with a possible annotation. As a
   result, this function is essentially the same as [iterator], which see. *)
let binding ~transl_exp ~scopes { comp_cb_iterator; comp_cb_attributes = _ } =
  (* CR aspectorzabusky: What do we do with attributes here? *)
```
The old code for type-checking `for`-loop indices drops attributes -- they're never looked at. So it's reasonable to do the same for range iterators. But the `in` form should preserve them. It looks like `Translattribute.add_function_attributes` is a good starting place.
lambda/transl_list_comprehension.ml (outdated)

```ocaml
  ; loc
  ; mode = alloc_local
  ; region = true
  (* CR aspectorzabusky: We want one region per iterator, right? *)
```
I don't see why we would...
I think I don't have a good intuition here. Say more?
Well, because regions are invisible to users, it's important to let them get some intuition behind them. In other words, copy whatever is done for for loops, as we should just do the same here. That might be one region per iterator, indeed.
Verdict: do whatever `for` loops do.
```ocaml
  end
end

module Cps_utils : sig
```
Dare I say it, but these should probably go in Misc?
If they go in `Misc`, should I mark them as more generally useful than `transl_comprehension_utils.mli`? Or perhaps they could go in a non-`Misc` file but with a more general name?
```diff
@@ -0,0 +1,164 @@
+open Lambda
```
Maybe this file should live in the lambda/ directory -- it doesn't seem comprehensions-specific?
*(force-pushed from `937ad8a` to `b9b834d`)*
@goldfirere: I've gone through and responded to your big block of comments, and will be tackling the ones left on pieces of code next.
Done
Done
Agreed, yeah
Done :-)
I think this refers to one of the things discussion has indicated we're not doing for v1, but I'm not sure which.
As discussed, this is fine for now
It's not clear to me what's better, since we have to fill in the second half of the array regardless for GC reasons. Maybe
This is about the code sample in the leading block comment, right? I fixed this to be
As we discussed, I think this isn't worth pulling out here, especially since we take some advantage of the fact that we're just monkeying with two different kinds of arrays. If we find ourselves needing this elsewhere, then I'd revisit it.
Array trimming has to copy (mutability, and also OCaml arrays are nonresizable in general), but I don't think there's a way around this. If you didn't resize the array you'd have garbage values after the end, but those would be legal indices. OCaml doesn't have support for subarrays at all.
Yes, this is right (both that we can but aren't in v1). Though we should probably make it happen. (I could be persuaded that we need it in v1.) I've updated the comment.
This sounds right to me. I've updated the comment.
Sensible, done.
Yeah, I overgeneralized from the original PR I wrote. Deleted.
Does this refer to the text in
Oh, and I've rebased this onto 4.14!
Yeah -- I was thinking about some kind of interface which allows implementations for both a growable array and an append-list. But let's not do this without a stronger incentive to.
See Slack thread on Dec 13 at 4:07am Eastern time in the internal compilers channel. Leo concluded that a different approach might indeed be better.
✔️
Well, blargh. It should. At least for immutable arrays (I agree mutable ones don't like this idea). But even in the mutable-array case for comprehensions, a non-copying trim is perfectly safe. Maybe make a post in dev-tools ideas about array slicing for immutable arrays? Right now (if memory serves) there's no planned new interface for immutable arrays over mutable ones. But slicing is something that we could add to immutable ones. (This might mean more fundamental support for them, which in turn might not be worth building.) I wonder if there are other bits of functionality an immutable array can support that a mutable one can't. And once we have Leo's full-glory uniqueness story, we could imagine making slices of a mutable array, as long as the slices are immutable and there are no concurrent modifications to the underlying mutable array. That's maybe going too far, but I do think finding a way to query the firm about the utility of immutable slicing is a good idea -- and a dev-tools ideas post seems as good a way as any to do this.
Great. Can you collect this idea (and the few others that we agreed to defer) and gather them somewhere? I'm imagining making some kind of Jira ticket with leftovers. Maybe we'll never get to them, but I hate losing ideas. By collecting them, they might make good fodder for a warm-up project for a new hire or some work for an intern.
*(force-pushed from `d1c8b38` to `da53cff`)*
At this point, everything outstanding here could use eyeballs from somebody who has more understanding about specific parts of the code (@mshinwell, @lpw25, and/or @stedolan), but we're nearly ready to merge besides that!
*(force-pushed from `78e8790` to `e94709c`)*
*(force-pushed from `e94709c` to `1f72aaa`)*
I rebased and renamed the extension to
- runtime: Already_scanned optimisation is not supported
- runtime: Backwards local pointers are not possible
- lambda: make function with no cases fatal
- typing, lambda: remove various already-fixed FIXMEs
- lambda: remove / convert to TODO some primitives FIXMEs
- typing, lambda: remove local lazy support
- selectgen: allow region fusion across try
Comments for `ocaml-jst` review

I've written this PR in the same way I would if it was going to go upstream, both because I think that makes for good PRs and in the hopes that it can make that journey Soon™. However, there are four exceptions to that:

1. There are still some places in the code where I'm not sure I'm doing the right thing or we have to make design choices I'm not sure about or similar. I've marked these all with `CR aspectorzabusky` comments in the code; they're mostly fairly small, and I have a section, "Outstanding issues", that goes over the major themes. These will all need to get resolved during review.
2. Since we're not going upstream, we have to do the obnoxious "translate our new AST into `[%extension]`s and back" dance so as not to break every ppx ever. I talk about that in "Code structure", specifically "Syntax without syntax", below, as well as in "Appendix: Detailed, file-by-file changes"; obviously, that will all go away when this goes upstream. Similarly, the diff against `ocaml-jst` is going to be different than the diff against upstream, since there was a prior prototype of comprehensions (that we mostly discarded); this means I'll have to update "Appendix: Detailed, file-by-file changes" for that, too.
3. On the other hand, since we're not going upstream, I got to use local allocations when defining the structures in `CamlinternalComprehension`, to get efficiency gains; this will have to go away before we upstream it (unless we wait a while), as will the discussion of it.
4. Since this PR will get submitted a second time, I'll also happily take any review comments about its text if they're on offer 🙂
Overview
This PR adds list and array comprehensions to OCaml, enabled by the `-extension comprehensions` command-line flag. Comprehensions are a syntactic form for cleanly building lists and arrays (hereinafter referred to collectively as sequences), based on mathematical set-builder notation. Here are some examples; for more, see the tests in `testsuite/tests/comprehensions/{list,array}_comprehensions_pure.ml`, as well as the other test files in that directory.

Syntax and semantics
The general form of a (list) comprehension is

(Array comprehensions differ only in being surrounded by `[| ... |]` instead of `[ ... ]`.) Breaking this down:

- The body is an expression that computes the values of the resulting sequence. Examples above include `a, b, c` and `Printf.sprintf "a %s %s" adjective noun`.
- The various things that can come after a `for` or an `and` are called iterators, and they generate values and bind them to patterns. Any variables bound by these patterns are in scope to the right of the whole `for ... and ...` clause, as well as in the body of the comprehension. Examples above include (a) `a = 1 to 10`, (b) `adjective in [|"red"; "yellow"; "green"|]`, and (c) `x = 5 downto 0`.
- The `PAT in SEQ` form (example (b)) iterates over a sequence, and binds each value in that sequence to the specified pattern. List comprehensions iterate over lists and array comprehensions iterate over arrays; the two cannot be mixed. A future extension might provide a way to iterate over multiple types of sequences, but until we can specify this, we use the type of the comprehension to guide inference. (The iterators are desugared into functions from `CamlinternalComprehension`; as regular OCaml functions, they cannot have the desired (or any) mode polymorphism.)
- The `VAR = LOW to HIGH` and `VAR = HIGH downto LOW` forms (examples (a) and (c), respectively) iterate over inclusive ranges of `int`s, just as they do in `for` loops; the `to` form counts up from `LOW` to `HIGH`, and the `downto` form counts down from `HIGH` to `LOW`. If `LOW > HIGH`, these iterators are empty and will never produce any values.
- The `when` clauses specify conditions on the values being iterated over; the body, as well as any clauses to the right of a `when`, are only evaluated on iterations when the condition, a `bool`, is `true`. Examples above include `when a*a + b*b = c*c` and `when x <> 0`.

The result of the comprehension is a sequence consisting of the value of the body expression evaluated with the iterator patterns bound to every possible combination of values from the iterators (the Cartesian product) such that all the conditions hold. The order of the result is given by evaluating the clauses (`for ... and ...` and `when` alike) in order from left to right; they may be thought about as nested loops/conditionals.

In BNF form (suitable for being added to the grammar of OCaml), the grammar of comprehensions is:
A special-case optimization for simple (fixed-size) array comprehensions
One somewhat obscure choice is the decision to allow multiple iterators in a single clause, via `for ITERATOR_1 and ITERATOR_2 and ... and ITERATOR_N`, rather than just having the user write `for ITERATOR_1 for ITERATOR_2 ... for ITERATOR_N`. The semantics of the two are the same, except for precisely where the variables are in scope (cf. `let BINDING_1 in let BINDING_2 in BODY` vs. `let BINDING_1 and BINDING_2 in BODY`). However, the real advantage of the `and` construction is not this symmetry with `let`, but is rather an optimization for array comprehensions: if an array comprehension contains exactly one clause, and it's a `for ... and ...` clause, then we can allocate an array of exactly the right size up front (instead of having to grow the generated array dynamically, as we usually do). We call this the fixed-size array comprehension optimization. We cannot do this with nested `for`s, as the sizes of iterators further to the right could depend on the values generated by those on the left. Thus, the second example above ("Let's describe some objects") would preallocate an array of 6 strings and loop through it, whereas the fourth example ("Flatten a nested array") would have to start with a smaller array and grow it as necessary. (Lists, of course, are always built up the same way, as the only thing we can do with a list is iterate over it and the only way we can produce a list is to build it up piece by piece.)
The clauses (
for ... and ...andwhen) of a list comprehension are guaranteed to be evaluated from left to right, and all the sources of values for iterators in a singlefor ... and ...clause are guaranteed to be evaluated exactly once per surrounding iteration before any of them begin to iterate.To be more precise, the evaluation happens in the following order:
for ... and ...clause, first the sources of values for each iterator are evaluated, and then the iterators are iterated over from left to right.PAT in SEQiterator, the expressionSEQis evaluated.VAR = START to/downto STOPiterator, the expressionSTARTis evaluated before the expressionSTOP.when CONDclause is evaluated, the expressionCONDis evaluated; if it evaluates tofalse, the current iteration is terminated and the innermost surrounding iterator advances to its next value. No clauses further to the right are evaluated, and nor is the body.whenclauses have evaluated totrue), the body is evaluated, and the result is the next element of the resulting sequence.To see what this looks like for some concrete examples, you can look at the output for the tests in
testsuite/tests/comprehensions/{list,array}_comprehensions_side_effects.ml, which is stored in the corresponding.referencefiles.Code structure
There are three major pieces to the structure of this code:
Wedging comprehensions into the standard OCaml syntax tree without adding any new constructors to the AST;
The surface syntax and type checking of list comprehensions; and
The code for desugaring comprehensions into Lambda.
We’ll discuss each of these in turn.
Syntax without syntax
Many of the details of how we interact with the standard OCaml AST are presented in a comment in `parsing/extensions.ml`; I've duplicated a lot of that text here, as it's relevant to the design. As we've started to work on syntactic extensions to OCaml, three concerns arose about the mechanics of how we wanted to maintain these changes in our fork.

1. We don't want to extend the AST for our fork, as we really want to make sure things like ppxen are cross-compatible between upstream and `ocaml-jst`. Thankfully, OCaml already provides places to add extra syntax: extension points and annotations! Thus, we have to come up with a way of representing our new syntactic constructs in terms of extension points (or annotations, but we went with the former).
2. We don't want to actually match on extension points whose names are specific strings all over the compiler; that's incredibly messy, and it's easy to miss cases, etc.
3. We want to keep different language extensions distinct so that we can add them to upstream independently, work on them separately, and so on.
We have come up with a design that addresses those concerns by providing both a nice compiler-level interface for working with our syntactic extensions as first-class AST nodes, as well as a uniform scheme for translating this to and from `Parsetree.expression` values containing extension points.

1. For each language extension, we define a module (e.g., `Comprehensions`), in which we define a proper AST type (e.g., `Comprehensions.comprehension_expr` and its subcomponents). This addresses concern (3); we've now contained each extension in a module. But just that would leave them too siloed, so…
2. We define an overall auxiliary AST that's just for our language extensions, `extension_expr`; it contains one constructor for each of the AST types defined as described in design point (1). This addresses concern (2); we can now match on actual OCaml constructors, as long as we can get ahold of them. And to do that…
3. We define a general scheme for how we represent language extensions in terms of the existing AST, and provide a few primitives for consuming/creating AST nodes of this form. There's not a lot of abstraction to be done, or at least it's not (yet) apparent what abstraction there is to do, so most of this remains manual. (Setting up a full lens-based/otherwise bidirectional approach sounds like a great opportunity for yak-shaving, but not actually a good idea.) This solves concern (1), and by doing it uniformly helps us address multiple cases at one stroke.

We then bundle this all up for each individual extension into a type containing two different (partial) isomorphisms: the fully isomorphic (up to exceptions) ability to lift and lower between the custom AST type (from design point (1)) and existing AST expressions, leveraging the common format for representing things in the existing AST from design point (3); and the partial ability to lift and lower between the custom AST type and our overall auxiliary AST type (from design point (2)), which is just a constructor application in one direction and a pattern match against a constructor in the other. This type is an existential type that hides the extension-specific type, allowing us to collect all of our extensions together:
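The bundled type itself is not shown above; the following sketch shows the shape described, with guessed constructor and field names and stand-ins for the real compiler types:

```ocaml
(* Stand-ins so the sketch is self-contained; in the compiler these would be
   Parsetree.expression and the real per-extension AST types. *)
type expression = Expr of string
type extension_expr = Eexp_comprehension of string  (* one constructor per extension *)

(* The two (partial) isomorphisms described above, per extension: *)
type 'ast extension_ops =
  { lower_ast  : 'ast -> expression            (* custom AST -> existing AST *)
  ; lift_ast   : expression -> 'ast            (* existing AST -> custom AST (may raise) *)
  ; wrap_ast   : 'ast -> extension_expr        (* constructor application *)
  ; unwrap_ast : extension_expr -> 'ast option (* pattern match, partial *)
  }

(* The existential that hides the extension-specific type, letting all the
   extensions live in one collection: *)
type extension = Extension : 'ast extension_ops -> extension
```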
We then provide a mapping from the extension enumeration in `Clflags` to our `extension` type, and use this to provide the following two entry points into our language extension machinery:

The first function, `expr_of_extension_expr`, moves from our auxiliary AST into the existing AST, but only at the specified extension; if there is a mismatch between the extension and the constructor of `extension_expr`, an exception is raised. We don't provide any sort of automatic lookup, since we expect to only be calling this function at statically known constructors. The second function, `extension_expr_of_expr`, goes the other way; it checks the current AST node to see if it's coming from a language extension (leveraging our uniform representation in the existing AST), and returns an element of our overall auxiliary AST if so, which can then be matched on as normal; if the AST node doesn't look like that, it returns `None`. This function will raise an exception if the language extension is disabled (note that this is unlike `expr_of_extension_expr`), if this AST node looks like a lowered language extension but is from an unknown extension, or the situation is otherwise malformed.
parsing/extensions.ml.How do we represent language extensions in the existing AST?
Speaking of gory details, we adopt the following scheme for representing our language extensions in the existing AST: all of our language extensions are to be rendered as applications of extension nodes to another expression. In particular, for a given extension named
EXTNAME(i.e., one that is enabled by-extension EXTNAMEon the command line), any syntax it introduces ought to be desugared as([%extension.EXTNAME] EXPR)for someEXPR. We also provide utilities for further desugaring similar applications where the extension nodes have the longer form[%extension.EXTNAME.ID₁.ID₂.….IDₙ](with the outermost one being then = 0case); these might be used inside theEXPR. (For example, within the outermost[%extension.comprehensions]application, we also have[%extension.comprehensions.list],[%extension.comprehensions.array],[%extensions.comprehensions.for.in], etc.) We don’t use the extension node payload so that ppxen can see inside these extension nodes; if we put the subexpressions inside the extension node payload, then we couldn’t write something like[[%string "Hello, %{x}!"] for x in names]], asppx_stringwouldn’t traverse inside the payload to find the[%string]extension point. Language extensions are of course allowed to impose constraints on what the contained expression is; we’re also happy for this to error in various ways on malformed input, since the nobody should ever be writing these forms directly. They’re just an implementation detail.Within
parsing/extensions.ml, we provide some simple machinery for working with these([%extension.EXTNAME.ID₁.ID₂.….IDₙ] EXPR)wrapped forms. To construct one, we provide the functionwhere the list argument contains the
EXTNAMEand all theIDᵢs, and the expression argument contains theEXPR. Symmetrically, to check if a given AST node is a special application of that form, we provide the functionwhere the list and the expression in the result are once again the
EXTNAMEand all theIDᵢs alongside theEXPR. This function doesn’t always have anything to detect, so it returnsNoneif applied to an innocuous node; if it detects a malformed application (e.g., too many arguments) or an extension point with a payload, it raises an exception.We still have to write the transformations in both directions for all new syntax, lowering it to extension nodes and then (somewhat more obnoxiously) lifting it back out. The
Comprehensionsmodule represents the pattern for doing this that we had in mind, but there’s, again, no particular abstraction involved.One other issue with representing language extensions in the AST
I wrote this file before other extensions existed, and they grew separately from the approach here. This is particularly true because my extension is the only one that adds a big lump of new syntax. This means it’s not trivial to unify the approach taken in this file and the approach generally taken by extensions. For instance, I’m not sure if everything even has the
([%extension.foo] bar)form of being a single application. I thinkinclude functoris represented with an attribute, and then local modes are represented both with attributes and extension points, and we surface the attributes generally. With locals, I know we allow all three of[@extension.local],[@ocaml.local]and[@local], but I’m not sure if that’s still true for the extension point variant. And I’m not sure what[%extension.escape]and[%extension.curry]are. So what should I do? I think there are four good options.The current, punted-to-the-future status quo, but treated as a choice instead of a question: leave this file as-is, use it when writing future extensions, and leave
uniformly_handled_extensionin place. We now have a split between extensions that are handled with this file and without it, and not in a principled way. This is the simplest option and requires the least work, so it has a lot to recommend it; however, it leaves the code base in a messy state. But that messy state provides a clear upgrade path to options (4) or (3) (see below).Rename this file to
parsing/extension_comprehensions.mlor something similar and just specialize it to just work with comprehensions. This puts the code base in a clean state and requires almost as little work as (1), so it’s also good from the perspective of getting stuff out the door without leaving ourselves with technical debt. But it leaves us at sea for future extensions. On the other hand, this is the best (really, the only) option for the world where we think that a design like this is a bad idea or isn’t actually worth it, so it has the potential to win for that reason!Fully generalize this file to support the existing forms of extension points, and maybe also annotations. This gets us a uniform interface for adding language extensions, but requires the most work; for instance, I’d have to generalize this to the module and the type language, and even worse, I’d have to build an actual AST for locals. The upside is that it would centralize our concerns about dealing with attribute nonsense, which is both clearer and might get us better diffs later. On the other hand, we might still have to special-case locals, since the attributes are user-visible and thus need to be handled directly; even if we parse them via this framework, we have to special-case their names and the fact that they’re unconditionally parsed. Still, the more syntax we have, the nicer an approach like this is.
Rename this file to something like
parsing/syntactic_extensions.mland only use it for things like comprehensions andinclude functorthat slot very neatly into the AST (although it’s in the module AST…). The upside to this is that it probably gets some of the pros of both (1) and (2) while losing some of the cons; the downside is that it probably gets some of the cons of both (1) and (2) while losing some of the pros. Hard to weigh.I’d quite like to do (3), as we’ve had bugs about missing attributes, but I’m worried it will be more trouble than I expect and more trouble than it’s worth. Thoughts?
Parsing and type checking comprehensions
We parse comprehensions in the usual way; the only thing to note is that we return items of our extended AST until the final rule, which calls `Extensions.expr_of_extension_expr` to lower this into the existing AST. We add real first-class constructors and data types for comprehensions to `Typedtree`, however; as no external programs depend on this representation, it's safe to change it in incompatible ways. Typechecking comprehensions is similarly straightforward; there are no unusual binding rules or anything else. The existence of new constructors does also lead to adding these simple cases in various places throughout the typechecker; again, however, none of this requires subtle reasoning.
More interesting than the static semantics is the dynamic semantics. List and array comprehensions are compiled completely differently; the former is compiled in terms of some internal pre-provided functions, and the latter is compiled as a series of nested loops. In both cases, we have to think about how these constructs interact with our locality extension as well.
Compiling list comprehensions
List comprehensions are compiled in terms of "reversed difference lists". A difference list in general is a function from lists to lists; by "reversed", we mean that these lists are stored backwards, and need to be reversed at the end. We make both these choices for the usual efficiency reasons: difference lists allow for efficient concatenation; they can also be viewed as based on passing around accumulators, which allows us to make our functions tail-recursive, at the cost of building our lists up backwards. An additional choice we make is to build all these intermediate data structures on the stack, by marking them `local_`; again, this is for efficiency, as it means we don't need to get the structure of these difference lists involved with the garbage collector. Since we can only generate global lists with list comprehensions (see "Comprehensions and locality"), we need a type that is spine-local but element-global; we thus define a custom type of such snoc lists, and define our difference lists in terms of that, in the internal module `CamlinternalComprehension`:

We then work exclusively in terms of `local_ 'a rev_dlist` values, reversing them into a global `list` only at the very end.

We desugar each iterator of a list comprehension into the application of a tail-recursive higher-order function analogous to `concat_map`, whose type is of the following form:

```
...iterator arguments... -> local_ ('elt -> local_ 'res rev_dlist) -> local_ 'res rev_dlist
```

Here, the `...iterator arguments...` define the sequence of values to be iterated over (the `seq` of a `for pat in seq` iterator, or the `start` and `end` of a `for x = start to/downto end` iterator); the function argument is then to be called once for each item. What goes in the function? It will be the next iterator, desugared in the same way. At any time, a `when` clause might intervene, which is simply desugared into a conditional that gates entering the next phase of the translation.

Eventually, we reach the body, which is placed into the body of the innermost translated function; it produces the single-item reversed difference list (alternatively, snocs its generated value onto the accumulator). Because each function is analogous to `concat_map`, this builds up the correct list in the end. The whole thing is then passed into a reversal function, building the final list.

For example, consider the following list comprehension:

This translates to (the Lambda equivalent of) the following:

The translation, which lives in the module `Transl_list_comprehension`, does this in a fairly direct way; the only subtlety is that it needs to translate all the components (iterators, `when` clauses, the body) before it can stitch them all together into a single term. The types and functions we desugar these to are defined in the module `CamlinternalComprehension`.
Transl_list_comprehension, does this in a fairly direct way; the only subtlety is that it needs to translate all the components (iterators,whenclauses, the body) before it can stitch them all together into a single term. The types and functions we desugar these to are defined in the moduleCamlinternalComprehension.Compiling array comprehensions
Array comprehensions are compiled completely differently from list comprehensions: they turn into a nested series of loops that mutably update an array. This is simple to say, but slightly tricky to do. There are three major sources of complexity:

1. We need to have a resizable array, as most array comprehensions have an unknown size (but see point (2)); however, OCaml arrays can't grow or shrink, so we have to do this ourselves.
2. We need to perform the fixed-size array comprehension optimization (see the section "A special-case optimization for simple (fixed-size) array comprehensions"), which requires detecting that we only have one `for ... and ...` clause and handling things specially in that case; this ends up getting its tentacles throughout the entire module, as we want to share a lot of the code but have to parameterize it over these two kinds of output.
3. We have to handle the float array optimization, so we can't simply allocate arrays in a uniform way; if we don't know what's in the array, we have to carefully handle things on the first iteration. These details are more local in scope, but particularly fiddly.
In general, the structure is: we allocate an array and a mutable index counter that starts at `0`; each iterator becomes a loop; `when` clauses become an `if` expression, same as with lists; and in the body, every time we generate an array element, we set it and increment the index counter by one. If we’re not in the fixed-size array case, then we also need the array to be growable, the first source of extra complexity; we keep track of the array size, and if we would ever exceed it, we double the size of the array. This means that at the end, we have to use a subarray operation to cut it down to the right size.

In the fixed-size array case, the second source of extra complexity, we have to first compute the size of every iterator and multiply them together; both steps have to check for overflow, in which case we simply fail. We also check to see if any of the iterators would be empty (have size `0`), in which case we can shortcut this whole process and simply return an empty array. Once we do that, though, the loop body is simpler, as there’s no need to double the array size, and we don’t need to cut the array down to size at the end. This has ramifications throughout the translation code, as we have to add a bunch of extra special-case logic to handle this: we have to store enough information to be able to compute iterator sizes if we need to; we have to be able to switch between having a resizable and a fixed-size array; we don’t need to introduce the same number of variable bindings in each case; etc. Various bits of the code make these decisions (for these examples: the `Iterator_bindings` module; the `initial_array` and `body` functions; and the `Usage` module, all in `transl_array_comprehension.ml`).

Finally, handling the float array optimization also affects the initial array and the element assignment (so this ends up being a locus for all the sources of complexity). If the array has an unknown array kind (`Pgenarray`), then we can’t allocate it with nonzero size without having the first element! Thus, no matter whether we are in the normal case or the fixed-size case, we have to set the initial array to be completely empty. Then, on the first iteration through the loop, we can finally create the real array, by allocating either the initial values for a resizable array or precisely enough values for a fixed-size array and setting all of them to the newly-computed first element of the resulting array. The initial array creation is done by the function `initial_array`, and the index checking is done (among other things) by the function `body`.

To see some examples of what this translation looks like, consider the following array comprehension, the same as the list comprehension we had before:
This translates to (the Lambda equivalent of) the following:
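The Lambda output is likewise not reproduced; the following plain-OCaml sketch approximates what the generated code does in the general (resizable) case, for a hypothetical comprehension given in the comment (all names here are illustrative, not the actual generated code):

```ocaml
(* Illustrative sketch only: plain OCaml approximating the generated code
   for something like [| x * x for x = 1 to 10 when x mod 2 = 0 |]
   (the comprehension and all names are made up for illustration). *)
let evens_squared () =
  let array = ref [| 0 |] in        (* some small initial capacity *)
  let array_size = ref 1 in         (* current capacity *)
  let index = ref 0 in              (* number of elements written *)
  for x = 1 to 10 do
    if x mod 2 = 0 then begin       (* a [when] clause becomes an [if] *)
      if !index >= !array_size then begin
        (* Out of room: double the capacity, copying elements across. *)
        let bigger = Array.make (2 * !array_size) 0 in
        Array.blit !array 0 bigger 0 !array_size;
        array := bigger;
        array_size := 2 * !array_size
      end;
      (!array).(!index) <- x * x;   (* the body writes one element *)
      incr index
    end
  done;
  Array.sub !array 0 !index         (* trim to the number written *)
```

The final `Array.sub` is the subarray operation mentioned above, cutting the doubled-capacity array down to the number of elements actually written.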
On the other hand, consider this array comprehension, which is subject to the fixed-size array comprehension optimization:
This translates to (the Lambda equivalent of) the following rather different OCaml:
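Again in place of the Lambda, here is a plain-OCaml sketch of the fixed-size path, for a hypothetical two-iterator comprehension (the example and the names are illustrative):

```ocaml
(* Illustrative sketch only: plain OCaml approximating the fixed-size path
   for something like [| x * y for x = 1 to 5 and y = 1 to 4 |]. *)
let products () =
  let size_x = 5 and size_y = 4 in
  (* Multiply the iterator sizes together, failing on overflow. *)
  let total = size_x * size_y in
  if size_x <> 0 && total / size_x <> size_y then
    invalid_arg "array comprehension: size overflow";
  if total = 0 then [||]            (* any empty iterator: empty result *)
  else begin
    (* Allocate exactly once; no doubling, and no final trim. *)
    let array = Array.make total 0 in
    let index = ref 0 in
    for x = 1 to size_x do
      for y = 1 to size_y do
        array.(!index) <- x * y;    (* the body writes one element *)
        incr index
      done
    done;
    array
  end
```

The up-front size computation, overflow check, and empty-iterator shortcut correspond to the extra work described above, after which the loop body needs no capacity checks at all.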
You can see that the loop body is tighter, but there’s more up-front size checking work to be done.
The code for doing this translation lives in the module `Transl_array_comprehension`.

## Comprehensions and locality
What modes should comprehensions use? Let us be generic over the sequence type we use for comprehensions, calling it `sequence` (standing for either `list` or `array`) and writing `[? ... ?]`. If we ignore modes, we may consider a comprehension as having been typechecked per the following modeless rule:

To reason about modes, we have to separately consider the modes of `body`, `x`, `seq`, `i`, `low`, `high`, `cond`, and the entire comprehension.

The modes of `i`, `low`, `high`, and `cond` are simple: We may be polymorphic in each of them individually. As `int` and `bool` are immediates, values of these types may freely be used at any mode. We thus don’t need to consider these modes any further.

The modes of `x` and `seq` must be the same as each other, as we do not distinguish between the “spine mode” and the “value mode” for types; a list or array must be as local or as global as its elements are. (If these were separate concepts, we could unconditionally allocate a list or array “spine-locally”, and handle `x`’s mode separately.) We’ll refer to this as the “input mode” below.

By the same token, the modes of `body` and the entire comprehension must be the same as each other, as we are generating a sequence made up of the result of evaluating `body` repeatedly. We’ll refer to this as the “output mode” below.

The input mode must be below the output mode. Clearly, the two can be the same as each other; and if the input is local, then the output surely cannot be global, as it can refer to the input and we cannot have heap pointers pointing into the stack. However, if the input is global, then it is perfectly safe for a local sequence to contain references to it, and so there is no harm in allowing the output to be local.
Thus, the question turns on what mode we are to use for the output, the mode of `body` and the entire comprehension. While it would be nice to be polymorphic here, we are unfortunately currently constrained to check comprehensions at global mode. The reasons for this are separate for list and array comprehensions:

- For list comprehensions: List comprehensions are desugared in terms of functions from `CamlinternalComprehension`; as regular OCaml functions, they cannot have the desired (or any) mode polymorphism. The input can either be marked as `local_` (in which case we cannot create a global list with a list comprehension) or not (in which case we cannot use list comprehensions to iterate over local lists). Similarly, the output can either be marked as `local_` or not, and that fixes where the allocation is to happen.

- For array comprehensions: We only have global arrays, and do not currently allow there to be such a thing as a local array at all.
This fully determines the mode situation. Thus, until we loosen either of these restrictions, we do not pass modes to other functions for typechecking comprehensions.
## Outstanding issues
As mentioned at the start, there are still some outstanding issues that need to be resolved (in addition to anything found during review), which I’ve marked with `CR aspectorzabusky` comments. For full detail, see said comments in the code; however, the four major categories are as follows, in no particular order.

- Mostly, these comments are in a variety of places where I’m simply not sure about the exact right thing to do with the type system (e.g., exactly where I should call `instance`) or otherwise have small coding uncertainties. Similarly, there are some places where I make style choices that I’d like someone to vet for idiomaticity.

- There are a couple places I’m uncertain about modes/local allocations.

- I’m generally not sure what I’m doing with attributes, and I don’t think I’m threading them through right; there are a number of comments to this effect.
- The design of `parsing/extensions.{mli,ml}` was done before we had other language extensions, and I’m not sure what to do with it; this is mentioned in a `CR aspectorzabusky` comment at the start of the file, but the longest discussion is above in the section “One other issue with representing language extensions in the AST”, which that comment points to.

## Appendix: Related work and design considerations
List (and other) comprehensions exist in various languages such as Haskell, Python, Julia, and Erlang (as well as many others); our syntax is most similar to Python’s. Other languages have related syntax, such as Scala’s `for` comprehensions or C#’s LINQ (Language INtegrated Query) notation.

### Where should the comprehension body go?
Both Scala’s and C#’s comprehension-like notations place the body of the comprehension analog at the end; for instance, the first example we present (“Pythagorean triples with components from 1 to 10, no duplicate triples”) would be written as follows in Scala 3:
This highlights something that’s a bit unusual about comprehensions in OCaml: comprehension iterators are the only binding form where the variable bindings are placed after the place they are in scope. (This is not so unusual for Haskell, where `where` clauses exist as a means to do this in normal code.) However, we decided that having the body on the left ultimately felt more natural in many cases, especially considering mathematical set-builder notation, and fit in well with the rest of OCaml. It also allowed us to mirror similar languages (e.g., Haskell and Erlang) as well as Python, which – while not terribly similar to OCaml – is a language we often find ourselves using alongside OCaml, and one that is tremendously widely written in general.

### Variable binding and iteration order
The other nice thing about these “front-to-back”–style comprehension-like notations is that they have the pleasing property that the scope of variables is easy to describe: variables are bound everywhere following their binding in the comprehension, and this means that they are in scope for lexically contiguous regions. For instance, when we annotate the scopes in the above Scala `for` comprehension, we get a series of nice rectangles:

In standard comprehensions like ours (as well as Haskell’s, Python’s, Julia’s, Erlang’s, etc.), on the other hand, scopes are discontiguous:
While variable scopes span lexically contiguous regions of clauses, they also include the body, which is elsewhere, back off to the left.
Consequently, we experimented with a design where clauses went from right to left, instead: we would have written the above comprehension as
This restores lexically contiguous scopes:
However, this experiment ultimately didn’t work out, for two main reasons. Firstly, these right-to-left comprehensions simply felt unergonomic to write. Adding new clauses involved typing some left-to-right English/code, and then moving the cursor backwards to fill in the next clause. Secondly, the iteration order felt confusing; this design would have necessitated that iteration also proceeded from right to left, which is unlike all the rest of the control flow in OCaml and is broadly unusual. We tried having `for ... and ...` clauses iterate from left to right, to provide a left-to-right option that didn’t interact with binding, but this ultimately felt more confusing when everything else went right to left, and didn’t help with truly dependent examples such as the above Pythagorean triples. Moreover, every other language with standard comprehensions works the same way, both binding and iterating from left to right. Thus, for all these reasons, we went with the current, standard, left-to-right design.
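For comparison, the dependent left-to-right binding structure that standard comprehensions provide can be mimicked in today’s OCaml with nested `List.concat_map`; this is a hypothetical rendering of the Pythagorean-triples example (not the comprehension syntax itself, and `range` is a made-up helper):

```ocaml
(* Pythagorean triples with components from 1 to 10, no duplicate triples,
   written with nested concat_map to show the left-to-right dependent
   binding order: each variable is in scope for every clause that follows
   it, and for the body. *)
let range a b = List.init (b - a + 1) (fun i -> a + i)

let pythagorean_triples =
  List.concat_map
    (fun x ->
      List.concat_map
        (fun y ->
          List.concat_map
            (fun z ->
              if x * x + y * y = z * z then [ (x, y, z) ] else [])
            (range y 10))   (* z starts at y, avoiding duplicate triples *)
        (range x 10))       (* y starts at x, depending on the earlier x *)
    (range 1 10)
```

Here `pythagorean_triples` evaluates to `[(3, 4, 5); (6, 8, 10)]`, with each later generator free to depend on every earlier one, just as in the comprehension clauses.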