Skip to content

Very WIP: Mutually recursive types#32581

Closed
Keno wants to merge 4 commits intomasterfrom
kf/mutrecurtypes
Closed

Very WIP: Mutually recursive types#32581
Keno wants to merge 4 commits intomasterfrom
kf/mutrecurtypes

Conversation

@Keno
Copy link
Copy Markdown
Member

@Keno Keno commented Jul 15, 2019

This is a very WIP start at #269. In fact it's so WIP that it doesn't build and is probably not worth looking at the code in its current form (http://wiki.c2.com/?PlanToThrowOneAway).

Here's how things currently work on this PR:

incomplete type Bar end
struct Foo; b::Bar; end
struct Bar; b::Foo; end

this declares two mutually recursive structs Foo and Bar, with an explicit forward declaration for Bar. At the moment this is being accomplished by having Bar be an incomplete DataType that then gets updated in place when it is completed. However, I suspect this will need to be changed to a more general mechanism with pointers being replaced after the fact, since in general we don't necessarily want to assume that Bar will be a DataType (e.g. as in the Union example in #269).

We also keep track of the completeness or non-completness of any given data type:

julia> incomplete type Bar end

julia> Bar.incomplete
0x1

julia> struct Foo; b::Bar; end

julia> Foo.incomplete
0x2

julia> struct Bar; b::Foo; end

julia> Bar.incomplete
0x0

julia> Foo.incomplete
0x0

(0x0 indicates complete). All types in a strongly connected component of the dependency graph are finalized together when the last dependency of the SCC is resolved.

Additionally, incomplete types are invalid in subtyping, so method definitions are delayed until
the relevant types are completed:

julia> incomplete type Bar end

julia> f(x::Bar) = 1
f (generic function with 0 methods)

julia> struct Bar; end

julia> f
f (generic function with 1 method)

I'm planning to require all types to be completed by the end of their defining module, though I haven't implemented that yet, i.e. we'd have:

julia> module X; incomplete type Foo end; end
ERROR: Type `Foo` remained incomplete at the end of the module

Additionally, there's of course the question of automatically inserting forward definitions. I think that's more of a UX tradeoff than a technical one. I see basically three possibilities.

  1. Insert forward declarations automatically for any undefined identifier directly attached to a declaration, i.e.
struct Foo
x::NotDefinedYet
end

would be semantically equivalent to

# Potentially with some sort of weak binding that allows redefinition, and forbids reference
# outside a struct definitions
incomplete type NotDefinedYet end 
struct Foo
x::NotDefinedYet
end

though of course this would be done in the runtime, not during lowering, since we don't know what identifiers have been defined already during lowering.

This'd probably be the most convenient, but I'm slightly worried about causing user confusion as to when the RHS of a struct definition is evaluated, particularly wrt evaluated definitions:

struct Foo
x::identity(NotDefinedYet) # Error: NotDefinedYet not defined
end

I'm also a bit worried about user confusion around typos, e.g.

struct Foo
x::Int31
end

# Foo is now incomplete, with no obvious error particularly at the REPL where the module never ends

That could probably be addressed with appropriate error messages and warnings, but I think it's a not-insignificant concern.

  1. Insert forward declarations in lowering for blocks that are lowered together.

This is basically what was proposed in #269. Note that I don't actually think this lets us avoid any technical difficulties over the other proposal, but does of course have a more limited scope for the automatic forward declarations so avoids some of the UX difficulties of the previous option. To recap, the proposal would be that:

begin
    struct Foo; x::Bar; end
    struct Bar; x::Foo; end
end

would lower to

begin
    incomplete type Bar end
    struct Foo; x::Bar; end
    struct Bar; x::Foo; end
end

I think the biggest problem with this is that it makes the extent of the lowering scope semantically significant at top level, far more than anything else in the language. I.e. you could no longer safely paste the inside of that begin/end block into the REPL (particularly if we allow it without begin/end in an enclosing module or at the file level).

  1. Do nothing, require explicit forward declarations

Simple, but perhaps not satisfying.

Anyway, comments welcome. cc @StefanKarpinski @JeffBezanson

Keno added 4 commits July 20, 2019 19:23
When modifying Julia's builtin types, it's really easy to make mistakes
here. Now the compiler will check for you.
@Keno Keno force-pushed the kf/mutrecurtypes branch from 885ed90 to 2dda80d Compare July 22, 2019 17:19
@Keno
Copy link
Copy Markdown
Member Author

Keno commented Jul 22, 2019

I've started over on this. Some changes:

  1. incomplete type is gone, since the behavior you want is more a placeholder. You don't necessarily want it to be a DataType (e.g. you might want it to be a Union). Instead, there is an @placeholder macro.
  2. It actually builds now.

The method definition deferment behavior is still there, which @JeffBezanson was suspicious of, so that may be something that needs to be resolved soon.

Keno added a commit that referenced this pull request Jul 23, 2019
This is an alternative to #32581, that is easier to implement, but
more restrictivive. Here, the forward declaration must contain
everything except for the field types:
```
incomplete type Foo{Bar} <: Baz; end
```
with it being an error to deviate from this specification when
completing the type
```
struct Foo <: Baz; end # Error: Missing type parameter
struct Foo{A, B} <: Baz; end # Error: Too many type parameters
struct Foo{Qux} <: Baz; end # Error: Name of type parameter doesn't match
struct Foo{Bar<:Int64} <: Baz; end # Error: Bounds of type parameter don't match
struct Foo{Bar}; end; # Error supertype dosesn't match
```
This is easier because this way we have enough information for subtyping
and type application and therefor do not need to delay this until
the entire dependency graph is complete as we did in #32581.
Of course this also means that we don't get the union feature that
was requested in #269:
```
inomplete type U; end
struct A; x::U; end
struct B; x::U; end
U = Union{A, B}; #ERROR
```
However, it could of course be emulated by wrapping the union type:
```
struct U
   data::Union{A, B}
end
```

However, given the simplicity of this change and the difficulty of #32581,
this seems like the way to go for the moment. We may want to revisit all
this if we ever want to do computed field types, which will encounter similar
challenges as #32581.

Fixes #269.
Keno added a commit that referenced this pull request Jul 24, 2019
This is an alternative to #32581, that is easier to implement, but
more restrictivive. Here, the forward declaration must contain
everything except for the field types:
```
incomplete type Foo{Bar} <: Baz; end
```
with it being an error to deviate from this specification when
completing the type
```
struct Foo <: Baz; end # Error: Missing type parameter
struct Foo{A, B} <: Baz; end # Error: Too many type parameters
struct Foo{Qux} <: Baz; end # Error: Name of type parameter doesn't match
struct Foo{Bar<:Int64} <: Baz; end # Error: Bounds of type parameter don't match
struct Foo{Bar}; end; # Error supertype dosesn't match
```
This is easier because this way we have enough information for subtyping
and type application and therefor do not need to delay this until
the entire dependency graph is complete as we did in #32581.
Of course this also means that we don't get the union feature that
was requested in #269:
```
inomplete type U; end
struct A; x::U; end
struct B; x::U; end
U = Union{A, B}; #ERROR
```
However, it could of course be emulated by wrapping the union type:
```
struct U
   data::Union{A, B}
end
```

However, given the simplicity of this change and the difficulty of #32581,
this seems like the way to go for the moment. We may want to revisit all
this if we ever want to do computed field types, which will encounter similar
challenges as #32581.

Fixes #269.
Keno added a commit that referenced this pull request Jul 26, 2019
This is an alternative to #32581, that is easier to implement, but
more restrictivive. Here, the forward declaration must contain
everything except for the field types:
```
incomplete type Foo{Bar} <: Baz; end
```
with it being an error to deviate from this specification when
completing the type
```
struct Foo <: Baz; end # Error: Missing type parameter
struct Foo{A, B} <: Baz; end # Error: Too many type parameters
struct Foo{Qux} <: Baz; end # Error: Name of type parameter doesn't match
struct Foo{Bar<:Int64} <: Baz; end # Error: Bounds of type parameter don't match
struct Foo{Bar}; end; # Error supertype dosesn't match
```
This is easier because this way we have enough information for subtyping
and type application and therefor do not need to delay this until
the entire dependency graph is complete as we did in #32581.
Of course this also means that we don't get the union feature that
was requested in #269:
```
inomplete type U; end
struct A; x::U; end
struct B; x::U; end
U = Union{A, B}; #ERROR
```
However, it could of course be emulated by wrapping the union type:
```
struct U
   data::Union{A, B}
end
```

However, given the simplicity of this change and the difficulty of #32581,
this seems like the way to go for the moment. We may want to revisit all
this if we ever want to do computed field types, which will encounter similar
challenges as #32581.

Fixes #269.
Keno added a commit that referenced this pull request Jul 26, 2019
This is an alternative to #32581, that is easier to implement, but
more restrictivive. Here, the forward declaration must contain
everything except for the field types:
```
incomplete type Foo{Bar} <: Baz; end
```
with it being an error to deviate from this specification when
completing the type
```
struct Foo <: Baz; end # Error: Missing type parameter
struct Foo{A, B} <: Baz; end # Error: Too many type parameters
struct Foo{Qux} <: Baz; end # Error: Name of type parameter doesn't match
struct Foo{Bar<:Int64} <: Baz; end # Error: Bounds of type parameter don't match
struct Foo{Bar}; end; # Error supertype dosesn't match
```
This is easier because this way we have enough information for subtyping
and type application and therefor do not need to delay this until
the entire dependency graph is complete as we did in #32581.
Of course this also means that we don't get the union feature that
was requested in #269:
```
inomplete type U; end
struct A; x::U; end
struct B; x::U; end
U = Union{A, B}; #ERROR
```
However, it could of course be emulated by wrapping the union type:
```
struct U
   data::Union{A, B}
end
```

However, given the simplicity of this change and the difficulty of #32581,
this seems like the way to go for the moment. We may want to revisit all
this if we ever want to do computed field types, which will encounter similar
challenges as #32581.

Fixes #269.
@Keno
Copy link
Copy Markdown
Member Author

Keno commented Aug 2, 2019

Abandoning in favor of #32658

@Keno Keno closed this Aug 2, 2019
Keno added a commit that referenced this pull request Aug 2, 2019
This is an alternative to #32581, that is easier to implement, but
more restrictivive. Here, the forward declaration must contain
everything except for the field types:
```
incomplete type Foo{Bar} <: Baz; end
```
with it being an error to deviate from this specification when
completing the type
```
struct Foo <: Baz; end # Error: Missing type parameter
struct Foo{A, B} <: Baz; end # Error: Too many type parameters
struct Foo{Qux} <: Baz; end # Error: Name of type parameter doesn't match
struct Foo{Bar<:Int64} <: Baz; end # Error: Bounds of type parameter don't match
struct Foo{Bar}; end; # Error supertype dosesn't match
```
This is easier because this way we have enough information for subtyping
and type application and therefor do not need to delay this until
the entire dependency graph is complete as we did in #32581.
Of course this also means that we don't get the union feature that
was requested in #269:
```
inomplete type U; end
struct A; x::U; end
struct B; x::U; end
U = Union{A, B}; #ERROR
```
However, it could of course be emulated by wrapping the union type:
```
struct U
   data::Union{A, B}
end
```

However, given the simplicity of this change and the difficulty of #32581,
this seems like the way to go for the moment. We may want to revisit all
this if we ever want to do computed field types, which will encounter similar
challenges as #32581.

Fixes #269.
@DilumAluthge DilumAluthge deleted the kf/mutrecurtypes branch March 25, 2021 21:58
Keno added a commit that referenced this pull request Jan 6, 2026
…ypes

This PR introduces syntax for `recursive type`s. This issue has
some history. The original issue for this is #269 and there were
previous attempts in #32581 and #32658.

The fundamental construct uses a `recursive type` block:

```julia
recursive type Edge
    struct Node
        edges::Vector{Edge}
    end

    struct Edge
        from::Node
        to::Node
    end
end
```

The name after `recursive type` is bound to an incomplete struct
and can be referenced by other type definitions before completion.
However, the actual types are created and bound to the global
binding table atomically at the end of the `recursive type` block.

Multiple forward declarations are supported by tuple syntax:
```
recursive type (Foo, Bar, Baz)
   struct Foo
     x::Union{Foo, Bar, Baz, Nothing}
   end
   struct Bar
     x::Foo
   end
   struct Baz
     x::Foo
   end
end
```
The return type of the whole expression is determined by the term
after the `recursive type` keyword (e.g. the above returns `(Foo, Bar,
Baz)`.

There is a separate hierarchy of `IncompleteStruct`,
`IncompleteUnion{All}`, `IncompleteTypeApp`. Inside
a `recursive type` block, toplevel struct definitions,
type applications, etc. create these incomplete structs
instead. This is close in design to #32581, but with a fully
separate type hierarchy rather than a `Placeholder` type and
re-using the existing structs. The motivation here is to minimize
the surface area that incomplete types can flow to. Assignments
to the global bindings are syntactically deferred until the end
of the block.

Inner constructors are a bit of a special case. They are syntactically
analyzed at definition time (e.g. for the number of arguments to `new`),
but their definition is deferred until after all the types have been
defined. The idea here is to allow inner constructors to depend on
types that would otherwise have been incomplete at definition time.

Non-inner-constructor method definitions are syntactically disallowed inside
`recursive type` (but can be defined outside the block of course).

- As discussed, method definitions are disallowed inside `recursive
  type`
- Recursion is semantically (but not syntactically) disallowed in
  supertypes and unions.

Fixes #269

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant