Skip to content

Fix crash caused by polykinded instances#4325

Merged
rhendric merged 1 commit intopurescript:masterfrom
rhendric:rhendric/fix-4180
May 15, 2022
Merged

Fix crash caused by polykinded instances#4325
rhendric merged 1 commit intopurescript:masterfrom
rhendric:rhendric/fix-4180

Conversation

@rhendric
Copy link
Copy Markdown
Member

Description of the change

Fixes #4180—gets rid of the crash, supports polykinded instances.

I grabbed this bug to work on partly because it seemed like a self-contained, tractable issue that would teach me something about the type checker, about which I still feel woefully ignorant.

I figured it would improve the learning experience for me and/or for others if I documented the process, so here's my bug-fix diary. Highly optional reading; the PR is small and this journal contains way more than is needed to understand it.

I'm writing this diary of my progress so that I can include it with the eventual fix PR (optimistic of me!) and, stream-of-consciousness though it may be, share my educational experience with others, so that those who know more than me can correct me on any misconceptions I develop, and those who know less than me can absorb some of this all-too-scarce knowledge. Apologies for any errors; this is going to be largely unedited.

First step is to attempt to further minimize the failing case. Liam has already done most of my work for me, but with a little trial and error I discovered three simplifications:

  • Class C doesn't need to be polykinded; annotating its type parameter with kind Type does the trick
  • The instance doesn't need to be constrained
  • The annotation doesn't need to be polymorphic; a monotyped constraint like C (Array String) is enough

I also guess that separating out the Int-typed declaration from the complicated annotation might save me some confusion later; this didn't interfere with the reproduction either, so that leaves me with:

module Main where

import Effect.Console (log)

class C (t :: Type)
instance C (f a)

f :: C (Array String) => Int
f = 0

v :: Int
v = f

main = log "Done"

as my test case.

The error reported is attached to the definition of v, not the definition of f; this suggests to me that it's the part of type checking that solves constraints that I'll need to fix up, not the part that creates constrained terms like f.

Now, into the stack trace. The proximal site of the error is lookupUnsolved; it appears that unknown number 9 is not an element in the map uns there. I'll add a little trace above the internalCompilerError:

trace ("Current uns:\n" <> show (M.map (fmap void) uns)) $

(void is my quick-and-dirty debugging trick for printing types without all the source span noise. I know the debugType etc. functions exist, but I'd rather not deal with strings.)

Sure enough, the uns map only contains entries for 21, 22, and 23. I wonder where 9 got off to?

I actually don't have any idea how the substUnsolved map gets manipulated over the course of type checking; let me grep the codebase for that and find out.

The only hits where substUnsolved is modified are:

  • in TypeChecker.Kinds.addUnsolved, where an entry is inserted
  • in TypeChecker.Unify.freshType[withKind], where entries are also inserted

Add traces to those locations, make sure 9 is inserted at some point.

It is.

Given that the only mutations to substUnsolved that I can find are inserts, there's presumably some other sort of state manipulation happening—maybe some part of the process is holding a copy of the state and restores that copy when a subtree is exited or something.

Pause that investigation for now; I might not need to figure it out exactly how it works. Instead, assume that unknown number 9 incorrectly ended up in a type somewhere, and either that type should not have survived whatever state restoring happened, or the unknown should have been replaced with something else.

Back to the call stack. The top of the stack is a call to unifyKinds' from deep within Entailment. Are the kinds being unified at all interesting? Insert a trace.

(TUnknown () 9,TUnknown () 21)

So unknown 9 is being unified with unknown 21. Not as enlightening as I'd hoped. Can the surrounding code in Entailment tell me anything useful? unifyKinds' is called inside withFreshTypes, a helper function used only once deep in entails. Add the following messy traces above that call to withFreshTypes:

                env' <- lift $ lift getEnv
                trace (unlines . filter ("Main" `isInfixOf`) $ debugEnv env') $ pure ()
                trace ("---") $ pure ()
                traceShow (M.map void subst) $ pure ()
                trace ("---") $ pure ()
                traceShow tcd $ pure ()

(The debugEnv output is huge without filtering it down to entries created in the test's Main module, which are most likely to be relevant.)

The show of a TypeClassDictionaryInScope is hideous, and it isn't functory so I can't use my void trick on it, and neither does it seem to have a debug printer. But by greping for the number 9, I've identified that it does contain a TUnknown with that number in the tcdForAll and tcdDescription fields, so let me refine my traces:

                env' <- lift $ lift getEnv
                trace (unlines . filter ("Main" `isInfixOf`) $ debugEnv env') $ pure ()
                trace ("---") $ pure ()
                traceShow (M.map void subst) $ pure ()
                trace ("---") $ pure ()
                traceShow (map (fmap void) $ tcdForAll tcd) $ pure ()
                traceShow (fmap void $ tcdDescription tcd) $ pure ()

And behold:

[("a",TUnknown () 9),("f",TypeApp () (TypeApp () (TypeConstructor () (Qualified (Just (ModuleName "Prim")) (ProperName {runProperName = "Function"}))) (TUnknown () 9)) (TypeConstructor () (Qualified (Just (ModuleName "Prim")) (ProperName {runProperName = "Type"}))))]
Just (ForAll () "a" (Just (TUnknown () 9)) (ForAll () "f" (Just (TypeApp () (TypeApp () (TypeConstructor () (Qualified (Just (ModuleName "Prim")) (ProperName {runProperName = "Function"}))) (TUnknown () 9)) (TypeConstructor () (Qualified (Just (ModuleName "Prim")) (ProperName {runProperName = "Type"}))))) (TypeApp () (TypeConstructor () (Qualified (Just (ModuleName "Main")) (ProperName {runProperName = "C"}))) (TypeApp () (TypeVar () "f") (TypeVar () "a"))) Nothing) Nothing)

So the tcdForAll of this instance, the list of type parameters it accepts, is [(a :: ?9), (f :: ?9 -> Type)], and the tcdDescription of this instance (I'll need to figure out what that even means; it looks like it's just the instance's type) is a forall (a :: ?9). forall (f :: ?9 -> Type). C (f a).

Looking at the instance... well, that actually makes sense. The instance doesn't contain any hints about the kind of a, and since we adopted polykinds, there's no reason to assume it's Type. I'd guess that, with unknown number 9 no longer in scope, this type should have been generalized to something like what the debugEnv part of the trace tells me is the type of Main.$c0:

Main.$c0 :: forall (t14 :: Type) (a :: t14) (f :: t14 -> Type). C$Dict (f a)

It's at this point that I went and commented on the original issue; I figured it's possible we don't want to support kind generalization here.

Coming back to this, initial comments from Nate suggest that I should move forward with fixing kind generalization for this instance instead of rejecting it. So I need to figure out where these types come from and why the unknown hasn't been rewritten.

Popping into the definition of TypeClassDictionaryInScope to look at the documentation and ordering of the fields...

    -- | Quantification of type variables in the instance head and dependencies
    , tcdForAll :: [(Text, SourceType)]

Yup, checks out.

    -- | If this instance was unnamed, the type to use when describing it in
    -- error messages
    , tcdDescription :: Maybe SourceType

Oh... oh no.

% git blame -L 36,38 src/Language/PureScript/TypeClassDictionaries.hs
7f4aea4fb lib/purescript-cst/src/Language/PureScript/TypeClassDictionaries.hs (Ryan Hendrickson 2021-06-26 12:36:30 -0400 36)     -- | If this instance was unnamed, the type to use when describing it in
7f4aea4fb lib/purescript-cst/src/Language/PureScript/TypeClassDictionaries.hs (Ryan Hendrickson 2021-06-26 12:36:30 -0400 37)     -- error messages
7f4aea4fb lib/purescript-cst/src/Language/PureScript/TypeClassDictionaries.hs (Ryan Hendrickson 2021-06-26 12:36:30 -0400 38)     , tcdDescription :: Maybe SourceType

Never let it be said that I'm unwilling to humiliate myself in the name of education.

Well, maybe 2021-me should have picked a better name for that field, although now that I know the answer it's hard for me to think of an improvement. Regardless, now I know what these fields are meant to be. I now need to figure out where their values come from.

TypeClassDictionaryInScopes get created in a bunch of places, but most of them are in Entailment solving special-case classes like Union and stuff. The one place general enough to be worth looking at is in TypeChecker.typeCheckAll, when an instance declaration is encountered.

I'm going to set a trap just above that location that throws an error if there are any unknowns in tcdForAll, just to confirm that this is where I should be looking:

          when (any (containsUnknowns . snd) vars) $ internalError "typeCheckAll: vars contains unknowns"

Sure enough, my test case triggers it. Also, the tcdDescription value used in that location depends on vars, so I just have to make sure that however vars is computed, all of the unknowns are correctly handled first.

vars comes straight out of checkInstanceDeclaration, so let's look for that. It's a function in TypeChecker.Kinds. Okay! There's a lot of logic here that looks like it's responsible for replacing unknowns with type variables! Seems like I'm getting close, so I attempt to understand this function:

  • First, it constructs a type from the given class name, args, and constraints
  • It collects the free variables from that type, since there are no forall constructs for instances (yet)
  • It creates fresh kinds for each of the free variables (these will be unknowns!) and binds them—meaning creates new entries in the environment identifying these variable names as local type variables with the given kind
  • It ensures that the type constructed in the first step (sans its constraints) has kind Constraint
  • It checks the instance contraints
  • It constructs a new version of the type constructed in the first step, with the checked instance constraints
  • From this type, it collects a list of all unknowns used (so, is this list complete?)
  • It creates new type variables for these unknowns and replaces the unknowns with those variables
  • Finally, it disassembles this last type into constraints, kinds, and args, and then determines the kinds of all the args by prepending the new type variables' kinds to the original list of free variables' kinds and mass replacing unknowns on all of those (does this step miss something?)

I have two open questions where this code might be failing, so time to drop in a trace between those points and see what we're getting:

    trace ("allUnknowns = " <> show (map (fmap void) allUnknowns)) $ pure ()

allUnknowns is... empty? Ah, of course; the unknowns aren't in the type. They're in the environment, as the kinds of the type vars that do occur in the type. They only get returned in varKinds because they're also in the freeVarsDict that gets used to construct the final list of kinds.

So what this function should be doing is, when it collects unknowns after it does the type checking of the class type and constraints, also collect any unsolved unknowns from freeVarsDict.

In order to determine if unknowns are solved, I need to gets something from the checker state. substUnsolved . checkSubstitution looks like the right thing. I'll use any unknowns from freeVarsDict that are contained in that map.

    uns <- gets (substUnsolved . checkSubstitution)
    allUnknowns <- unknownsWithKinds . IS.toList $ unknowns allTy <> IS.filter (`M.member` uns) (foldMap (unknowns . snd) freeVarsDict)

... Huh. That didn't help; the trap I set for vars containing unknowns is still triggering. My allUnknowns trace is telling me that I'm now adding variables for unknowns number 5 and 6, leaving unknown number 9 still in the wind. So unknown number 9 wasn't in the freeVarsDict after all.

Maybe instead of just using the unknowns in freeVarsDict, I need to run them through some substitution that will replace these unknowns with other stuff? Well... there's this substitution thing right here...

    subst <- gets checkSubstitution
    let uns = substUnsolved subst
    allUnknowns <- unknownsWithKinds . IS.toList $ unknowns allTy <> IS.filter (`M.member` uns) (foldMap (unknowns . substituteType subst . snd) freeVarsDict)

Praise the Omnissiah! That does it. Time to rip out all the debugging traces and traps and run the full test suite.

Seems to be safe; so: open up CHANGELOG.d/fix_4180.md. From the perspective of a user, what have I done here? I learned that this bug was caused by kind polymorphism in instances, so:

* Fix crash caused by polykinded instances

‘Polykinded instances’ may not mean much to everyone reading the changelog, so I'll add some notes about what that means, and what specific sort of instance triggered the crash.

Now it's time to commit, submit, and wait for feedback!


Checklist:

  • Added a file to CHANGELOG.d for this PR (see CHANGELOG.d/README.md)
  • Added myself to CONTRIBUTORS.md (if this is my first contribution)
  • Linked any existing issues or proposals that this pull request should close
  • Updated or added relevant documentation
  • Added a test for the contribution (if applicable)

allUnknowns <- unknownsWithKinds . IS.toList $ unknowns allTy
subst <- gets checkSubstitution
let uns = substUnsolved subst
allUnknowns <- unknownsWithKinds . IS.toList $ unknowns allTy <> IS.filter (`M.member` uns) (foldMap (unknowns . substituteType subst . snd) freeVarsDict)
Copy link
Copy Markdown
Contributor

@natefaubion natefaubion May 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unknownsWithKinds walks through the kinds of unknowns as well. The way this is supposed to work is:

  • unknowns gives you all the unknowns that are actually present in the type.
  • unknownsWithKinds recursively walks the kinds of these unknowns in the environment to find additional unknowns which need to be generalized.

So I wouldn't think this would be necessary, unless unknownsWithKinds isn't behaving as expected.

Copy link
Copy Markdown
Contributor

@natefaubion natefaubion May 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See this line which dereferences the unknown's kind, applys it, and recursively walks it's unknowns

(lvl, ty) <- traverse apply =<< lookupUnsolved u
rest <- fmap join . traverse go . IS.toList . unknowns $ ty

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trouble with that is, unknowns allTy is empty in this case, so the recursion in unknownsWithKinds has nothing to start with.

allTy contains free type variables but no unknowns; the unknowns have been placed into the environment as the kinds of the free type variables, but unknowns doesn't use the environment; it just looks for unknowns that are used directly in the type.

So we need some step that gets the kinds of the free type variables. We also can't just pass those kinds to unknowns and then unknownsWithKinds, because lookupUnsolved clearly expects its input to still appear in the substUnsolved map, and by this point in the checking, those unknowns have been solved (but not fully; they've been substituted with different unknowns).

Any thoughts on how the above should happen?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trouble with that is, unknowns allTy is empty in this case, so the recursion in unknownsWithKinds has nothing to start with.

Ah! Yeah, of course. This is what I was missing :) With that in mind, my thought would be that using unknowns allTy is just wrong. Would there ever be any unknowns in allTy (that weren't the result of the unknown kinds of the free type variables)? If that were the case, I might do:

  • Traverse over the kinds in freeVarsDict and apply them.
  • Get all the unknowns for these kinds.
  • Get unknownsWithKinds for these unknowns.

This is basically what you are doing, you are just manually calling substituteType rather than apply, and adding the additional unknowns allTy, which I'm not sure is necessary.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose maybe it's possible to have unknowns in allTy if a class was generalized with an unused/redundant type variable 😬.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I ripped it out and ran the test suite, which found some such cases. Here's one:

newtype Indexedforall k1 k2 k3. (k1  Type)  k2  k3  k1  Type
newtype Indexed m x y a = Indexed (m a)

class IxFunctor ix. (ix  ix  Type  Type)  Constraint
class IxFunctor f where
  imap   a b x y. (a  b)  f x y a  f x y b

instance ixFunctorIndexedFunctor m  IxFunctor (Indexed m) where
  imap f (Indexed ma) = Indexed (map f ma)

Here, the unknown shows up in the KindApps in allTy, but not as the kind of any free variable.

Copy link
Copy Markdown
Contributor

@natefaubion natefaubion May 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a particular reason for the IS.filter (`M.member` uns) that makes it necessary?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That might be a vestige of an earlier iteration. unknownsWithKinds (lookupUnsolved) crashes if it receives a solved unknown. Is the result of substituteType/apply guaranteed not to contain any? That would make sense, yeah.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it recursively substitutes solved unknowns.

@rhendric rhendric force-pushed the rhendric/fix-4180 branch from 03c4cf2 to 4fc6c78 Compare May 15, 2022 18:35
@rhendric rhendric merged commit 3d7c22f into purescript:master May 15, 2022
@rhendric rhendric deleted the rhendric/fix-4180 branch May 15, 2022 18:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Compiler error with kind unification

2 participants