PrairieLearn icon indicating copy to clipboard operation
PrairieLearn copied to clipboard

Feature request: pl-order-blocks with multiple correct solutions

Open jeffgerickson opened this issue 4 years ago • 18 comments

We are using pl-order-blocks for proof questions in CS 374. For the simple results we are asking students to prove, there is more than one correct proof.

For example, in an induction proof that reversing a string preserves its length, the base case might be proved like this:

  • |w^R| = |ε^R| because we've assumed w=ε
  • |ε^R| = |ε| by definition of reversal
  • |ε| = |w| because we've assumed w=ε

But it could also be proved like this:

  • |w| = |ε| because we've assumed w=ε
  • |ε| = |ε^R| by definition of reversal
  • |ε^R| = |w^R| because we've assumed w=ε

These two sequences are almost but not quite reverses of each other.

I would like to present the students with a pl-order-blocks that includes all six statements, as well as a couple distractors, and ask them to assemble a correct proof, with either of the two correct three-line proofs getting full credit.

More generally, it would be great if pl-order-blocks supported multiple proofs that are not just different linear extensions of the same proof dag. Reversing the derivation order of the whole proof is not the only variant I can imagine; this is just one case in a larger case analysis. As another example, the same statement might be proved as $a \to b \to c \to z$ or $a \to d \to e \to z$.

I'm afraid I don't have any good suggestions for how to specify multiple proofs.

jeffgerickson avatar Aug 17 '21 19:08 jeffgerickson

Maybe we could have multiple <pl-alternative> (newly invented name) blocks inside of a <pl-order-blocks>, where each pl-alternative would then contain a separate proof?

Probably @SethPoulsen has better ideas than me!

mwest1066 avatar Aug 17 '21 20:08 mwest1066

I'll have to think about this some more. There must be an extension of proof-dags to represent alternatives, similar to the casework structure that pl-order-blocks already supports.

Currently <pl-block-group> indicates one of several cases, each of which must be fully traversed before considering another <pl-block-group> or <pl-answer>. But perhaps the traversal rule should be made more explicit. Maybe something like the following?

<pl-block-cases name="cases">
    <pl-block-group name="case1">
        <pl-answer name="case1step1">...
        <pl-answer name="case1step2" depends="case1step1">...
        <pl-answer name="case1step3" depends="case1step1">...
        <pl-answer name="case1step4" depends="case1step2,case1step3">...
    </pl-block-group>
    <pl-block-group name="case2">
        <pl-answer name="case2step1">...
        <pl-answer name="case2step2" depends="case2step1">...
    </pl-block-group>
</pl-block-cases>
<pl-block-alternatives name="alts" depends="cases">
    <pl-block-group name="alt1">
        <pl-answer name="alt1step1">...
        <pl-answer name="alt1step2" depends="alt1step1">...
        <pl-answer name="alt1step3" depends="alt1step1">...
        <pl-answer name="alt1step4" depends="alt1step2,alt1step3">...
    </pl-block-group>
    <pl-block-group name="alt2">
        <pl-answer name="alt2step1">...
        <pl-answer name="alt2step2" depends="alt2step1">...
    </pl-block-group>
</pl-block-alternatives>

Here <pl-block-cases> is a list of <pl-block-group>s all of which must be traversed, and <pl-block-alterantives> is a list of <pl-block-group>s one of which must be traversed. And then each <pl-block-group> is a list containing any mixture of <pl-answer> and <pl-block-cases> and <pl-block-alterantives> with ordering dependencies. (The current implementation doesn't support nested cases. Boo.)

Obviously the syntax would have to be different to be backward-compatible with the exiting implementation. And it's far from obvious how to design a reasonable grading algorithm. And this could just be too complex to make pedagogically useful questions!!

jeffgerickson avatar Aug 17 '21 22:08 jeffgerickson

This is an awesome idea, and I agree it seems like there should be some extension of proof-dags we could come up with that would give us what we want.

Some other random thoughts I've got that we would need to sort out to make this work:

  • How would we grade it if the student did both proofs, one after another? what about both, but had lines intermixed in odd ways?
  • What if they did all of one proof and half the other? There are these weird cases where every line of the student proof would be a correct deduction, and they successfully proved the theorem, but I'm not sure if I want to grade it as correct because it might not be very pretty.
  • A few people have suggested having optional blocks in proofs, but we never really figured out a good way to make it work for students. Is there some cohesive way in which options of different proofs and optional blocks within proofs could both be supported?
  • What if there is a line that is in both of the proofs?

this could just be too complex to make pedagogically useful questions

This is actually one of the reasons I didn't implement nested cases yet, because I wasn't sure if there would be pedagogically useful questions of that complexity. Do you have some example nested cases questions?

As another example, the same statement might be proved as $a \to b \to c \to z$ or $a \to d \to e \to z$.

Ya, this is definitely another thing I think would be cool to support too, just haven't though enough about the details yet.

SethPoulsen avatar Aug 18 '21 00:08 SethPoulsen

@jeffgerickson I just had an alternative idea about how you could do your problem from above in the mean time.

How about making it so that none of the three intermediate inequalities depend on one another (they don't really, do they?), but then there is a final deduction step that depends on all three of the intermediate inequalities, like this:

<pl-answer tag="1" depends="">|w^R| = |ε^R| because we've assumed w=ε</pl-answer>
<pl-answer tag="2" depends="">|ε^R| = |ε| by definition of reversal</pl-answer>
<pl-answer tag="3" depends="">|ε| = |w| because we've assumed w=ε</pl-answer>
<pl-answer tag="4" depends="1,2,3">So when w = ε, we know that |w| = |w^R| </pl-answer>

That way the first 3 can actually just show up in any order and its fine.

SethPoulsen avatar Aug 19 '21 13:08 SethPoulsen

I've also thought up a couple more ideas about how to do optional blocks:

  • We could decouple the declaration of the proof structure from the declaration of the lines. That way, you could just write down the blocks you want to include, and then separately you could write down multiple DAG structures that you would accept as a correct answer
    • plus: extremely flexible, works for optional blocks, optional proof sections, alternate proofs, etc.
    • minus: might not be very readable
  • Introduce an "or" operator to the "depends" string, probably using a pipe, so you could write, e.g. depends="4 | 5" to mean "this line can be proved by 4 or 5, then if on of those lines isn't needed to prove anything else, it can be left out entirely.
    • plus: elegant syntax
    • minus: not as flexible, e.g. It wouldn't support having multiple options for the last line of the proof

What do people think about these ideas? Any other ideas about how to make it work well?

(edit: I still don't know how to solve some of the issues I mentioned above, particularly the one about students doing some of the optional lines, thus resulting in a really ugly but not wrong proof)

SethPoulsen avatar Aug 23 '21 15:08 SethPoulsen

On nested cases: I already have a problem in current written homework that requires handling two independent case distinctions: A string is either empty, a single symbol, or longer (reflecting the recursive definition of a function), and a string has either even length or odd length (reflecting a case distinction in the theorem to be proved). In a humna-graded homework, we can handle any of the following outlines:

Option 1:

  • case 1: w is empty
  • case 2: w is a single symbol
  • case 3: w is longer
  • subcase 3a: |w| is even
  • subcase 3b: |w| is odd

Option 2:

  • case 1: |w| is even
  • subcase 1a: w is empty
  • subcase 1b: w is non-empty
  • case 2: |w| is odd
  • subcase 2a: w is one symbol
  • subcase 2b: w is longer

Option 3:

  • case 1: w is empty
  • case 2: w is one symbol
  • case 3: w is long and has even length
  • case 4: w is long and has odd length

I'd rather guide students toward the first or second outlines (because the cases/subcases have different motivations) than the third.

jeffgerickson avatar Aug 27 '21 16:08 jeffgerickson

On independent equalities: The style I actually prefer for these proofs is a string of equalities like this:

Screen Shot 2021-08-27 at 11 20 39 AM

So the order of the statements does matter. I find this much more coherent/readable (and therefore more useful pedagogically) than a mere set of equations that collectively imply the theorem.

Obviously not every proof will have this form, but in my experience, different proofs of the same statement tend to vary far more by which intermediate statements they visit than by which order they traverse some fixed dag (except for independent cases).

jeffgerickson avatar Aug 27 '21 16:08 jeffgerickson

On partial credit for mixing different options: The grading standard I like to aim for is essentially "edit distance to a correct solution with no redundancies". So any extraneous statements (from submitting both proofs, or submitting all of one proof and half of the other, for example) should be flagged and penalized.

"What if there is a line that is in both of the proofs?" — Now that is a good question! Obviously we don't want to display both (identical) statements.

jeffgerickson avatar Aug 27 '21 16:08 jeffgerickson

"We could decouple the declaration of the proof structure from the declaration of the lines." — But variations are more fine grained than "proof structure".

Again, a current PL proof-blocks exercise requires to student to derive |a|+|x| = |ax| where a is a symbol and x is a string, as one step in a longer proof. You can do this either directly (invoking a previous lemma that concatenation adds length) or in two lines (|a|+|x| = 1+|x| = |ax|, both steps by the formal definition of length). So here I'd like to show all three blocks, and then reward either using block A or using blocks B and C.

jeffgerickson avatar Aug 27 '21 16:08 jeffgerickson

"We could decouple the declaration of the proof structure from the declaration of the lines." — But variations are more fine grained than "proof structure".

Sorry I wasn't more clear, I meant that you would first write down the proof lines the student would see, and then separately write down what arrangements of those lines constitute a correct proof. So that you could cover all cases (different lengths of proofs, different orderings, certain lines used and others not or the other way around, etc.). But I would have to think more carefully about the details of actually making this work.

SethPoulsen avatar Aug 27 '21 16:08 SethPoulsen

@jeffgerickson Thank you for all your feedback here. It would be great if we could get in front of a whiteboard together with @mwest1066 sometime and dream up a good generalized form of the proof-DAG that would help us cover all these cases in an elegant way.

SethPoulsen avatar Aug 27 '21 16:08 SethPoulsen

I'm starting to think that the flexibility I have in mind will make grading PSPACE-hard.

jeffgerickson avatar Aug 27 '21 17:08 jeffgerickson

I'm starting to think that the flexibility I have in mind will make grading PSPACE-hard.

It would actually be fascinating (and useful) to figure out which complexity class we land in depending on which of the proposed features we add or don't add.

SethPoulsen avatar Aug 27 '21 21:08 SethPoulsen

A brief example proof to show why it is really hard to decide how to grade optional lines. Suppose you want the students to prove the statement:

A ∧ B→ A ∨ B

One sensible way to prove this could be:

  • A ∧ B
  • A by definition of ∧
  • A ∨ B by definition of ∨

Another sensible proof might be:

  • A ∧ B
  • B by definition of ∧
  • A ∨ B by definition of ∨

So, suppose we make a Proof Blocks problem and give the students the following blocks to choose from:

  1. A ∧ B
  2. A by definition of ∧
  3. B by definition of ∧
  4. A ∨ B by definition of ∨

So that the intended solutions are 1, 2, 4 or 1, 3, 4. But what if the student puts 1,2,3,4? Is it wrong? What if they put 1,2,4,3? In both cases, they didn't make any unsound deductions, and the goal was successfully proven. Neither of these are proofs we want to encourage, but it doesn't seem fair to mark it wrong.

There will be other cases e.g. when defining new variable names, or making "without loss of generality" kinds of arguments, where including an alternative statement is actually wrong. When I think of a nice simple example for this I can include it here.

SethPoulsen avatar Sep 10 '21 20:09 SethPoulsen

Might this issue be related to the ideas about disappearing / reappearing blocks that have been discussed before? It may be worth sitting down and seeing if there's a way to rework the internal architecture to support all of the alternative grading / problem schemes that have been proposed across various issues.

Related to complexity, as long as the most general problem scheme isn't really wild, it's probably possible to use something like a commercial SAT solver (see this) obtained via an external dependency for grading. These are usually really fast, and they'll probably outperform whatever's currently being used even if we allow for more general types of proofs.

eliotwrobson avatar Dec 20 '22 21:12 eliotwrobson

So the main idea I have to generalize the grader is to go from a DAG to having a multi-graph with different possible block dependencies represented by different colored edges. Then to check if a solution is correct, you take the sub-DAG over different colored edges, then check against all topological sorts of all of those. This lets you do things like have ways of constructing proofs that use completely different sets of blocks. I'm not sure if you have ideas that need anything more expressive than that, it would be fun to discuss.

Currently, we don't have grading efficiency problems, thanks to the clever algorithm we came up with (see the paper if you want to see some how it works and some graphs of benchmarks: https://arxiv.org/abs/2204.04196), but if we do have efficiency problems when things get more complicated, I'm totally down to call out to Z3 if we need to, or write a custom solver ourselves, whatever is the best solution. But at this point we have been able to avoid needing to add another dependency and that's been good.

SethPoulsen avatar Dec 21 '22 03:12 SethPoulsen

I think that your idea of a multigraph of different colored edges (or similarly, a graph where each edge is colored with a subset) is really interesting and a good representation of multiple valid proofs (so a monochromatic path is a correct solution). I'm not sure if there's a way to be more expressive than this without making things too complex for a question writer to think of. Either way, it's probably worth looking over the existing issues and see whether this is general enough to capture those cases as well.

With respect to efficiency, I think you're right that this might not come up unless there are a ton of different colors in the solution. The size of a problem given in order blocks now is limited by the total number of blocks an instructor is willing to give to students, and with features like progressively revealing blocks for long proofs, the problem size could grow significantly. The algorithm you're using is nice in that it reduces to a standard NP-complete problem, so there should be plenty of standard tools out there that can help out if there are issues.

eliotwrobson avatar Dec 21 '22 04:12 eliotwrobson

I have a student who implemented a prototype of the mult-graph method for multiple solution paths.

Let me know if you would like to do Beta testing of it for us. If beta testing with a few users works well, we will get the changes ported to Master.

SethPoulsen avatar Oct 01 '24 17:10 SethPoulsen