Skip to content

LUCENE-10010 Introduce NFARunAutomaton to run NFA directly#225

Merged
zhaih merged 21 commits intoapache:mainfrom
zhaih:main
Dec 20, 2021
Merged

LUCENE-10010 Introduce NFARunAutomaton to run NFA directly#225
zhaih merged 21 commits intoapache:mainfrom
zhaih:main

Conversation

@zhaih
Copy link
Contributor

@zhaih zhaih commented Jul 26, 2021

Description

https://issues.apache.org/jira/browse/LUCENE-10010

Introduces NFARunAutomaton to run NFA directly

Works to to:

  1. Integrate with current RunAutomaton class hierarchy
  2. Further optimize the NFARunAutomaton implementation

Tests

A unit test that assert the NFARunAutomaton behaves the same as the DFA one by using random generated regex strings

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Lucene maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.

@zhaih zhaih force-pushed the main branch 2 times, most recently from 3f72684 to abe3ab1 Compare July 26, 2021 22:19
Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super exciting! I'm amazed how little code you needed to get this first version running.

I left some comments about how we might share some code with RunAutomaton instead of copying/forking those methods, but I don't think that's a blocker issue.

I love the randomized test - make a random RegExp, find random strings it accepts and confirm the NFARunAutomaton accepts them too. And then generate random (mostly rejected) strings and confirm those are rejected too. And change the order of those two, to exercise the lazy caching.

Another way to make random NFA would be to 1) enumerate random strings, 2) make DFA from those strings (using the efficient builder API Lucene provides), and 3) simply duplicating some states in that DFA creating an NFA, and then 4) confirm that the resulting NFA only accepts that same subset of strings.

Finally, I am really curious about performance. If we take a DFA as it is executed today as baseline, and compare to this new NFA path (running a DFA), what is the overhead?

if (stateSet.size() == 0) {
return null;
}
return new DState(stateSet.getArray());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't just use StateSet.freeze here?

@rmuir
Copy link
Member

rmuir commented Jul 28, 2021

This is super exciting! I'm amazed how little code you needed to get this first version running.

but a runautomaton for this won't run any queries on its own: brute force isn't how these queries actually work. the important part is the intersection (skipping around)...

I suggest, please let's not try to "overshare" and refactor all this stuff alongside DFA stuff until there is a query we can actually benchmark to see if the performance is even viable

@mikemccand
Copy link
Member

I suggest, please let's not try to "overshare" and refactor all this stuff alongside DFA stuff until there is a query we can actually benchmark to see if the performance is even viable

OK yeah +1 to keep it wholly separate (full fork) for now until we learn more how this NFARegexpQuery behaves.

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks amazing! I think it is close!

I love how you were able to slightly abstract out the RunAutomaton methods so that NFARunAutomaton could be swapped in instead, and no change is needed to AutomatonQuery.

But I think checking for determinizeWorkLimit==0 way down super low inside determinize is a bit too low level ...

I'm super eager to see how performance compares for NFA execution method (determinize lazily on demand) for AutomatonQuery versus the current "determinize up front" DFA approach.

}
}
Automaton dfa = regExp.toDFA();
NFARunAutomaton candidate = new NFARunAutomaton(regExp.toNFA());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another maybe powerful way to test NFA behavior would be to create any random DFA (e.g. make random set of strings and call that Daciuk/Mihov builder, or random RegExp.toDFA() like here, or maybe even randomly construct something state by state and transition by transition), then create a new test-only method that converts any DFA back into an NFA by randomly picking a DFA state and duplicating it (preserve all incoming and leaving transitions), or maybe by generating N strings accepted by the DFA and unioning them back into it. Do that N times so N states get duplicated. This should not alter the language accepted by the automaton, but should make it very "N".

Finally, from the DFA, randomly enumerate strings it accepts, and then assert the NFARunAutomaton also accepts them. And, sometimes randomly generate random strings that are not accepted by the DFA, and confirm the NFA also does not accept them.

continue;
}
Query dfaQuery = new AutomatonQuery(new Term(FIELD), a);
Query nfaQuery = new AutomatonQuery(new Term(FIELD), a, 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very cool you are able to completely reuse AutomatonQuery to run either an NFA or DFA by using determinizeWorkLimit = 0 to mean "create an NFA", since you made the two interfaces. Versus forking and then requiring the user make an NFAAutomatonQuery.

But I think it's a little dangerous to take such a low-level approach? (Your // nocommit above). Because that approach means anyone who calls determinize with a 0 work limit will quietly get an NFA back, not just AutomatonQuery who knows how to handle the NFA properly. I.e. other places that call determinize expect to always get back a deterministic automaton? Maybe, instead, we could instead just change AutomatonQuery to take an optional determinizeUpFront boolean, which would default to true (preserving current behavior)?

Or, maybe keep the determinizeWorkLimit==0 to mean "use NFA", but move that if up from way down in determinize to higher up in AutomatonQuery?

@zhaih zhaih marked this pull request as ready for review September 21, 2021 15:33
Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change! I think it is close.

I would love to know how this new "determinize on demand" performs, but we have to fix luceneutil to count Query init cost, and then find some "realistic" regexps to test.

But since we are not changing the behavior of AutomatonQuery (it still determinizes up-front by default), I don't think we need to block pushing this (once we iterate on all feedback) on benchmark results?

runAutomaton = compiled.runAutomaton;
compiledAutomaton = compiled;
if (compiled.nfaRunAutomaton != null) {
this.runAutomaton = compiled.nfaRunAutomaton;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of having separate nfaRunAutomaton and runAutomaton we could have only runAutomaton and a separate CompiledAutomaton.isDeterminized boolean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitating on doing that since there might be some method still rely on the truth that runAutomaton is of type RunAutomaton but not only using the method abstracted by ByteRunnable.

Also it might be a bit easier to spot if the nfaRunAutomaton is having problem at this stage? Probably we want to merge it after the NFA one is more mature?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK we can wait on this. Maybe just add a comment explaining why we need this odd if still, here and in the other places where we did this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I too would really prefer it if we can avoid the current "if". It starts to look like a "dance" in all the places where we do it. I don't understand about how it makes debugging easier, can't we just print automaton.isDeterministic() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm I checked it again and it's a bit complex to merge the runAutomaton and nfaRunAutomaton into one, because previously runAutomaon is having public access and kind of wildly used, it's type directly appears in some public API such as QueryVisitor#consumesTermsMatching, so it might take another PR to try to merge them, I left a todo for now.

An alternative way I'm thinking about is to move those if inside the CompiledAutomaton, and only expose methods like getByteRunnable so that people don't manipulate them outside, I'll include that in next commit.

* UTF32ToUTF8 conversion
* @param runnableType NFA or DFA
*/
public AutomatonQuery(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, so the existing ctor remains, defaulting to DFA execution strategy, where the automaton is first fully determinized.

But now you add another ctor, letting users also ask for NFA execution, where the automaton is determinized lazily on-demand and only in those parts that the terms in this index need to visit.

continue;
}
AutomatonQuery dfaQuery = new AutomatonQuery(new Term(FIELD), a);
AutomatonQuery nfaQuery = new AutomatonQuery(new Term(FIELD), a, ByteRunnable.TYPE.NFA);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a new LuceneTestCase method, newAutomatonQuery, and it would randomly pick between NFA and DFA type? And then in a few pre-existing tests, let's call newAutomatonQuery instead of new AutomatonQuery?

That method could also randomly make the automaton non-deterministic by simple cloning a few states? We can do this in a follow-on issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Let's delay it a bit to the next PR maybe (I plan to introduce NFARegexQuery in next PR probably).

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is getting really close! Thanks @zhaih!

runAutomaton = compiled.runAutomaton;
compiledAutomaton = compiled;
if (compiled.nfaRunAutomaton != null) {
this.runAutomaton = compiled.nfaRunAutomaton;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK we can wait on this. Maybe just add a comment explaining why we need this odd if still, here and in the other places where we did this.

/**
* Determinize the automaton lazily on-demand as terms are intersected. This option saves the
* up-front determinize cost, and can handle some RegExps that DFA cannot, but intersection will
* be a bit slower
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing period at the end of the sentence?

Maybe link to Russ Cox's famous page (https://swtch.com/~rsc/regexp/regexp1.html) and point out that this is similar to the Thompson NFA approach described there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh this page is linked in NFARunAutomaton's javadoc already :)

public interface ByteRunnable {

/** NFA or DFA */
enum TYPE {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add @lucene.experimental to the TYPE javadocs, and on each of the options (NFA, DFA)?

Also, could you pull out this enum into its own class under o.a.l.util.automaton, maybe AutomatonExecutionMode or AutomatonIntersectionMode or AutomatonExecutionStrategy (getting longish to type...)? I think it is too obscure living down inside this non-consumable (to Lucene users who don't know all sorts of details about automata) ByteRunnable now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I renamed it to RunAutomatonMode. How's that sounds like?

@zhaih
Copy link
Contributor Author

zhaih commented Oct 12, 2021

Hmmm one of the test failed

ERROR: The following test(s) have failed:
> Task :lucene:analysis:smartcn:test
  - org.apache.lucene.index.TestIndexFileDeleter.testExcInDecRef (:lucene:core)
:lucene:analysis:smartcn:test (SUCCESS): 21 test(s)
    Test output: /home/runner/work/lucene/lucene/lucene/core/build/test-results/test/outputs/OUTPUT-org.apache.lucene.index.TestIndexFileDeleter.txt

    Reproduce with: gradlew :lucene:core:test --tests "org.apache.lucene.index.TestIndexFileDeleter.testExcInDecRef" -Ptests.jvms=1 -Ptests.jvmargs=-XX:TieredStopAtLevel=1 -Ptests.seed=FD14DA9475FFAE2C -Ptests.slow=false -Ptests.file.encoding=UTF-8

But I can't reproduce locally...

@dweiss
Copy link
Contributor

dweiss commented Oct 12, 2021

The stack trace is interesting though - looks like access to a closed reader pool:

org.apache.lucene.index.TestIndexFileDeleter > testExcInDecRef FAILED
    org.apache.lucene.store.AlreadyClosedException: ReaderPool is already closed
        at __randomizedtesting.SeedInfo.seed([FD14DA9475FFAE2C:1489ADA6033649D1]:0)
        at app//org.apache.lucene.index.ReaderPool.get(ReaderPool.java:400)
        at app//org.apache.lucene.index.IndexWriter.writeReaderPool(IndexWriter.java:3742)
        at app//org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:590)
        at app//org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:474)
        at app//org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:406)
        at app//org.apache.lucene.index.TestIndexFileDeleter.testExcInDecRef(TestIndexFileDeleter.java:484)

@dweiss
Copy link
Contributor

dweiss commented Oct 12, 2021

I re-ran the jobs. The stack trace from the error does look suspicious though.

@zhaih
Copy link
Contributor Author

zhaih commented Oct 14, 2021

Thanks @dweiss seems this is not the first time we see this error: https://issues.apache.org/jira/browse/LUCENE-9839

@mikemccand
Copy link
Member

Thanks @dweiss seems this is not the first time we see this error: https://issues.apache.org/jira/browse/LUCENE-9839

Looks like this is (scarily!) pre-existing. I don't think it should block this change, and I cannot see anything here that might cause that scary exception.

Copy link
Member

@mikemccand mikemccand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the slow reply here! The change looks awesome @zhaih. It defaults execution the current approach (DFA for up front determinize) but opens the option to do NFA instead, which determinizes on demand. I left a tiny comment but otherwise I think this is ready. I think we should target 9.x / main for this.


New Features

* LUCENE-10010 Introduce NFARunAutomaton to run NFA directly. (Haoyu Zhai)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change to Patrick Zhai for consistency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nice catch!

@rmuir
Copy link
Member

rmuir commented Nov 29, 2021

But since we are not changing the behavior of AutomatonQuery (it still determinizes up-front by default), I don't think we need to block pushing this (once we iterate on all feedback) on benchmark results?

Can we at least run existing benchmarks to confirm it doesn't introduce regressions? Some of the code is performance sensitive and we are adding abstractions here. damn java.

I'm still confused about the API, as it adds lots of booleans/enums. Do we really need enums, or can a simple boolean suffice? Does AutomatonQuery really need such boolean, or should it just look at Automaton.isDeterministic to determine what to do? Depending on what performance shows, why even keep around a both determinizeWorkLimit and the booleans/enums? Instead of throwing an exception, why not fall back to NFA if you really want to run some crazy regexp?

@rmuir
Copy link
Member

rmuir commented Nov 29, 2021

I made a quick prototype with what i mean for the API: #485

The idea is that AutomatonQuery shouldn't be determinizing. Let's push this to the caller. If they pass it a DFA, it uses DFA algorithm. If they pass it NFA, it can use the NFA algorithm (it currently throws an exception in my branch, instead of slowly determinizing, that is the change).

@zhaih
Copy link
Contributor Author

zhaih commented Nov 29, 2021

Thanks @rmuir, I'll run a benchmark to ensure this PR does not introduce regression recently.

I like the approach you proposed in #485, it would be nice if we can get rid of determinizeWorkLimit in some classes that previously exists everywhere. One reason for carrying an enum and the determinizeWorkLimit together is that we might want to use that determinizeWorkLimit to limit the number of state that NFA can cache as well. But that's a feature not implemented yet and could be done in some other ways.

I think we can try to get that pushed and then I can rebase this one after.

@zhaih
Copy link
Contributor Author

zhaih commented Dec 2, 2021

@rmuir I've done the benchmark! (Sorry for the delay): and result looks plain (which is good for this change)

                    Task    QPS base      StdDev    QPS cand      StdDev                Pct diff p-value
            OrNotHighMed      523.45      (3.0%)      518.00      (1.7%)   -1.0% (  -5% -    3%) 0.176
                HighTerm     1385.51      (1.9%)     1374.10      (2.5%)   -0.8% (  -5% -    3%) 0.232
                  Fuzzy2       35.19      (5.9%)       34.91      (5.6%)   -0.8% ( -11% -   11%) 0.666
   HighTermDayOfYearSort      779.01      (2.5%)      772.93      (2.4%)   -0.8% (  -5% -    4%) 0.315
                 Respell       36.33      (1.3%)       36.05      (1.6%)   -0.8% (  -3% -    2%) 0.086
        HighSloppyPhrase        5.08      (6.2%)        5.04      (6.1%)   -0.8% ( -12% -   12%) 0.700
              AndHighLow      349.64      (3.1%)      347.17      (3.5%)   -0.7% (  -7% -    6%) 0.501
       HighTermMonthSort       37.82      (7.0%)       37.55      (6.9%)   -0.7% ( -13% -   14%) 0.751
                 LowTerm     1041.47      (2.5%)     1034.72      (2.1%)   -0.6% (  -5% -    3%) 0.367
            OrNotHighLow      558.55      (2.0%)      555.05      (2.3%)   -0.6% (  -4% -    3%) 0.358
              TermDTSort      149.25      (3.7%)      148.36      (3.3%)   -0.6% (  -7% -    6%) 0.592
         LowSloppyPhrase       46.63      (3.7%)       46.37      (3.9%)   -0.6% (  -7% -    7%) 0.647
           OrHighNotHigh      454.25      (1.6%)      452.05      (1.6%)   -0.5% (  -3% -    2%) 0.341
         MedSloppyPhrase       22.73      (2.4%)       22.63      (3.0%)   -0.4% (  -5% -    5%) 0.627
             AndHighHigh       20.19      (2.4%)       20.11      (2.0%)   -0.4% (  -4% -    4%) 0.564
              AndHighMed       42.22      (2.1%)       42.05      (2.0%)   -0.4% (  -4% -    3%) 0.549
            OrHighNotLow      578.56      (1.7%)      576.35      (2.8%)   -0.4% (  -4% -    4%) 0.599
    HighIntervalsOrdered       11.48      (4.0%)       11.44      (3.9%)   -0.4% (  -7% -    7%) 0.759
    HighTermTitleBDVSort        9.27      (2.8%)        9.23      (2.7%)   -0.4% (  -5% -    5%) 0.664
                 MedTerm      963.30      (2.1%)      959.98      (1.5%)   -0.3% (  -3% -    3%) 0.544
                Wildcard       83.18      (4.2%)       82.91      (4.4%)   -0.3% (  -8% -    8%) 0.811
     LowIntervalsOrdered       11.41      (3.2%)       11.38      (2.8%)   -0.3% (  -6% -    5%) 0.768
     MedIntervalsOrdered       16.77      (2.6%)       16.72      (2.3%)   -0.3% (  -5% -    4%) 0.722
            HighSpanNear        1.06      (1.0%)        1.05      (1.3%)   -0.2% (  -2% -    2%) 0.506
             LowSpanNear        3.06      (1.1%)        3.05      (1.4%)   -0.2% (  -2% -    2%) 0.599
BrowseDayOfYearSSDVFacets        2.88      (2.5%)        2.87      (2.6%)   -0.2% (  -5% -    5%) 0.827
           OrNotHighHigh      466.02      (2.3%)      465.36      (1.9%)   -0.1% (  -4% -    4%) 0.828
    BrowseDateTaxoFacets        0.84      (5.3%)        0.84      (4.4%)   -0.1% (  -9% -   10%) 0.933
                  IntNRQ       28.10      (0.7%)       28.08      (1.2%)   -0.1% (  -1% -    1%) 0.762
BrowseDayOfYearTaxoFacets        0.84      (5.3%)        0.84      (4.4%)   -0.1% (  -9% -   10%) 0.957
   BrowseMonthSSDVFacets        2.99      (3.5%)        2.99      (3.7%)   -0.1% (  -7% -    7%) 0.962
   BrowseMonthTaxoFacets        0.87      (5.6%)        0.87      (4.7%)   -0.0% (  -9% -   10%) 0.979
               OrHighLow      373.02      (3.0%)      372.89      (2.7%)   -0.0% (  -5% -    5%) 0.968
                PKLookup      131.84      (2.4%)      131.83      (2.9%)   -0.0% (  -5% -    5%) 0.992
                 Prefix3       93.52      (3.2%)       93.51      (3.5%)   -0.0% (  -6% -    6%) 0.997
              OrHighHigh       30.69      (2.1%)       30.69      (2.1%)    0.0% (  -4% -    4%) 0.989
             MedSpanNear       23.34      (1.3%)       23.35      (1.2%)    0.0% (  -2% -    2%) 0.958
                  Fuzzy1       44.19      (7.7%)       44.20      (8.2%)    0.0% ( -14% -   17%) 0.991
              HighPhrase       89.39      (2.0%)       89.43      (1.9%)    0.0% (  -3% -    4%) 0.943
               LowPhrase       27.84      (2.0%)       27.88      (2.6%)    0.1% (  -4% -    4%) 0.853
               OrHighMed       31.74      (1.9%)       31.80      (1.8%)    0.2% (  -3% -    3%) 0.763
               MedPhrase       82.20      (1.7%)       82.54      (2.4%)    0.4% (  -3% -    4%) 0.520
            OrHighNotMed      501.51      (1.7%)      505.78      (2.3%)    0.9% (  -3% -    4%) 0.181

}

@Override
public boolean equals(Object o) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EqualsGetClass: Overriding Object#equals in a non-final class by using getClass rather than instanceof breaks substitutability of subclasses. (details)
(at-me in a reply with help or ignore)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sonatype-lift ignore

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've recorded this as ignored for this pull request. If you change your mind, just comment @sonatype-lift unignore.

@rmuir
Copy link
Member

rmuir commented Dec 4, 2021

ok, see #513 which is another PR like the #485, just for RegExp class. all the trappy minimization is removed, it may return a DFA or NFA: and it is the callers choice to minimize. Combined with #485 I think it gives us the opportunity for a simple API, e.g. this stuff would just happen in one place (e.g. RegExpQuery class) rather than strewn all about.

@rmuir
Copy link
Member

rmuir commented Dec 8, 2021

ok, sorry, I realize the latest two PR-split-outs still don't solve your problem. The API is still ugly because we still minimize() regexps, and that implies determinize()... but maybe isn't obvious.

I don't think we need to minimize() any more, I opened a separate discussion: https://issues.apache.org/jira/browse/LUCENE-10296. I will try to prototype a PR.

Instead, the decisions on this issue should just be about DFA or NFA!

@rmuir
Copy link
Member

rmuir commented Dec 8, 2021

ok one more, but I think it sets us up even better: #528

try {
regExp = new RegExp(AutomatonTestUtil.randomRegexp(random()));
} catch (IllegalArgumentException e) {
ignoreException(e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also annotate the IllegalArgumentException with @SuppressWarnings("unused").

as far as the random regexp, I think it already does what you want? One tricky thing is, the strings it generates are only valid if you then build regexp with RegExp.NONE. That is probably what causes confusion here.

runAutomaton = compiled.runAutomaton;
compiledAutomaton = compiled;
if (compiled.nfaRunAutomaton != null) {
this.runAutomaton = compiled.nfaRunAutomaton;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I too would really prefer it if we can avoid the current "if". It starts to look like a "dance" in all the places where we do it. I don't understand about how it makes debugging easier, can't we just print automaton.isDeterministic() ?

@zhaih
Copy link
Contributor Author

zhaih commented Dec 20, 2021

OK @rmuir some new commits are ready to be reviewed! Please take your time :)

Copy link
Member

@rmuir rmuir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for all the hard work and patience here @zhaih !

@zhaih
Copy link
Contributor Author

zhaih commented Dec 20, 2021

Thank you @rmuir and @mikemccand for reviewing this big PR! I'll merge it myself :)

@zhaih zhaih merged commit c972c6a into apache:main Dec 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants