Summary
AbstractTrainer.train() (mode 0 — train only) always reads Wapiti training parameters (epsilon, window, nbMaxIterations) from GrobidProperties (i.e. grobid.yaml) and silently ignores any values previously set on the trainer instance via setParams(). The other training modes do not have this problem.
Inconsistency in the current code
AbstractTrainer already holds instance fields for these parameters and exposes setParams() to override them at runtime (line 70):
    public void setParams(double epsilon, int window, int nbMaxIterations) {
        this.epsilon = epsilon;
        this.window = window;
        this.nbMaxIterations = nbMaxIterations;
    }
splitTrainEvaluate() (mode 2) and nFoldEvaluate() (mode 3) correctly honour these instance fields as overrides:
// AbstractTrainer.java — splitTrainEvaluate() and nFoldEvaluate()
    if (epsilon != 0.0)
        trainer.setEpsilon(epsilon);
    if (window != 0)
        trainer.setWindow(window);
    if (nbMaxIterations != 0)
        trainer.setNbMaxIterations(nbMaxIterations);
But train() (mode 0) ignores them entirely and always uses the config values:
// AbstractTrainer.java — train() — BUG: instance overrides silently dropped
trainer.setEpsilon(GrobidProperties.getEpsilon(model));
trainer.setWindow(GrobidProperties.getWindow(model));
trainer.setNbMaxIterations(GrobidProperties.getNbMaxIterations(model));
Proposed patch
Apply the same override pattern that modes 2 and 3 already use:
// AbstractTrainer.java — train()
trainer.setEpsilon(epsilon != 0.0 ? epsilon : GrobidProperties.getEpsilon(model));
trainer.setWindow(window != 0 ? window : GrobidProperties.getWindow(model));
trainer.setNbMaxIterations(nbMaxIterations != 0 ? nbMaxIterations : GrobidProperties.getNbMaxIterations(model));
This is a one-line change per parameter and is fully backwards compatible: when setParams() has not been called, the instance fields default to 0 / 0.0, so the config values are used exactly as before.
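The fallback rule can be checked in isolation. The following is a minimal, stand-alone sketch of that rule only: ParamResolver, its method names, and the placeholder config values are hypothetical and not part of GROBID, and the second argument stands in for whatever GrobidProperties would return.

```java
/**
 * Stand-alone sketch of the override-or-fallback rule proposed for train().
 * ParamResolver and its methods are hypothetical; they are not GROBID classes.
 */
final class ParamResolver {

    private ParamResolver() {}

    /** Uses the instance override unless it is still at its default of 0.0. */
    static double resolveDouble(double override, double configValue) {
        return override != 0.0 ? override : configValue;
    }

    /** Uses the instance override unless it is still at its default of 0. */
    static int resolveInt(int override, int configValue) {
        return override != 0 ? override : configValue;
    }

    public static void main(String[] args) {
        // Placeholder values standing in for what grobid.yaml would supply.
        double configEpsilon = 0.00001;
        int configWindow = 20;
        int configMaxIterations = 2000;

        // setParams() never called: every field is 0 / 0.0, so config wins.
        System.out.println(resolveDouble(0.0, configEpsilon));
        System.out.println(resolveInt(0, configWindow));

        // setParams(0.001, 0, 500): epsilon and nbMaxIterations are overridden,
        // window stays at 0 and therefore still comes from the config.
        System.out.println(resolveDouble(0.001, configEpsilon));
        System.out.println(resolveInt(0, configWindow));
        System.out.println(resolveInt(500, configMaxIterations));
    }
}
```

Because 0 / 0.0 doubles as the "not set" marker, a caller cannot request a value of exactly zero through setParams(). The checks already used in modes 2 and 3 have the same property, so the patch does not introduce a new limitation.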
Use case
This fix lets callers that construct a trainer programmatically supply runtime parameters without modifying grobid.yaml. A concrete downstream use case is exposing epsilon and nbMaxIterations as fields on an HTTP training API, so that the convergence threshold and iteration limit can be tuned per request: for example, a looser epsilon during iterative training-data development and a tighter one for production model builds.
It also makes the behaviour of mode 0 consistent with modes 2 and 3, which already support this pattern.
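As a rough illustration of the HTTP use case, the sketch below shows one way a request object could map omitted fields onto the 0 / 0.0 "not set" values, so that the config fallback still applies to them. TrainingRequest and its methods are hypothetical and not part of GROBID; only the setParams(double, int, int) signature quoted in the comment comes from the code above.

```java
/**
 * Hypothetical request body for an HTTP training endpoint; not a GROBID class.
 * Fields the client leaves out of the request stay null.
 */
final class TrainingRequest {
    private final Double epsilon;          // null when the client did not send it
    private final Integer nbMaxIterations; // null when the client did not send it

    TrainingRequest(Double epsilon, Integer nbMaxIterations) {
        this.epsilon = epsilon;
        this.nbMaxIterations = nbMaxIterations;
    }

    /** Maps a missing epsilon to 0.0, the value that selects the config fallback. */
    double epsilonOrUnset() {
        return epsilon != null ? epsilon : 0.0;
    }

    /** Maps a missing iteration limit to 0, the value that selects the config fallback. */
    int nbMaxIterationsOrUnset() {
        return nbMaxIterations != null ? nbMaxIterations : 0;
    }

    public static void main(String[] args) {
        // A request that loosens epsilon but leaves the iteration limit to the config.
        TrainingRequest looseRun = new TrainingRequest(0.001, null);

        // With the patch applied, a handler holding a trainer instance could call:
        //   trainer.setParams(looseRun.epsilonOrUnset(), 0, looseRun.nbMaxIterationsOrUnset());
        // Passing 0 for window leaves that parameter to the config as well.
        System.out.println(looseRun.epsilonOrUnset());
        System.out.println(looseRun.nbMaxIterationsOrUnset());
    }
}
```

Without the patch, the same setParams() call would take effect in modes 2 and 3 but be silently dropped in mode 0.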
Affected file
grobid-trainer/src/main/java/org/grobid/trainer/AbstractTrainer.java, lines 92–94.