
Conversation

@srhrshr srhrshr commented Oct 27, 2020

@desilinguist, this PR closes #624. I've added a hard minimum on the number of examples for the learning_curve method. Let me know if this is what you had in mind.


codecov bot commented Oct 27, 2020

Codecov Report

Merging #631 into main will not change coverage.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##             main     #631   +/-   ##
=======================================
  Coverage   95.10%   95.10%           
=======================================
  Files          27       27           
  Lines        3083     3087    +4     
=======================================
+ Hits         2932     2936    +4     
  Misses        151      151           
Impacted Files             Coverage Δ
skll/learner/__init__.py   96.26% <100.00%> (+0.02%) ⬆️


     cv_folds=10,
-    train_sizes=np.linspace(0.1, 1.0, 5)):
+    train_sizes=np.linspace(0.1, 1.0, 5),
+    min_training_examples=500):
Collaborator

Hmm, I am not sure we want to make this an argument? If we know that fewer than 500 examples lead to unreliable curves, why not simply hardcode that and add an argument like override_minimum, which is False by default but which users can set if they really want to override the check. So, in this case, we would raise a ValueError if the number of examples doesn't meet the 500 minimum. We would downgrade that error to a warning if the user sets override_minimum to True.

What do you guys think @aoifecahill @mulhod @bndgyawali ?
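Below is a rough sketch of the hardcoded-minimum check being proposed, just to make the idea concrete. The constant name, the helper function, and the logger handling are all hypothetical; SKLL's actual implementation may differ.

import logging

# Hypothetical constant and helper illustrating the proposal above.
MIN_EXAMPLES_FOR_LEARNING_CURVE = 500


def check_learning_curve_minimum(num_examples, override_minimum=False, logger=None):
    """Raise by default; only warn when the caller explicitly overrides the check."""
    if num_examples >= MIN_EXAMPLES_FOR_LEARNING_CURVE:
        return
    message = (f"Learning curves can be unreliable with only {num_examples} "
               f"examples (fewer than {MIN_EXAMPLES_FOR_LEARNING_CURVE}).")
    if override_minimum:
        # downgrade the error to a warning if the user opted in
        (logger or logging.getLogger(__name__)).warning(message)
    else:
        raise ValueError(message)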

Collaborator

BTW, whatever we end up doing, we would need to add new tests.

Contributor

Yeah, override_minimum sounds good to me.


pep8speaks commented Oct 27, 2020

Hello @srhrshr! Thanks for updating this PR.

Line 91:28: E127 continuation line over-indented for visual indent

Comment last updated at 2020-10-28 16:20:44 UTC
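For reference, E127 means a continuation line is indented past the opening bracket it should visually align with. A generic illustration (not the exact line pep8speaks flagged):

# E127: continuation line over-indented for visual indent
value = max(10,
               20)

# fixed: align the continuation with the character after the open parenthesis
value = max(10,
            20)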

Comment on lines 1652 to 1659
override_minimum : bool, optional
Should this be True, learning curve would be generated
even with less than ideal number of `examples` (500).
However, by default, if the number of `examples` in the FeatureSet is
less than 500, an exception is raised, because
learning curves can be very unreliable
for very small sizes esp. if you have > 2 labels.
Defaults to False.
Collaborator

Suggested change
override_minimum : bool, optional
    Learning curves can be unreliable for very small sizes,
    especially for > 2 labels. If this option is set to ``True``,
    the learning curve would be generated even if the number
    of examples is less than 500, along with a warning. If ``False``,
    the curve is not generated and an exception is raised instead.
    Defaults to ``False``.


    # this must throw an error because `examples` has less than 500 items
    _ = learner.learning_curve(examples=train_fs_less_than_500, metric='accuracy',
                               override_minimum=False)
Collaborator

I think we don't want to specify override_minimum here since the goal is to check that the error is raised by default.

I think we need another test, where we specify override_minimum as True, and then check that the warning was output to the log file. I believe we have other tests that already check for warnings in the log, so you should be able to adapt those.
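For context, the generic pattern for asserting that a warning reached a log file looks roughly like this; the logger name and message below are illustrative, not SKLL's actual helpers or wording.

import logging
import re
import tempfile

# create a temporary log file and attach a file handler to a logger
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as log_file:
    log_path = log_file.name

logger = logging.getLogger("warning_check_example")
handler = logging.FileHandler(log_path)
logger.addHandler(handler)

# emit a warning, then check that it landed in the file
logger.warning("learning curve generation is unreliable with few examples")
handler.flush()

with open(log_path) as fh:
    assert re.search(r"unreliable", fh.read())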

Contributor Author

That makes sense - I've updated this test and have also added a separate test to check that the warning is raised: https://github.com/EducationalTestingService/skll/pull/631/files#diff-d8e9ffeba7d07b6f722f054d5a627f2582ba38b8b90ea090fb38d30cea565936R261-R292

Contributor

@mulhod mulhod left a comment

Looks great! I'm not sure why the build has not yet completed, though. Looks like it's stuck in queued status. We might need to manually run it again.

Update: Actually, it does seem to be running now.

Collaborator

@desilinguist desilinguist left a comment

We are almost there @srhrshr ! Thanks for being so receptive to our suggestions 🙇🏽.

Comment on lines 244 to 292
@raises(ValueError)
def test_custom_learner_learning_curve_min_examples():
    """
    Test to check learning curve raises error with less than 500 examples
    """
    # generates a training split with less than 500 examples
    train_fs_less_than_500, _ = make_classification_data(num_examples=499,
                                                          train_test_ratio=1.0,
                                                          num_labels=3)

    # creating an example learner
    learner = Learner('LogisticRegression')

    # this must throw an error because `examples` has less than 500 items
    _ = learner.learning_curve(examples=train_fs_less_than_500, metric='accuracy')


def test_custom_learner_learning_curve_min_examples_override():
    """
    Test to check learning curve displays warning with less than 500 examples
    """

    # creates a logger which writes to a temporary log file
    log_dir = join(_my_dir, 'log')
    log_file = NamedTemporaryFile("w", delete=False, dir=log_dir)
    logger = get_skll_logger("test_custom_learner_learning_curve_min_examples_override",
                             filepath=log_file.name)

    # generates a training split with less than 500 examples
    train_fs_less_than_500, _ = make_classification_data(num_examples=499,
                                                          train_test_ratio=1.0,
                                                          num_labels=3)

    # creating an example learner
    learner = Learner('LogisticRegression', logger=logger)

    # this should only log a warning (not raise) because `examples` has fewer
    # than 500 items but `override_minimum` is True
    _ = learner.learning_curve(examples=train_fs_less_than_500, metric='accuracy',
                               override_minimum=True)

    # checks that the learning_curve warning message is contained in the log file
    with open(log_file.name) as tf:
        log_text = tf.read()
        learning_curve_warning_re = \
            re.compile(r'Because the number of training examples provided - '
                       r'\d+ - is less than the ideal minimum - \d+ - '
                       r'learning curve generation is unreliable'
                       r' and might break')
        assert learning_curve_warning_re.search(log_text)
Collaborator

These look great @srhrshr ! One minor thing: is there a reason why these tests are in test_custom_learner.py rather than test_output.py, where the other learning curve tests are? These don't have anything to do with custom learners, right?

Contributor Author

That's a good point - I didn't realize there were learning curve tests as I was looking for tests of the Learner* class all this while 🤦 Will move the tests to test_output.py

Contributor Author

Done @desilinguist !

from skll.learner import Learner
from skll.utils.constants import KNOWN_DEFAULT_PARAM_GRIDS

from skll.utils.logging import (get_skll_logger)
Collaborator

Suggested change
from skll.utils.logging import (get_skll_logger)
from skll.utils.logging import get_skll_logger

glob(join(output_dir, 'test_model_custom_learner_*'))):
os.unlink(output_file)

for log_file in glob(join(log_dir, '*')):
Collaborator

I think rather than deleting all log files which could affect other tests, we should only delete the specific log file we are actually generating.
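A minimal sketch of the targeted cleanup being suggested, with a hypothetical file name; the point is to unlink only the file this test wrote rather than globbing the whole directory.

import os
from os.path import exists, join


def remove_test_log(output_dir):
    # hypothetical name for the log file that the new test creates
    log_path = join(output_dir, 'test_learning_curve_min_examples.log')
    if exists(log_path):
        os.unlink(log_path)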

Contributor Author

Done! I've also cleaned up the tearDown function a bit, and instead of creating a new log dir, I've written the log file to the output dir, since log files are already written there in a few other places.

Collaborator

Thanks for doing this. One minor nit I want to pick is that I am not a fan of single letter variable names and nested list comprehensions when they aren't necessary and sacrifice readability. I think in this case, nested for loops would work just fine.
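As a generic illustration of the readability point (not the actual SKLL code), the same flattening written both ways:

paths_by_dir = {'log': ['a.log', 'b.log'], 'output': ['c.tsv']}

# nested list comprehension with single-letter names
flattened = [f for d in paths_by_dir for f in paths_by_dir[d]]

# equivalent nested for loops, arguably easier to scan
flattened = []
for directory, filenames in paths_by_dir.items():
    for filename in filenames:
        flattened.append(filename)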

Contributor Author

I totally get that - this practice of mine has always been a little polarizing :P
Made this change.

Collaborator

Thank you!

@srhrshr srhrshr force-pushed the learning_curve_check branch from 85fd735 to b546646 on October 27, 2020 at 23:33
@desilinguist
Collaborator

Seems like a missing import?

@desilinguist
Collaborator

Hmm, I am seeing a partial failure on the Windows builds:

     for output_file in glob(join(output_dir, 'test_{}_*'.format(suffix))) \
-            + glob(join(output_dir, 'test_{}.log'.format(suffix))) \
-            + glob(join(output_dir, 'test_majority_class_custom_learner_*')):
+            + glob(join(output_dir, 'test_{}.log'.format(suffix))):
Contributor

It looks like this isn't a glob pattern? I think it could be dealt with elsewhere. I'm not sure if this has anything to do with the build failure, but it's probably better to remove it above (or below) if it exists.

Contributor Author

I don't think this was the issue. But line 90 ('test_{}_*'.format(suffix)) was also matching the log filename of my test, despite my trying to be super specific about the filename I used. I've changed it now.

Btw, is the tearDown method called after each test? Because that would explain the race condition.

Collaborator

Yes, I believe that's right.

Contributor

@mulhod mulhod left a comment

The file handle thing needs to be fixed. It's causing the build to fail.

Contributor

@mulhod mulhod left a comment

LGTM!

Collaborator

@desilinguist desilinguist left a comment

Finally! Looks great! Excellent work @srhrshr !

@desilinguist desilinguist merged commit 7a7794f into EducationalTestingService:main Oct 28, 2020


Development

Successfully merging this pull request may close these issues.

Learning curve breaks for very small data sizes
