Too many labels result in a crash #2800
Closed
Labels
bug (Bugs and behaviour differing from documentation), feat / ner (Feature: Named Entity Recognizer), training (Training and updating models)
Description
Hi, I'm currently trying to train a custom model with over 125 labels and I encounter the following error:
Windows 10
Process finished with exit code -1073740791 (0xC0000409)
Ubuntu 18.04
*** stack smashing detected ***: <unknown> terminated
Aborted (core dumped)
There seems to be a limit: under 125 labels training works; above that, it crashes.
How to reproduce the behaviour
import random
import spacy

def __train_model(self, train_data, entity_types):
    nlp = spacy.blank("en")
    ner = nlp.create_pipe("ner")
    nlp.add_pipe(ner)
    # Register every entity type as an NER label
    for entity_type in list(entity_types):
        ner.add_label(entity_type)
    optimizer = nlp.begin_training()
    # Start training
    for i in range(20):
        losses = {}
        index = 0
        random.shuffle(train_data)
        for statement, entities in train_data:
            nlp.update([statement], [entities], sgd=optimizer, losses=losses, drop=0.5)
    return nlp

Unit Test
def test_train_with_max_supported_entity_types(self):
    train_data = TrainData()
    train_data.extend([("One sentence", {"entities": []})])
    entity_types = {i for i in range(125)}
    model = self.train_model_processor.train(train_data, entity_types)
    assert_is_not_none(model)

So in the unit test, whenever the length of entity_types goes beyond 125, it crashes.
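Because the failure is a native crash rather than a Python exception, it takes the whole test runner down with it. One way to probe for the threshold is to run each attempt in a child interpreter and look at its exit code. This is only a minimal sketch: `run_isolated` is a hypothetical helper, not part of spaCy, and the snippets in the demo are stand-ins for the training code above.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=600):
    """Run a Python snippet in a child interpreter and return its exit code.

    Hypothetical helper (not part of spaCy). A native crash such as a
    stack-smashing abort kills only the child process, so the caller
    sees a non-zero exit code instead of dying itself.
    """
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    return result.returncode


if __name__ == "__main__":
    # Stand-ins for "training succeeded" and "training crashed natively".
    print(run_isolated("print('ok')"))
    print(run_isolated("import os; os.abort()"))
```

Passing a snippet that trains with N labels and stepping N upwards would show where the exit code first becomes non-zero, without losing the rest of the test run.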
Your Environment
- spaCy version: 2.0.12
- Platform: Windows-10-10.0.16299-SP0
- Python version: 3.7.0
- Environment Information: 16 GB RAM, CPU: i7-3630QM
Any idea whether there is a limit on the number of labels? If so, should spaCy raise an error describing the problem instead of crashing?
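As a caller-side stopgap, something like the following could turn the crash into a readable error before any labels are added. This is only a sketch: `check_label_count` is a hypothetical helper, and the default of 125 is an assumption taken from the behaviour observed in this report, not a documented spaCy limit.

```python
def check_label_count(labels, max_labels=125):
    """Raise a descriptive error if too many distinct labels are passed.

    Hypothetical helper. max_labels=125 is an assumption based on the
    crash described in this report, not a documented spaCy limit.
    """
    count = len(set(labels))
    if count > max_labels:
        raise ValueError(
            "Got {} entity labels, but at most {} are assumed to be safe; "
            "training with more has been observed to crash.".format(count, max_labels)
        )
    return count


if __name__ == "__main__":
    print(check_label_count(str(i) for i in range(125)))
    try:
        check_label_count(str(i) for i in range(126))
    except ValueError as err:
        print(err)
```

Calling it on `entity_types` at the top of the training function would fail fast with a message instead of an abort.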