
[MRG] Fix segfault in AgglomerativeClustering with read-only mmaps #12485

Merged: amueller merged 1 commit into scikit-learn:master from rth:AgglomerativeClustering-segfault-rommap on Nov 6, 2018

Conversation

@rth
Member

@rth rth commented Oct 29, 2018

This fixes a segfault in AgglomerativeClustering with read-only mmaps that happens inside ward_tree when calling scipy.cluster.hierarchy.ward.

Closes #12483

(see the above issue for more details)
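As an illustration of the situation described above, here is a minimal sketch of how a read-only memory-mapped array can be turned into a writable in-memory copy before it is handed to a routine that may not cope with read-only buffers. It assumes the workaround is simply to request a writable array with `np.require`; the helper name `as_writable`, the file name, and the data are hypothetical, and this is not presented as the exact change made in this pull request.

```python
import os
import tempfile

import numpy as np


def as_writable(X):
    """Return X itself if it is writable, otherwise a writable copy.

    Hypothetical helper for illustration only.
    """
    # np.require copies the data only when the "W" (writeable) flag
    # is not already satisfied by the input array.
    return np.require(X, requirements="W")


# Write some made-up data to disk so it can be memory-mapped.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "X.dat")
data = np.random.RandomState(0).rand(20, 3)
data.tofile(path)

# mode="r" opens the file as a read-only memory map.
X_ro = np.memmap(path, dtype=np.float64, mode="r", shape=(20, 3))

# The result is an ordinary writable array holding the same values,
# which could then be passed on to e.g. scipy.cluster.hierarchy.ward.
X_w = as_writable(X_ro)
```

The copy costs extra memory proportional to the size of the input, which is the trade-off for never handing a read-only buffer to compiled code.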

@rth rth changed the title from "Fix segfault in AgglomerativeClustering with read-only mmaps" to "[MRG] Fix segfault in AgglomerativeClustering with read-only mmaps" Oct 29, 2018
Contributor

@eamanu eamanu left a comment


LGTM

@jnothman
Member

Non-regression test? Common test?

@rth
Member Author

rth commented Oct 31, 2018

> Non-regression test? Common test?

The failing test in #12483 is already a common test. It passes for other estimators but segfaults for this one with a given version of gcc in a particular environment. As I mentioned in #12483 (comment), the segfault is deterministic, but I was not able to reproduce it by passing that specific array directly to scipy's ward function. So I'm not sure what non-regression test could be added.
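For readers who want to exercise this code path themselves, the following is a rough sketch of fitting AgglomerativeClustering on a read-only memory-mapped array. It is not the common test referred to above; the data, shapes, and file name are made up, and on an affected build the `fit` call could presumably crash the interpreter rather than raise an exception.

```python
import os
import tempfile

import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Save some made-up data so it can be loaded back as a memory map.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "X.npy")
np.save(path, np.random.RandomState(42).rand(30, 4))

# mmap_mode="r" yields a read-only np.memmap backed by the file.
X_ro = np.load(path, mmap_mode="r")

# Ward linkage is the code path that goes through ward_tree.
model = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = model.fit(X_ro).labels_
```

Because the crash discussed here depended on the build environment, a successful run of this sketch on one machine says little about another.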

Member

@jnothman jnothman left a comment


Okay. Does it deserve a what's new in 0.20.1?

@rth
Member Author

rth commented Nov 6, 2018

> Does it deserve a what's new in 0.20.1?

I'm not convinced it does. The segfault only happens in a very peculiar environment that we are not able to determine exactly (it depends on the gcc version, but other systems with the same gcc version do not segfault), and only for read-only arrays. I have a hard time believing that it would affect users in practice: there is no reason to use joblib.Parallel or read-only mmaps with AgglomerativeClustering.

@rth rth added this to the 0.20.1 milestone Nov 6, 2018
@amueller amueller merged commit 5196657 into scikit-learn:master Nov 6, 2018
@amueller
Member

amueller commented Nov 6, 2018

think we don't need a whatsnew.

@rth rth deleted the AgglomerativeClustering-segfault-rommap branch November 6, 2018 16:56
Linked issue that merging this pull request may close: Test suite segfault on Linux/x86_64/Python 3.7 with old GCC

4 participants