Choice of words in documentation #13639

@tommyod

Hi all,

I was browsing through the start of the documentation, and didn't get far before noticing that quite a lot of synonyms are used in modules/linear_model.rst (html and source).

Example 1

In the introductory paragraph for the Lasso, three different words are used to describe the model coefficients. The emphasis below is mine.

The Lasso is a linear model that estimates sparse coefficients. It is useful in some contexts due to its tendency to prefer solutions with fewer parameter values, effectively reducing the number of variables upon which the given solution is dependent. For this reason, the Lasso and its variants are fundamental to the field of compressed sensing. Under certain conditions, it can recover the exact set of non-zero weights (see Compressive sensing: tomography reconstruction with L1 prior (Lasso)).

Example 2

The word features has a lot of synonyms. From the descriptions of Ridge, Lasso, and other related models, I found sentences such as:

  • "...datasets with many collinear regressors..."
  • "...linear combination of the input variables..."
  • "...selected features are the same ..."
  • "...it finds the predictor most correlated with the response..."
  • "...when the number of dimensions is significantly greater than..."

Questions

I worry that the synonyms might make the documentation unnecessarily complicated.

  • Do you agree with this? Should I clean it up in a PR?

If you agree, can you help me decide on preferred words? I get the impression that:

  • Features is preferred over regressors, variables, predictors and dimensions.
  • Samples is preferred over observations, points.
  • Values stored in coef_ are denoted w in the documentation. Should we prefer coefficients or weights? The word parameter is also used, but I prefer to reserve that for regularization (hyper)parameters such as alpha in Ridge.
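
To get a rough sense of how widespread each synonym is before settling on preferred words, the occurrences could be counted with a small script. This is only a sketch under my own assumptions: the function name count_synonyms and the SYNONYMS grouping are hypothetical (the grouping just mirrors the bullets above), and the sample text reuses three of the fragments quoted in Example 2 rather than reading linear_model.rst itself.

```python
import re
from collections import Counter

# Hypothetical grouping taken from the bullets above:
# preferred term -> candidate synonyms to replace.
SYNONYMS = {
    "features": ["regressors", "variables", "predictors", "dimensions"],
    "samples": ["observations", "points"],
}


def count_synonyms(text, synonyms=SYNONYMS):
    """Count whole-word, case-insensitive occurrences of each term.

    Only the exact (plural) spellings listed in `synonyms` are counted,
    so a singular form such as "predictor" is not matched.
    """
    words = Counter(re.findall(r"[a-z_]+", text.lower()))
    return {
        preferred: {term: words[term] for term in [preferred, *alternatives]}
        for preferred, alternatives in synonyms.items()
    }


# Sample text built from fragments quoted in Example 2.
sample = (
    "datasets with many collinear regressors. "
    "linear combination of the input variables. "
    "selected features are the same."
)

counts = count_synonyms(sample)
for preferred, tally in counts.items():
    print(preferred, tally)
```

Running the same function on the full text of a documentation page would show which words dominate in practice, which might help with picking the preferred term for each group.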

I apologize if this has been discussed in another Issue/PR, but I didn't find any related discussions.
