[BUG] Can't combine `make_reduction` with `HistGradientBoostingRegressor.categorical_features` #4776

**Describe the bug**

scikit-learn's `HistGradientBoostingRegressor` accepts a `categorical_features` argument that tells it to treat certain columns (here, columns of my exogenous data) as categorical. However, `make_reduction` transforms the columns before they reach the regressor, so every way of specifying categorical features fails. Specifically:
* Providing a list of column names: fails because a NumPy array, not the original DataFrame, is passed to `.fit()`
* Providing a boolean mask: fails because the mask must match the number of columns, which changes once `make_reduction` has transformed the data
* Providing the column indices: fails (sometimes silently) because the order of the columns changes



**To Reproduce**


```python
from sktime.datasets import load_longley
from sktime.forecasting.compose import make_reduction
from sktime.forecasting.model_selection import temporal_train_test_split
from sklearn.ensemble import HistGradientBoostingRegressor

y, X = load_longley()
X["Year"] = X.index.year - 1900 # bad example, just to get a value that can be considered categorical
y_trn, y_tst, X_trn, X_tst = temporal_train_test_split(y, X, test_size=5)

forecaster = make_reduction(
    HistGradientBoostingRegressor(
        # categorical_features=["Year"],  # Strings
        # categorical_features=X.columns.isin(["Year"]),  # Boolean mask
        categorical_features=X.columns.get_indexer(["Year"]),  # Column positions
    ),
    window_length=2,
)

forecaster.fit(
    fh=[1, 2, 3],
    y=y_trn,
    X=X_trn,
)
```

**Expected behavior**

In a dream world, the column names I define in my DataFrame would still be there when `X` is eventually passed to `HistGradientBoostingRegressor.fit`.

Two solutions I can think of (with no regard for complexity or feasibility):
1. sktime only adds columns to the _end_ of the DataFrame that the user provides as `X` (including the columns derived from `y`), so that we can at least rely on the column indices staying the same.
2. sktime keeps the DataFrame and its column names, only adding new columns.

Option 2 would be ideal, as it solves other problems too. For example, XGBoost has the (experimental) ability to handle pandas categorical columns (`enable_categorical`), but as far as I can tell this wouldn't work with sktime because that information is lost in the conversion to NumPy.

**Additional context**


**Versions**
<details>

System:
    python: 3.10.8 (main, Oct 12 2022, 19:14:26) [GCC 9.4.0]
executable: /home/davidg/.virtualenvs/learning/bin/python
   machine: Linux-5.15.90.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Python dependencies:
          pip: 23.1.2
       sktime: 0.19.1
      sklearn: 1.2.2
        numpy: 1.24.3
        scipy: 1.10.1
       pandas: 2.0.1
   matplotlib: 3.7.0
       joblib: 1.2.0
  statsmodels: 0.13.5
        numba: None
     pmdarima: 2.0.3
      tsfresh: None
   tensorflow: None
tensorflow_probability: None

</details>
