GaussianProcessRegressor unnecessarily forcing training samples to be array_like #13936

@yhtang

Description

The current interface precludes applying the GPR module to cases where the training samples are sequences, trees, graphs, etc. In fact, GPR should not care whether its inputs are fixed-length feature vectors or not; that should be handled by the kernel.

Expected usage scenario:

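from sklearn.gaussian_process import GaussianProcessRegressor

k = StructuredDataKernel()           # user-defined kernel over structured objects
gp = GaussianProcessRegressor(kernel=k)
X = [object1, object2, ...]          # training samples (not fixed-length vectors)
Y = [object3, object4, ...]          # samples to predict on
y = [scalar_values...]               # one scalar target per training sample
gp.fit(X, y)
gp.predict(Y)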

Error information (sklearn/utils/validation.py, line 448, in check_array):

ValueError: setting an array element with a sequence.

Expected outcome:
The learning workflow should proceed without error as long as k returns valid inner product matrices between X and itself, and between X and Y.
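To illustrate the point that the regression itself only needs kernel matrices, here is a minimal sketch in plain Python, independent of scikit-learn. It is not the library's implementation. The names `spectrum_kernel`, `gram`, `cholesky`, `cho_solve`, and `gp_fit_predict` are hypothetical, and the n-gram kernel merely stands in for whatever `StructuredDataKernel` would compute. The samples are strings of different lengths, and only the posterior mean is computed.

```python
from collections import Counter
import math


def spectrum_kernel(a, b, n=2):
    """Inner product of the n-gram count vectors of two sequences.

    Being an explicit inner product, it is positive semidefinite.
    """
    ca = Counter(a[i:i + n] for i in range(len(a) - n + 1))
    cb = Counter(b[i:i + n] for i in range(len(b) - n + 1))
    return float(sum(ca[g] * cb[g] for g in ca))


def gram(A, B, kernel):
    """Kernel matrix with entry [i][j] = kernel(A[i], B[j])."""
    return [[kernel(a, b) for b in B] for a in A]


def cholesky(M):
    """Lower-triangular L with L L^T = M, for symmetric positive definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def cho_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_fit_predict(X, y, Y, kernel, alpha=1e-6):
    """GP posterior mean at Y given training pairs (X, y).

    X and Y are only ever passed to `kernel`, never converted to arrays.
    `alpha` is a small diagonal jitter (assumed noise level).
    """
    K = gram(X, X, kernel)
    for i in range(len(X)):
        K[i][i] += alpha
    weights = cho_solve(cholesky(K), y)      # (K + alpha I)^{-1} y
    K_star = gram(Y, X, kernel)              # cross-kernel between Y and X
    return [sum(k * w for k, w in zip(row, weights)) for row in K_star]


# Variable-length sequences as samples
X = ["abab", "abc", "ccccc"]
y = [1.0, 2.0, 3.0]
Y = ["abab", "cc"]
mean = gp_fit_predict(X, y, Y, spectrum_kernel)
```

The samples are touched only inside `kernel`, so nothing in the fit or predict step requires them to be fixed-length vectors.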
