Row Order model definitions by dhiltgen · Pull Request #9115 · ollama/ollama

dhiltgen · 2025-02-14T17:22:08Z

Replaces #8731 on main.

This change switches the model API (and backend) to be row-order to make it easier to port model definitions from other frameworks that use row-order patterns. I've made the following changes to the Backend API interface definitions:

All APIs assume row-order instead of column-order
View no longer interleaves shape and stride; they are passed as two separate int arrays
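As a rough illustration of what "row-order" means for shape and stride when they are passed as two separate int arrays, here is a minimal Go sketch. The helper names `rowMajorStrides` and `offset` are hypothetical and are not part of the Backend API in this PR; strides are counted in elements, which is an assumption made for simplicity.

```go
package main

import "fmt"

// rowMajorStrides returns element strides for a row-order (C-style) layout,
// where the last dimension is the contiguous one.
// Hypothetical helper for illustration; not part of the PR.
func rowMajorStrides(shape []int) []int {
	strides := make([]int, len(shape))
	stride := 1
	for i := len(shape) - 1; i >= 0; i-- {
		strides[i] = stride
		stride *= shape[i]
	}
	return strides
}

// offset computes the flat element offset of a multi-dimensional index.
func offset(index, strides []int) int {
	off := 0
	for i, v := range index {
		off += v * strides[i]
	}
	return off
}

func main() {
	shape := []int{2, 3, 4}
	strides := rowMajorStrides(shape)

	fmt.Println(strides)                         // [12 4 1]
	fmt.Println(offset([]int{1, 2, 3}, strides)) // 23
}
```

Keeping shape and stride in two parallel slices means each can be validated, reversed, or passed on independently, which is harder when the two are interleaved in one argument list.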

This requires a number of notable changes in the GGML backend:

Dimensions and shapes exposed in the API are reversed in the underlying GGML tensor so that operations work properly
The number of dimensions is tracked in the wrapper tensor type. GGML treats trailing dimensions of size 1 as no-ops, so when the leading dimension has size 1 (and is therefore reversed to become the trailing dimension), this tracked count is used instead of the number of dimensions GGML reports.
Permute revamped to be consistent with other row-order APIs (PyTorch, etc.). GGML treats the permutation argument as the "destination": where each dimension moves to. Other APIs (and ours with this change) treat it as the "source": where to get the data from.
Reshape updated to support -1 as a dimension, consistent with other APIs; the value is calculated and filled in automatically.
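The index arithmetic behind these points can be sketched in Go as follows. This is only an illustration under stated assumptions, not the code in this PR: the helpers `reverse`, `sourceToDestination`, and `inferShape` are hypothetical names, and the sketch rejects zero-sized dimensions for simplicity.

```go
package main

import (
	"errors"
	"fmt"
)

// reverse returns a reversed copy of dims, mapping a row-order shape to the
// dimension order a column-order backend would store.
func reverse(dims []int) []int {
	out := make([]int, len(dims))
	for i, d := range dims {
		out[len(dims)-1-i] = d
	}
	return out
}

// sourceToDestination inverts a permutation. The input uses "source"
// semantics (output dim i is taken from input dim src[i]); the result uses
// "destination" semantics (input dim j moves to output position dst[j]).
func sourceToDestination(src []int) []int {
	dst := make([]int, len(src))
	for i, s := range src {
		dst[s] = i
	}
	return dst
}

// inferShape resolves at most one -1 entry so that the product of the
// dimensions equals numElements.
func inferShape(numElements int, shape []int) ([]int, error) {
	out := append([]int(nil), shape...)
	known, unknown := 1, -1
	for i, d := range out {
		switch {
		case d == -1:
			if unknown >= 0 {
				return nil, errors.New("only one dimension may be -1")
			}
			unknown = i
		case d < 1:
			return nil, fmt.Errorf("invalid dimension %d", d)
		default:
			known *= d
		}
	}
	if unknown >= 0 {
		if numElements%known != 0 {
			return nil, fmt.Errorf("cannot reshape %d elements into %v", numElements, shape)
		}
		out[unknown] = numElements / known
	} else if known != numElements {
		return nil, fmt.Errorf("shape %v does not hold %d elements", shape, numElements)
	}
	return out, nil
}

func main() {
	// A leading 1 becomes a trailing 1 after reversal, which is why the
	// dimension count has to be remembered separately.
	fmt.Println(reverse([]int{1, 8, 128})) // [128 8 1]

	fmt.Println(sourceToDestination([]int{2, 0, 1})) // [1 2 0]

	shape, err := inferShape(24, []int{2, -1, 4})
	fmt.Println(shape, err) // [2 3 4] <nil>
}
```

A real backend would combine the axis reversal with the permutation inversion; the sketch keeps them separate so each step can be checked on its own.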

Other potential refinements that aren't currently included but which may make sense:

Soften the "must have 4 dimensions" requirement on routine parameters to be more consistent with other APIs, requiring only as many dimensions as the tensor actually has
Switch to Matmul pattern

The cache also required some adjustments based on these changes.

jessegross · 2025-03-22T00:16:50Z

kvcache/causal.go

 	}

-	maskTensor, err := ctx.Input().FromFloatSlice(mask, length, batchSize)
+	maskTensor, err := ctx.Input().FromFloatSlice(mask, batchSize, length)


I think there is an issue because we are swapping the order of dimensions here but the actual mask data is still laid out in the original order:
https://github.com/dhiltgen/ollama/blob/1296b3999ec5d4c15f32f5ac8311da94cdb808c4/kvcache/causal.go#L243

This works for GGML because the mask is in its native format and we just swap the arguments of FromFloatSlice back. However, it's probably at least part of the cache drift issue in MLX since the mask is not actually row order.

Most of the other inputs (which are the ones that aren't in the backend's native format) are only a single dimension, so the order doesn't make a difference. However, the mask is 2D.

I think we should change the mask generation to be row-order native and, in GGML, do a permute in FromFloatSlice and FromIntSlice for multidimensional tensors. For the mask specifically, we may not actually need to make the result contiguous, which would keep it fast, though that is probably not true for all inputs in general.

dhiltgen · 2025-03-27T17:49:37Z

Moving back to draft status.

Matmul has replaced Mulmat and now conforms to the behavior of PyTorch matmul. However, the current implementation has a significant performance hit. Once I can get it back to comparable performance, I'll take it back out of draft.

reneleonhardt · 2025-04-24T19:48:18Z

@dhiltgen Could other Ollama engineers help? 🙂

TomLucidor · 2025-12-17T02:35:51Z

Considering that this is blocking the MLX code, please move this forward soon.

dhiltgen · 2026-02-24T04:35:36Z

Obsoleted by the new MLX based engine.

TomLucidor · 2026-02-24T04:57:57Z

@dhiltgen where is this MLX-based engine?

This was referenced Feb 14, 2025

Draft: Row Order model definitions #8731

Closed

Draft MLX go backend for new engine #9118

Closed

dhiltgen force-pushed the row_order branch 10 times, most recently from 46effa1 to bfed2c4, March 3, 2025 22:50

dhiltgen force-pushed the row_order branch 8 times, most recently from 89f3ea2 to 02e2ab6, March 12, 2025 22:42

dhiltgen marked this pull request as ready for review March 12, 2025 22:50

dhiltgen changed the title from "Draft: Row Order model definitions" to "Row Order model definitions" Mar 12, 2025

dhiltgen force-pushed the row_order branch 2 times, most recently from 7b3c313 to 1296b39, March 20, 2025 15:26

jessegross reviewed Mar 22, 2025

dhiltgen force-pushed the row_order branch 5 times, most recently from 7fe73e1 to 909b23a, March 27, 2025 17:47

dhiltgen marked this pull request as draft March 27, 2025 17:48

dhiltgen force-pushed the row_order branch 2 times, most recently from af75cd7 to e809a68, April 1, 2025 00:08

Row Order model definitions

72abcaa

This change switches the model API (and backend) to be row-order to make it easier to port model definitions from other frameworks that use row-order patterns.

dhiltgen force-pushed the row_order branch from e809a68 to 72abcaa, April 3, 2025 23:34

dhiltgen closed this Feb 24, 2026

dhiltgen wants to merge 1 commit into ollama:main from dhiltgen:row_order
