synchronize MLXNN code with python implementation by davidkoski · Pull Request #340 · ml-explore/mlx-swift

davidkoski · 2026-01-16T21:17:22Z

Proposed changes

Looks like some changes made on the Python side after the initial port were missed on the Swift side.

Checklist

Put an x in the boxes that apply.

I have read the CONTRIBUTING document
I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
I have added tests that prove my fix is effective or that my feature works
I have updated the necessary documentation (if needed)

davidkoski · 2026-01-16T21:17:52Z

Source/MLX/Transforms+Compile.swift

    _ f: @Sendable @escaping (MLXArray, MLXArray, MLXArray) -> MLXArray
 )
-    -> (MLXArray, MLXArray, MLXArray) -> MLXArray
+    -> @Sendable (MLXArray, MLXArray, MLXArray) -> MLXArray


Noticed the `@Sendable` on the return type was missing while working on this (this matches the other compile() implementations).

davidkoski · 2026-01-16T21:18:05Z

Source/MLXNN/Activations.swift

+/// - Parameters:
+///   - x: input array
+///   - lambda: lambda value
+public func softshrink(_ x: MLXArray, lambda: Float = 0.5) -> MLXArray {


Several missing activations
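For reference, soft shrinkage is conventionally defined as `x - lambda` when `x > lambda`, `x + lambda` when `x < -lambda`, and `0` otherwise. The scalar sketch below only illustrates that definition; `softshrinkReference` is a hypothetical helper on plain `Float` values and is not the MLXNN code from this diff.

```swift
// Hypothetical scalar reference for soft shrinkage (not the MLXNN implementation).
func softshrinkReference(_ x: Float, lambda: Float = 0.5) -> Float {
    if x > lambda { return x - lambda }   // shrink positive values toward zero
    if x < -lambda { return x + lambda }  // shrink negative values toward zero
    return 0                              // values inside [-lambda, lambda] collapse to zero
}

let softshrinkInputs: [Float] = [-2.0, -0.25, 0.0, 0.25, 2.0]
let softshrinkOutputs = softshrinkInputs.map { softshrinkReference($0) }
print(softshrinkOutputs)
```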

davidkoski · 2026-01-16T21:18:20Z

Source/MLXNN/Activations.swift

+/// ### See Also
+/// - <doc:activations>
+/// - ``softmin(_:axis:)``
+open class Softmin: Module, UnaryLayer {


And their layers
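As a rough illustration of what a `Softmin` layer computes, softmin is usually defined as softmax applied to the negated input. The sketch below works on a plain `[Float]` along a single axis; `softminReference` is a hypothetical name, and the real layer operates on `MLXArray` with an `axis` argument.

```swift
import Foundation

// Hypothetical 1-D reference: softmin(x) = softmax(-x).
func softminReference(_ x: [Float]) -> [Float] {
    guard let minValue = x.min() else { return [] }
    // Subtracting the minimum keeps the exponent arguments <= 0 for numerical stability.
    let exps = x.map { exp(-($0 - minValue)) }
    let total = exps.reduce(0, +)
    return exps.map { $0 / total }
}

let softminOutputs = softminReference([1.0, 2.0, 3.0])
print(softminOutputs)  // the smallest input receives the largest weight
```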

davidkoski · 2026-01-16T21:19:42Z

Source/MLXNN/Convolution.swift

-            [outputChannels, kernelSize.first, kernelSize.second, kernelSize.third, inputChannels])
+            [
+                outputChannels, kernelSize.first, kernelSize.second, kernelSize.third,
+                inputChannels / groups,


inputChannels should be divided by groups. Curiously, the Swift version has groups for all three convolutions, but Python only has it for 1d and 2d 🤷

We should fix that in Python.
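To make the shape change concrete, here is a small sketch of the weight shape for a grouped 3D convolution, following the layout shown in the diff. The function name `conv3dWeightShape` and the divisibility check are assumptions for illustration, not part of the PR.

```swift
// Hypothetical helper: weight shape for a grouped 3D convolution,
// laid out as [out, kD, kH, kW, in / groups] as in the diff above.
func conv3dWeightShape(
    inputChannels: Int, outputChannels: Int,
    kernelSize: (Int, Int, Int), groups: Int = 1
) -> [Int] {
    // Each group only sees inputChannels / groups input channels.
    precondition(groups > 0 && inputChannels % groups == 0,
                 "inputChannels must be divisible by groups")
    return [outputChannels, kernelSize.0, kernelSize.1, kernelSize.2, inputChannels / groups]
}

let groupedShape = conv3dWeightShape(
    inputChannels: 8, outputChannels: 16, kernelSize: (3, 3, 3), groups: 4)
print(groupedShape)
```

With `groups: 1` the last dimension is simply `inputChannels`, which is what the old code always produced.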

davidkoski · 2026-01-16T21:20:07Z

Source/MLXNN/ConvolutionTransposed.swift

        kernelSize: Int,
        stride: Int = 1,
        padding: Int = 0,
+        outputPadding: Int = 0,


New parameter on the transposed convolutions
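A sketch of why `outputPadding` matters, assuming the conventional transposed-convolution output-length formula with dilation 1 (the formula and the helper name `transposedOutputLength` are assumptions, not taken from this PR): a strided forward convolution maps several input lengths to the same output length, and `outputPadding` selects which one the transposed convolution reproduces.

```swift
// Hypothetical helper using the conventional formula (dilation = 1):
// out = (n - 1) * stride - 2 * padding + (kernelSize - 1) + outputPadding + 1
func transposedOutputLength(
    _ n: Int, kernelSize: Int, stride: Int = 1, padding: Int = 0, outputPadding: Int = 0
) -> Int {
    (n - 1) * stride - 2 * padding + (kernelSize - 1) + outputPadding + 1
}

// A stride-2, kernel-3, padding-1 convolution maps lengths 7 and 8 both to 4.
let withoutOutputPadding = transposedOutputLength(4, kernelSize: 3, stride: 2, padding: 1)
let withOutputPadding = transposedOutputLength(
    4, kernelSize: 3, stride: 2, padding: 1, outputPadding: 1)
print(withoutOutputPadding, withOutputPadding)
```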

davidkoski · 2026-01-16T21:25:03Z

Source/MLXNN/Embedding.swift

    public init(embeddingCount: Int, dimensions: Int) {
        let scale = sqrt(1 / Float(dimensions))
-        self.weight = MLXRandom.normal([embeddingCount, dimensions]) * scale
+        self.weight = MLXRandom.normal([embeddingCount, dimensions], scale: scale)


This should give the same result, but it matches the Python implementation.

davidkoski · 2026-01-16T21:25:26Z

Source/MLXNN/Normalization.swift

+    ///   - affine: if `true` adds a trainable `weight`
+    ///   - bias: if `true` adds a trainable `bias`
+    public init(
+        dimensions: Int, eps: Float = 1e-5, affine: Bool = true, bias: Bool = true


On the Python side, a bias flag was split out from affine.
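The effect of separating the two flags can be sketched with a plain-Swift normalization over one vector, assuming the initializer belongs to a layer-normalization-style layer. `layerNormReference` is a hypothetical helper; passing `nil` for `weight` or `bias` stands in for the corresponding flag being `false`.

```swift
// Hypothetical 1-D reference: normalize, then optionally scale and shift.
func layerNormReference(
    _ x: [Float], eps: Float = 1e-5, weight: [Float]? = nil, bias: [Float]? = nil
) -> [Float] {
    guard !x.isEmpty else { return [] }
    let n = Float(x.count)
    let mean = x.reduce(0, +) / n
    let variance = x.map { ($0 - mean) * ($0 - mean) }.reduce(0, +) / n
    let inverseStd = 1 / (variance + eps).squareRoot()
    var y = x.map { ($0 - mean) * inverseStd }
    if let w = weight { y = zip(y, w).map { $0 * $1 } }  // trainable scale, if present
    if let b = bias { y = zip(y, b).map { $0 + $1 } }    // trainable shift, if present
    return y
}

let ones: [Float] = [1, 1, 1]
let scaledOnly = layerNormReference([1, 2, 3], weight: ones)
let scaledAndShifted = layerNormReference([1, 2, 3], weight: ones, bias: ones)
print(scaledOnly, scaledAndShifted)
```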

davidkoski · 2026-01-16T21:25:48Z

Source/MLXNN/Pooling.swift

    public let kernelSize: [Int]
    public let stride: [Int]
+    public let padding: [Int]
+    public let paddingValue: Float


Add missing padding/paddingValue for base implementation
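A minimal 1-D sketch of how `padding` and `paddingValue` interact in a pooling window, written on plain arrays. The helper `maxPool1dReference` is hypothetical, and the choice of negative infinity as the fill for max pooling is an assumption made so that padded positions never win the maximum.

```swift
// Hypothetical 1-D max pooling with explicit padding (not the MLXNN Pool class).
func maxPool1dReference(
    _ x: [Float], kernelSize: Int, stride step: Int,
    padding: Int = 0, paddingValue: Float = -Float.infinity
) -> [Float] {
    precondition(kernelSize > 0 && step > 0 && padding >= 0)
    let pad = [Float](repeating: paddingValue, count: padding)
    let padded = pad + x + pad  // pad both ends with paddingValue
    guard padded.count >= kernelSize else { return [] }
    return stride(from: 0, through: padded.count - kernelSize, by: step).map { start in
        padded[start ..< start + kernelSize].max()!  // window is never empty
    }
}

let unpadded = maxPool1dReference([1, 3, 2, 4], kernelSize: 2, stride: 2)
let padded = maxPool1dReference([1, 3, 2, 4], kernelSize: 2, stride: 2, padding: 1)
print(unpadded, padded)
```

An average pool would typically use `0` as the fill value instead, which is one reason the value is worth storing alongside the padding amount.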

davidkoski · 2026-01-16T21:26:07Z

Source/MLXNN/Pooling.swift

+}
+
+/// Applies 3-dimensional max pooling.
+open class MaxPool3d: Pool {


Add missing pool 3d layers

davidkoski · 2026-01-16T21:26:34Z

Source/MLXNN/Upsample.swift

-private func nearestIndices(dimension: Int, scale: Float, dim: Int, ndim: Int) -> MLXArray {
-    scaledIndices(dimension: dimension, scale: scale, alignCorners: true, dim: dim, ndim: ndim)
-        .asType(.int32)
+private func nearestIndices(dimension N: Int, scale: Float, dim: Int, ndim: Int) -> MLXArray {


Match the python implementation, specifically the 0.5 offset below
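The 0.5 offset can be sketched on plain integers. This assumes the nearest-neighbor mapping is `floor((i + 0.5) * N / M)` with `M = Int(scale * N)` output positions, i.e. each output index samples the input at its pixel center; `nearestIndicesReference` is a hypothetical helper and omits the `dim`/`ndim` reshaping from the real function.

```swift
// Hypothetical reference for nearest-neighbor source indices with a half-pixel offset.
func nearestIndicesReference(dimension n: Int, scale: Float) -> [Int] {
    let m = Int(scale * Float(n))  // number of output positions (truncated)
    guard m > 0 else { return [] }
    let ratio = Float(n) / Float(m)
    // Sample at the center of each output position, then truncate to an input index.
    return (0 ..< m).map { Int((Float($0) + 0.5) * ratio) }
}

let doubled = nearestIndicesReference(dimension: 2, scale: 2)
let fractional = nearestIndicesReference(dimension: 3, scale: 1.5)
print(doubled, fractional)
```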

awni

Awesome. That's a good diff! Thanks David and Claude ;)

davidkoski added 8 commits January 15, 2026 15:55

add missing activation functions and layers

eb76f9b

inputChannels should be divided by groups

e37890f

use the Random API

e6a1e65

add padding, missing pool3d layers

0435762

separate bias parameter

616f706

swift-format

201c6c5

add missing outputPadding

6a9e39d

make match current python version

0759d82

davidkoski requested a review from awni January 16, 2026 21:17

formatting and more groups work

d409e7a

awni approved these changes Jan 22, 2026

davidkoski merged commit 0a6df65 into main Jan 22, 2026
7 checks passed

davidkoski deleted the gaps branch January 22, 2026 19:22

davidkoski merged 9 commits into main from gaps
