synchronize MLXNN code with python implementation #340
| _ f: @Sendable @escaping (MLXArray, MLXArray, MLXArray) -> MLXArray | ||
| ) | ||
| -> (MLXArray, MLXArray, MLXArray) -> MLXArray | ||
| -> @Sendable (MLXArray, MLXArray, MLXArray) -> MLXArray |
Noticed `@Sendable` was missing on the return type while working on this (this matches the other compile() implementations)
| /// - Parameters: | ||
| /// - x: input array | ||
| /// - lambda: lambda value | ||
| public func softshrink(_ x: MLXArray, lambda: Float = 0.5) -> MLXArray { |
Several missing activations
| /// ### See Also | ||
| /// - <doc:activations> | ||
| /// - ``softmin(_:axis:)`` | ||
| open class Softmin: Module, UnaryLayer { |
And their layers
| [outputChannels, kernelSize.first, kernelSize.second, kernelSize.third, inputChannels]) | ||
| [ | ||
| outputChannels, kernelSize.first, kernelSize.second, kernelSize.third, | ||
| inputChannels / groups, |
inputChannels should be divided by groups. Curiously, the Swift version has groups for all three convolutions, but Python only has it for 1d and 2d 🤷
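A minimal sketch of the shape arithmetic behind this change, in plain Swift with no MLX dependency. The helper name `conv3dWeightShape` is hypothetical and only mirrors the array literal in the diff above; it assumes `inputChannels` is a multiple of `groups`.

```swift
/// Hypothetical helper (not part of MLXNN): computes the weight shape of a
/// grouped 3D convolution, mirroring the array literal in the diff.
func conv3dWeightShape(
    inputChannels: Int, outputChannels: Int,
    kernelSize: (Int, Int, Int), groups: Int = 1
) -> [Int] {
    // Assumption: the input channels split evenly across the groups.
    precondition(inputChannels % groups == 0, "inputChannels must be divisible by groups")
    return [
        outputChannels, kernelSize.0, kernelSize.1, kernelSize.2,
        inputChannels / groups,  // each group only sees its own slice of channels
    ]
}

let shape = conv3dWeightShape(
    inputChannels: 8, outputChannels: 16, kernelSize: (3, 3, 3), groups: 4)
print(shape)  // [16, 3, 3, 3, 2]
```

With `groups: 1` the last dimension is simply `inputChannels`, so the old and new shapes agree in the ungrouped case.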
| kernelSize: Int, | ||
| stride: Int = 1, | ||
| padding: Int = 0, | ||
| outputPadding: Int = 0, |
New parameter on the transposed convolutions
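To illustrate what `outputPadding` affects, here is a small sketch of the usual transposed-convolution output-length formula in plain Swift. The function name `convTransposedOutputLength` and the `dilation` parameter are assumptions made for illustration; this is not the MLXNN implementation, only the size arithmetic it is commonly described with.

```swift
/// Hypothetical helper: output length of a 1D transposed convolution using the
/// commonly cited formula. `outputPadding` adds extra size to one side.
func convTransposedOutputLength(
    input: Int, kernelSize: Int, stride: Int = 1, padding: Int = 0,
    dilation: Int = 1, outputPadding: Int = 0
) -> Int {
    (input - 1) * stride - 2 * padding + dilation * (kernelSize - 1) + outputPadding + 1
}

// With stride 2 the output length is ambiguous; outputPadding picks the larger one.
let withoutPadding = convTransposedOutputLength(
    input: 4, kernelSize: 3, stride: 2, padding: 1)
let withPadding = convTransposedOutputLength(
    input: 4, kernelSize: 3, stride: 2, padding: 1, outputPadding: 1)
print(withoutPadding, withPadding)  // 7 8
```

Under these assumptions, `outputPadding: 1` is what makes a stride-2 layer produce exactly twice the input length.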
| public init(embeddingCount: Int, dimensions: Int) { | ||
| let scale = sqrt(1 / Float(dimensions)) | ||
| self.weight = MLXRandom.normal([embeddingCount, dimensions]) * scale | ||
| self.weight = MLXRandom.normal([embeddingCount, dimensions], scale: scale) |
Should give the same result, but this matches the Python implementation
| /// - affine: if `true` adds a trainable `weight` | ||
| /// - bias: if `true` adds a trainable `bias` | ||
| public init( | ||
| dimensions: Int, eps: Float = 1e-5, affine: Bool = true, bias: Bool = true |
On the python side a bias flag was split out from affine.
| public let kernelSize: [Int] | ||
| public let stride: [Int] | ||
| public let padding: [Int] | ||
| public let paddingValue: Float |
Add missing padding/paddingValue for base implementation
| } | ||
| /// Applies 3-dimensional max pooling. | ||
| open class MaxPool3d: Pool { |
Add missing pool 3d layers
| private func nearestIndices(dimension: Int, scale: Float, dim: Int, ndim: Int) -> MLXArray { | ||
| scaledIndices(dimension: dimension, scale: scale, alignCorners: true, dim: dim, ndim: ndim) | ||
| .asType(.int32) | ||
| private func nearestIndices(dimension N: Int, scale: Float, dim: Int, ndim: Int) -> MLXArray { |
Match the python implementation, specifically the 0.5 offset below
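For readers unfamiliar with why a 0.5 offset matters in nearest-neighbour upsampling, here is a hedged sketch in plain Swift. It shows one common convention (sampling at output pixel centres) and is not necessarily the exact formula used in this PR or in the Python code; the function name `nearestSourceIndices` and its parameters are hypothetical.

```swift
/// Hypothetical illustration: map each of `outputCount` output positions to a
/// source index in `0..<inputCount`, with or without a half-pixel offset.
func nearestSourceIndices(inputCount n: Int, outputCount m: Int, halfPixelOffset: Bool) -> [Int] {
    (0 ..< m).map { i in
        if halfPixelOffset {
            // floor((i + 0.5) * n / m), done in integers to avoid rounding error
            return ((2 * i + 1) * n) / (2 * m)
        } else {
            // floor(i * n / m): plain truncation of the scaled index
            return (i * n) / m
        }
    }
}

print(nearestSourceIndices(inputCount: 4, outputCount: 6, halfPixelOffset: false))
// [0, 0, 1, 2, 2, 3]
print(nearestSourceIndices(inputCount: 4, outputCount: 6, halfPixelOffset: true))
// [0, 1, 1, 2, 3, 3]
```

For integer scale factors the two variants agree; the difference only shows up for fractional scales such as the 1.5x case above.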
awni left a comment
Awesome. That's a good diff! Thanks David and Claude ;)
Proposed changes
Looks like some changes were made on the Python side after the initial port and were missed on the Swift side.
Checklist
Put an `x` in the boxes that apply.
- `pre-commit run --all-files` to format my code / installed pre-commit prior to committing changes