@MahdiBM MahdiBM commented Jan 4, 2026

Short title (summary):

Enable seamless Swift interoperability with this package.

Description

Let me preface this by saying that I'm not a C++ developer, and I'll be happy to adjust the code based on your feedback, even nitpicks.

The idea of making simdutf swift-compatible came from this discussion in apple/swift-nio.

This PR is to open up the discussion with a POC, and see if these changes are acceptable or not.
If you prefer to have the discussion elsewhere, we can discuss this in that place, but I wanted to have something to show, and not just discuss theories.

What I've done

  • Add Package.swift so SwiftPM can recognize and compile this library.
  • Add a modulemap, again for SwiftPM. To be verified if this is actually required or we can get rid of it.
  • As an example, modify validate_utf16_as_ascii so Swift can synthesize it via its Span types.
    • That means using these 2 attributes: __counted_by(len), __noescape
    • __counted_by(len) declares that the number of elements the pointer refers to is given by the parameter len.
    • __noescape declares that the function does not keep the pointer after the call returns, so Swift can safely manage the lifetime of the underlying memory.
    • Both are required for Swift to synthesize code using Swift Spans.
    • Without these 2 attributes, Swift will only synthesize code via UnsafePointer family of types and len, instead of a Span.
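
As an illustration of the two attributes described above, here is a minimal C++ sketch. The EXAMPLE_* macro names and the example_validate_utf16_as_ascii function are hypothetical and not part of simdutf. The noescape spelling follows the Clang attribute; the counted-by macro is deliberately left as a no-op placeholder, on the assumption that its real definition would come from the Swift/Clang toolchain headers.

```cpp
#include <cstddef>

// Hypothetical macro: expands to Clang's noescape attribute when the
// compiler reports support for it, and to nothing otherwise (GCC, MSVC).
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::noescape)
#    define EXAMPLE_NOESCAPE [[clang::noescape]]
#  endif
#endif
#ifndef EXAMPLE_NOESCAPE
#  define EXAMPLE_NOESCAPE
#endif

// Hypothetical placeholder: in this sketch it only documents intent.
// A Swift-aware build would be expected to map it to __counted_by(n).
#define EXAMPLE_COUNTED_BY(n)

// buf points at len UTF-16 code units and is not retained after the call.
constexpr bool example_validate_utf16_as_ascii(
    const char16_t *EXAMPLE_COUNTED_BY(len) buf EXAMPLE_NOESCAPE,
    std::size_t len) noexcept {
  for (std::size_t i = 0; i < len; ++i) {
    if (buf[i] > 0x7F) {
      return false; // code unit outside the ASCII range
    }
  }
  return true;
}
```

With both macros expanding to nothing, the declaration is an ordinary pointer-plus-length function, which is why a fallback like this should keep other compilers unaffected.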

What remains to be done in my view:

  • Perhaps guard __counted_by and __noescape behind custom macros that expand to them only where they are available?
    • Again, I'm not sure if this is actually needed. Not a C/C++ dev.
  • Add Swift tests to ensure Span compatibility. I can set up the CI as well.
    • We can only have Linux CI, or we can have both Linux and macOS. Depending on what you'd like. Optimally both.
    • I haven't tested this on Windows, but I can if required. It should work fine in theory. In any case it shouldn't be a blocker.
  • Ask around to see why Swift wasn't synthesizing Swift Span types from std::span when I was trying it earlier.
    • We'd still need to add __noescape attributes to std::span arguments, so it'll still require some changes.
  • I need to double-check if it's required to require Swift 6.2.
    • Perhaps we can require Swift 6 only, and on Swift 6.2 and above Swift can synthesize the Span versions of the functions as well.
    • Span and its use in C++ interop are new in Swift 6.2, which is currently the latest minor version.
  • Docs update.

Other notes

When we're close to being done with this PR, I can ask some Apple folks to review it as well, considering that C++ interoperability (unlike C interoperability) is new in Swift.
They might not be able to directly comment in this PR/repository, but I can open up a discussion in Swift forums.

Swift's ecosystem also tries to follow Sem Ver 2.0, which I think is the same thing that this library aims for. So we should be good on that front as well.

Type of change

  • Bug fix
  • Optimization
  • New feature
  • Refactor / cleanup
  • Documentation / tests
  • Other (please describe):

How to verify / test

Adding tests remains to be done.

Watch this WWDC25 video for some info about C++ interoperability in Swift, if you're curious:
https://developer.apple.com/videos/play/wwdc2025/311/
You can also just read the transcript and the 2 articles linked in "Resources".

I'm currently testing these in a package of mine on macOS.
swift-idna - branch mmbm-simdutf
See the branch's small diff compared to main.
Build via swift build.
If you don't have Swift installed, install Swift 6.2.3 via swiftly, or another way of your choice.
If you'd like, open swift-idna in VSCode (or any of the forks) and install the official Swift extension.

I've yet to get things building on Linux; it looks like there are some C++ interop issues there. I personally need to be able to use simdutf on Linux as well, so I'll get that working one way or another.

Please read before contributing:

If you can, we recommend running our tests with the sanitizers turned on.
For non-Visual Studio users, it is as easy as doing:

cmake -B build -D SIMDUTF_SANITIZE=ON
cmake --build build
ctest --test-dir build

Our CI checks, among other things, for trailing whitespace.

Checklist before submitting

  • I added/updated tests covering my change (if applicable)
  • Code builds locally and passes my check
  • Documentation / README updated if needed
  • Commits are atomic and messages are clear
  • I linked the related issue (if applicable)

Final notes

  • For large PRs, prefer smaller incremental PRs or request staged review.

Thanks for the contribution!

Collaborator

@pauldreik pauldreik left a comment

I need to learn what changes are needed to make this work.

adding the Package.swift and module.modulemap does not seem problematic.

Would it be implementable by adding a SIMDUTF_COUNTED_BY() macro that expands to the needed attributes?

Even if that is the case, making the header files even more unreadable is unattractive to me.

Can this be solved in an easier way by instead providing a separate auto generated header file, where you can use all the attributes and swift specific code you wish?
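
One possible reading of the separate-header idea, sketched in C++: the core declaration stays untouched, and a second (possibly generated) header holds thin forwarding wrappers that carry the annotations. All names here (example_core, example_swift_shim, the EXAMPLE_SHIM_* macros) are hypothetical. Whether Swift's importer would synthesize Span overloads for such forwarding wrappers is an assumption that would need to be verified.

```cpp
#include <cstddef>

// ---- Core header: unchanged, free of Swift-specific annotations. ----
namespace example_core {
constexpr bool validate_utf16_as_ascii(const char16_t *buf,
                                       std::size_t len) noexcept {
  for (std::size_t i = 0; i < len; ++i) {
    if (buf[i] > 0x7F) {
      return false;
    }
  }
  return true;
}
} // namespace example_core

// ---- Hypothetical shim header, only consumed by the Swift build. ----
// The build that imports this header may predefine the macros below;
// by default they expand to nothing so any compiler accepts the file.
#ifndef EXAMPLE_SHIM_NOESCAPE
#  define EXAMPLE_SHIM_NOESCAPE
#endif
#ifndef EXAMPLE_SHIM_COUNTED_BY
#  define EXAMPLE_SHIM_COUNTED_BY(n)
#endif

namespace example_swift_shim {
// Annotated wrapper that forwards to the unannotated core function.
constexpr bool validate_utf16_as_ascii(
    const char16_t *EXAMPLE_SHIM_COUNTED_BY(len) buf EXAMPLE_SHIM_NOESCAPE,
    std::size_t len) noexcept {
  return example_core::validate_utf16_as_ascii(buf, len);
}
} // namespace example_swift_shim
```

The trade-off in this layout is that the public headers stay readable, at the cost of keeping the shim in sync with the core API.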

#include <type_traits>
#include <span>
#include <tuple>
#include <lifetimebound.h>
Collaborator

@pauldreik pauldreik Jan 4, 2026

this is not a standard header, where is it defined? I can't find it on my system using clang 21 and gcc 15.

Author

From Tim, in the forums post:

> It’s in {swift toolchain location}/usr/lib/clang/21/include/ so not available if not running from Swift

Author

cat swift-6.3-DEVELOPMENT-SNAPSHOT-2025-12-24-a.xctoolchain/usr/lib/clang/21/include/lifetimebound.h 
/*===---- lifetimebound.h - Lifetime attributes -----------------------------===
 *
 * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 * See https://llvm.org/LICENSE.txt for license information.
 * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 *
 *===------------------------------------------------------------------------===
 */

#ifndef __LIFETIMEBOUND_H
#define __LIFETIMEBOUND_H

#if defined(__cplusplus) && defined(__has_cpp_attribute)
#define __use_cpp_spelling(x) __has_cpp_attribute(x)
#else
#define __use_cpp_spelling(x) 0
#endif

#if __use_cpp_spelling(clang::lifetimebound)
#define __lifetimebound [[clang::lifetimebound]]
#else
#define __lifetimebound __attribute__((lifetimebound))
#endif

#if __use_cpp_spelling(clang::lifetime_capture_by)
#define __lifetime_capture_by(X) [[clang::lifetime_capture_by(X)]]
#else
#define __lifetime_capture_by(X) __attribute__((lifetime_capture_by(X)))
#endif

#if __use_cpp_spelling(clang::noescape)
#define __noescape [[clang::noescape]]
#else
#define __noescape __attribute__((noescape))
#endif

#endif /* __LIFETIMEBOUND_H */

* @return true if and only if the string is valid ASCII.
*/
- simdutf_warn_unused bool validate_utf16_as_ascii(const char16_t *buf,
+ simdutf_warn_unused bool validate_utf16_as_ascii(const char16_t *__counted_by(len) buf __noescape,
Collaborator

looks like the __noescape could be expressed as a C++ attribute instead, can you use that instead?
https://clang.llvm.org/docs/AttributeReference.html#noescape

would that be useful outside of swift compatibility?

regarding the __counted_by, the only info I can find is that it is used on struct members. the gcc manual https://gcc.gnu.org/onlinedocs/gcc-15.2.0/gcc.pdf says:

> In C++ this attribute is ignored

are there alternative ways of specifying the "counted by" property?

Author

MahdiBM commented Jan 4, 2026

@pauldreik Thank you for the quick review.
Let me get back to you later, after I ask around about some of these topics from the Swift team folks.

Author

MahdiBM commented Jan 4, 2026

Member

lemire commented Jan 4, 2026

@MahdiBM This seems like a great idea !!!

Member

lemire commented Jan 4, 2026

The relevant document appears to be here:

https://www.swift.org/documentation/cxx-interop/safe-interop/#annotating-c-apis

Member

lemire commented Jan 4, 2026

@MahdiBM and @pauldreik

Please see #896

@lemire lemire mentioned this pull request Jan 4, 2026
6 tasks
Member

lemire commented Jan 4, 2026

A C API might be simpler, no ?

#897

Member

lemire commented Jan 4, 2026

@MahdiBM Here is what I am thinking. If we have a C API, then we can surely generate a Swift wrapper with relatively little work. It can be done in a largely automated manner.

Member

lemire commented Jan 4, 2026

Scratch that. Swift can read a C header file and call C functions, can't it ?

https://github.com/lemire/SwiftCallingCHeader/blob/master/Sources/SomeSwift/main.swift

So it should be pretty much automated ?

Author

MahdiBM commented Jan 4, 2026

Yes Swift interop with C/C++ is pretty seamless. I think we'd only need the modulemap and the Package.swift file.

I think optimally we'd have those, and if/when we add __noescape to the std::span arguments, then Swift should be able to synthesize those functions as Swift Spans as well, for free.

I did try to get std::span working with the Swift compiler but there must be a quirk of some sort which got in the way and failed the attempt.

Member

lemire commented Jan 4, 2026

@MahdiBM

> Yes Swift interop with C/C++ is pretty seamless. I think we'd only need the modulemap and the Package.swift file.

Reading the documentation, we might need to annotate all of our std::span. This can be done in a portable way... see

#896

Author

MahdiBM commented Jan 4, 2026

@lemire yes I did notice the PR, I think that's exactly what we need 🙂 assuming the Swift compiler is also cooperative, it should be the only extra change we need.

Collaborator

seems like the c api is the way to move forward on this!

Member

lemire commented Jan 5, 2026

@MahdiBM Are you able to test the C API and see if it works ?

If it solves a problem for Swift users, I think we could start officially supporting the C API.

Author

MahdiBM commented Jan 5, 2026

@lemire I can imagine the C API being useful for some users, but it's not really needed for Swift users.

Swift has supported C++ interop since Swift 5.9, which was released 2-3 years ago. In the Swift open-source community we usually support the past 3 minor versions of Swift (about 1.5 years' worth), so 6.0, 6.1, and 6.2, all of which have C++ interoperability.

Code-wise Swift only needs these:

    1. Package.swift
    2. module.modulemap

After that, Swift will synthesize functions via its family of UnsafePointer types.
At this point things will just work and we can just let things be.

Now here's the issue that I'm struggling to solve:
The UnsafePointer types are ... well ... unsafe, and Swift users are taught to avoid them when possible.
So preferably we can leverage Swift 6.2's Span types which are the same as std::span, but also non-escapable.

For that, the Swift team have introduced 2 ways to synthesize Swift Span types from C/C++ code.

1- If in C/C++ you have pointer+len parameters, annotate the pointer parameter with __noescape and __counted_by(the-len-param-name).

2- If in C++20 you have std::span, annotate it with __noescape.

In both of these 2 cases, Swift will automatically synthesize code that accepts Swift's Span type instead of unsafe pointers.
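
To make the two shapes concrete, here is a small C++ sketch of both side by side. The example_count_ascii function and the EXAMPLE_* macros are hypothetical; the macros default to nothing and are meant to be predefined by a Swift-aware build (for instance as __noescape and __counted_by(n)), which is an assumption about how a project might wire this up. The std::span overload is only compiled when the standard library reports span support.

```cpp
#include <cstddef>

// Pull in <version> when available so __cpp_lib_span can be tested.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>
#  endif
#endif
#if defined(__cpp_lib_span)
#  include <span>
#endif

// Hypothetical hooks; empty unless the build predefines them.
#ifndef EXAMPLE_NOESCAPE
#  define EXAMPLE_NOESCAPE
#endif
#ifndef EXAMPLE_COUNTED_BY
#  define EXAMPLE_COUNTED_BY(n)
#endif

// Shape 1: pointer + length, annotated with both attributes.
constexpr std::size_t example_count_ascii(
    const char16_t *EXAMPLE_COUNTED_BY(len) buf EXAMPLE_NOESCAPE,
    std::size_t len) noexcept {
  std::size_t count = 0;
  for (std::size_t i = 0; i < len; ++i) {
    if (buf[i] <= 0x7F) {
      ++count; // code unit is in the ASCII range
    }
  }
  return count;
}

#if defined(__cpp_lib_span)
// Shape 2: std::span already carries its length, so only the
// no-escape annotation is attached to the parameter.
constexpr std::size_t example_count_ascii(
    std::span<const char16_t> input EXAMPLE_NOESCAPE) noexcept {
  return example_count_ascii(input.data(), input.size());
}
#endif
```

The second shape needs one annotation fewer because the element count travels inside the span itself.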

You show that you like the __noescape attribute anyway. The attribute makes sense and is not Swift specific. So that's good.

But the issue with "1" is that it requires __counted_by which is pretty Swift specific and as @pauldreik mentioned, will make the files even less readable.
I don't think this option is ruled out, but I'm trying to avoid it as best as I can, and go with the option "2".

The issue with "2" is ... well, I can't seem to get it working. I even tried to replicate a situation as close as possible to Swift's compiler tests for Span synthesis, and verified that the compiler flags are passed correctly, but with no luck. So I'm still asking around, and I hope to have an answer from the Swift team by the end of the week at the latest.

I'll let you know if there is some kind of not-work-around-able bug there and we need to choose another way, but for now, I'm hoping that I can get "2" working.

Author

MahdiBM commented Jan 7, 2026

Ok @lemire @pauldreik . I found the culprit.

Let's just show the code upfront:

This works for accepting a Swift Span for the function call in Swift:

using char16_span = std::span<const char16_t>;

simdutf_really_inline simdutf_warn_unused simdutf_constexpr23 bool
validate_utf16_as_ascii(char16_span input __noescape) noexcept;

But this doesn't:

simdutf_really_inline simdutf_warn_unused simdutf_constexpr23 bool
validate_utf16_as_ascii(std::span<const char16_t> input __noescape) noexcept;

So ... yeah. For some reason Swift REQUIRES you to declare a "type alias" via using, and use that instead in the function parameter.

How does this change sound to you?
So overall we'll need these changes to make this work:

    1. Package.swift
    2. module.modulemap
    3. Add __noescape (or e.g. simd_noescape) to std::spans.
    4. Declare and use a "type alias" instead of just spelling out the type name, for std::spans.

(excluding CI and docs work)

I think the first 2 sound fine, and the 3rd one is meh/ok? what about the 4th one?

The name of the "type alias" doesn't matter. You can have any of these or anything else, but we'll need to do these everywhere std::span is used.

using char16_span = std::span<const char16_t>;
using const_char16_span = std::span<const char16_t>;
using std_span_of_const_char16_t = std::span<const char16_t>;
using span_const_char16 = std::span<const char16_t>;

Optionally, we can skip spans at first and do only points 1 and 2, plus CI and some docs; I can then make the std::span changes in another PR.

Author

MahdiBM commented Jan 7, 2026

See this as an example of how this works with my specific simdutf branch (mmbm-swift-take-2, not this PR's branch):

https://github.com/MahdiBM/simdutf-swift-example

Member

lemire commented Jan 10, 2026

@MahdiBM But your demo does not build, right? At least, not portably.

$ cmake --build build
[  1%] Building CXX object src/CMakeFiles/simdutf.dir/simdutf.cpp.o
In file included from /Users/dlemire/tmp/simdutf/include/simdutf.h:15,
                 from /Users/dlemire/tmp/simdutf/src/simdutf.cpp:1:
/Users/dlemire/tmp/simdutf/include/simdutf/implementation.h:16:10: fatal error: ptrcheck.h: No such file or directory
   16 | #include <ptrcheck.h>
      |          ^~~~~~~~~~~~
compilation terminated.
gmake[2]: *** [src/CMakeFiles/simdutf.dir/build.make:79: src/CMakeFiles/simdutf.dir/simdutf.cpp.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:1385: src/CMakeFiles/simdutf.dir/all] Error 2
gmake: *** [Makefile:146: all] Error 2

It seems that #include <ptrcheck.h> refers to an LLVM-specific header.

We target at least GCC, LLVM and MSVC C++ compilers.

> I can imagine the C API being useful for some users, but it's not really needed for Swift users.

I am somewhat puzzled, what is wrong with the C API ? Doesn't it solve the issue without any need for intrusive changes in our library?

> Swift has been supporting C++ interop since Swift 5.9 which was released 2-3 years ago. In the Swift OpenSource community we usually support past 3 minor versions of Swift (worth of 1.5 years), so 6.0, 6.1, 6.2, which all have the C++ interoperability. Code-wise Swift only needs these: Package.swift module.modulemap

By your own analysis, this statement does not appear to be correct. If I trust what you are writing and demoing, one needs to make substantial changes to the C++ itself, and these changes appear to be invasive.

I am not sure how the C++/Swift interop works but it seems to require substantial annotation, new headers, and we can't use std::span<char> and the like.

Consider that we need to maintain this in the long run, and it seems that neither I nor @pauldreik is well versed in this Swift/C++ interop.

Don't get me wrong, we very much want simdutf to be callable from Swift, but we also want it to be done in a way that requires little additional maintenance. Significantly modifying our code in ways that are not obvious does not bode well.

I believe that the C API is much more desirable.

Author

MahdiBM commented Jan 10, 2026

@lemire

> But your demo does not build, right? At least, not portably.

Emm yeah I only tried the build via Swift, so I probably just missed that. In any case I don't think that's a blocking issue; we can put it behind some compiler flags or similar if needed (<ptrcheck.h> specifically is probably not even needed).
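
A rough C++ sketch of what such guarding might look like: the toolchain headers are only included when an opt-in macro is set and the headers can actually be found, so GCC, MSVC, and plain Clang builds never see them. EXAMPLE_SWIFT_INTEROP, the other EXAMPLE_* macros, and example_is_ascii are hypothetical names; the sketch also assumes that __counted_by and __noescape are provided by <ptrcheck.h> and <lifetimebound.h> respectively.

```cpp
#include <cstddef>

// Opt-in switch (hypothetical): only a Swift-aware build defines it.
#if defined(EXAMPLE_SWIFT_INTEROP) && defined(__has_include)
#  if __has_include(<ptrcheck.h>) && __has_include(<lifetimebound.h>)
#    include <ptrcheck.h>      // assumed to provide __counted_by
#    include <lifetimebound.h> // provides __noescape
#    define EXAMPLE_HAS_SWIFT_ANNOTATIONS 1
#  endif
#endif

#if defined(EXAMPLE_HAS_SWIFT_ANNOTATIONS)
#  define EXAMPLE_COUNTED_BY(n) __counted_by(n)
#  define EXAMPLE_NOESCAPE __noescape
#else
// Portable fallback: the annotations vanish entirely.
#  define EXAMPLE_COUNTED_BY(n)
#  define EXAMPLE_NOESCAPE
#endif

// Same source for every compiler; only the macro expansion differs.
constexpr bool example_is_ascii(
    const char16_t *EXAMPLE_COUNTED_BY(len) buf EXAMPLE_NOESCAPE,
    std::size_t len) noexcept {
  for (std::size_t i = 0; i < len; ++i) {
    if (buf[i] > 0x7F) {
      return false;
    }
  }
  return true;
}
```

Without the opt-in macro, the missing-header error from the build log should not occur, since the include is never reached.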

> I am somewhat puzzled, what is wrong with the C API?

Nothing really. Just that we already had C++.

> Doesn't it solve the issue without any need for intrusive changes in our library?
> ...
> I believe that the C API is much more desirable.

I am also confused. Why do you think it does solve the issue? What were the issues?

If we can add __noescape and __counted_by annotations to the C API, then yes. Otherwise I don't see a difference between using the C++ API or the C API.

> Don't get me wrong, we very much want simdutf to be callable from Swift, but we also want it to be done in a way that requires little additional maintenance. Modifying significant our code in ways that are not obvious does not bode well.

No worries, I understand. That's why I'm pushing for Swift to be smarter about synthesizing Swift Spans out of C++ std::span, so that it requires fewer changes on the C++ side: swiftlang/swift#86339 (comment)

> Consider that we need to maintain this on the long run and it seems that neither myself nor @pauldreik is well versed in this Swift/C++ interop.

That's a legitimate concern, and it'll be one more thing to take care of, I guess.

I guess as a Swift developer what I can say is that there is not really that much to the C++ interop. It's simply compiler magic.
And there are some Apple WWDC videos and documents that explain what a user can change to affect the compiler's decisions.
In the end though, it is still one more thing to take care of so it is some added maintenance.

I think I can put together some proper Swift CI, and that should cover a lot of the concerns.
We'll simply be calling the C++ APIs with some Swift Spans, just to ensure the API calls compile and run.
For new APIs, you only have to copy-paste a few lines and change the function name in the Swift tests.

If there is an issue, you can:

  • Ask me, I'm not going anywhere.
  • Assuming I end up going unresponsive, you can still ask somewhere like in the Swift forums where the language maintainers are available.
  • Worst case, you can just skip the Swift compatibility for the new APIs as long as you can get Swift to at least keep building what was already building.

Member

lemire commented Jan 10, 2026

@MahdiBM

The C API should solve the issue because it makes it easy to write a Swift library that acts as a wrapper. Calling C from Swift is easy and has no performance overhead so it is ideal.

This is not difficult work. One can probably get it done in an hour or so. One just needs some Swift boilerplate code.

Author

MahdiBM commented Jan 10, 2026

> The C API should solve the issue because it makes it easy to write a Swift library that acts as a wrapper. Calling C from Swift is easy and has no performance overhead so it is ideal.

Ok but I'm struggling to see which part of this is not true about C++ 🤔. IIUC it's the same for C++ modulo the Swift versions required. Or am I missing something?

Member

lemire commented Jan 10, 2026

> Ok but I'm struggling to see which part of this is not true about C++ 🤔. IIUC its the same for C++ modulo the Swift versions required. Or am I missing something?

The C API is available now and it did not require any change to the library. One can use it now to write a wrapper. It works.

What you are proposing is that we start maintaining invasive Swift-specific code changes inside simdutf, which is undesirable.

Let me be clear: this is not happening. I explained why above in clear terms.

It would be poor engineering.

I consider that adding the C interface resolves the issue in a satisfactory manner.

@lemire lemire closed this Jan 10, 2026
@simdutf simdutf locked as resolved and limited conversation to collaborators Jan 10, 2026