Skip to content

Enable TLS Support for C/C++#111

Merged
ljsebald merged 37 commits intoKallistiOS:masterfrom
cepawiel:tls-support
Sep 16, 2023
Merged

Enable TLS Support for C/C++#111
ljsebald merged 37 commits intoKallistiOS:masterfrom
cepawiel:tls-support

Conversation

@gyrovorbis
Copy link
Member

@gyrovorbis gyrovorbis commented Feb 22, 2023

This is the work required to support static TLS for KOS and in the toolchains per the SH TLS ABI is variant I.

  • Remove disabled TLS in the toolchain build scripts
  • Add "tcbhead" intrusive structure to KOS's thread structure, which is what's stored within GBR
  • Allocating extra chunk of memory for "tcbhead" followed by .tdata and .tbss segements
  • Set static TLS block data based on .tdata values for initialized data, 0 for .tbss data
  • Added TLS-specific section exports to the linker script for access within KOS
  • Created example flexing TLS using thread-local variables requiring both .tbss and .tdata initialization
  • Tested with GCC12.2.0
  • Tested with GCC9.3.0
  • Tested with GCC4.7.0
  • Tested with GCC13.1.0
  • TLS variables working properly for the main thread
  • Manually aligned TLS variables working properly

This implementation still provides a void *dtv field which is NULL initialized for now, which can later be used to add support for non-static dynamic module TLS in the future as well.

This PR also adds an example of using TLS storage that also doubles as a test suite to flex the hell out of this implementation and return EXIT_SUCCES/EXIT_FAILURE based on the results (which we can then use in CI/automation once we get return values in DC-Tool).

@gyrovorbis gyrovorbis added the enhancement Enhancing an existing feature label Feb 22, 2023
Added an include causing the toolchain to fail to build
Fixed a copy/paste error in the example Makefile
Changed example to use printf
Added cast to GBR pointer to remove warning
@cepawiel
Copy link
Contributor

This should be in a working state if anyone wants to help test. I've tested it working on all three (GCC4.7, GCC9, GCC12) toolchains. While I can't test on NAOMI (yet) I believe the changes to the linker script are pretty sane and copy/pasted from the tested Dreamcast one and should work.

Ensured Newlib's C and C++ stdlibs are now using TLS gracefully for reentrancy

I think any Newlib changes should be investigated in a future PR instead of this one, as Newlib support for TLS replacing the reent struct is relatively new. It was added in this commit, and would only be possible with the GCC12 toolchain.

@gyrovorbis
Copy link
Member Author

gyrovorbis commented Feb 25, 2023

Ensured Newlib's C and C++ stdlibs are now using TLS gracefully for reentrancy

I think any Newlib changes should be investigated in a future PR instead of this one, as Newlib support for TLS replacing the reent struct is relatively new. It was added in this commit, and would only be possible with the GCC12 toolchain.

Yeah, after talking to you, I agree. Updated the main comment to remove that but also added the diligent toolchain testing you've done too. When I get a chance, I'll help do some testing on the C++ side.

@gyrovorbis gyrovorbis added the toolchain Toolchain or Compiler-related Issues label Mar 2, 2023
@gyrovorbis
Copy link
Member Author

If anyone has anything they'd like to add or see, this PR is basically ready other than a bit more testing, which I'm holding off on until I finish writing a bunch of unit tests for libGimbal's concurrency stuff, which is about to flex this all pretty hard!

@gyrovorbis
Copy link
Member Author

gyrovorbis commented Apr 21, 2023

Just wanted to leave an update for myself and anyone else following along, because I'm about to have a big move and will be changing jobs, and have been investigating/testing this quite a bit to continue our sacred quest for TLS support.

I've been really diligent about testing and researching good test scenarios for TLS with libGimbal and have managed to break our initial implementation in two different ways:

  • TLS accesses for the main thread are incorrect
  • TLS accesses for manually overaligned values are incorrect

I'm not quite sure what's causing the former, but I'm fairly certain it's probably something trivial and dumb (are we not setting GBR for the main thread)? The latter issue; however, is going to be quite a bit more in-depth...

Luckily it looks like supporting odd alignments in TLS is a common issue that is run into in other indie/retro SDKs, and I've actually managed to find a really good reference implementation for addressing the same issue in the Nintendo 3DS DevkitPro SDK: devkitPro/libctru#497 (thank you from the Dreamcast scene, my dudes). We can most likely follow along with these breadcrumbs and implement this fix similarly, ensuring proper alignment for the TLS fields of the whole outer thread struct.

Anyway, I'll be back to working on this ASAP, because I need it for my own engine and concurrency model. Just wanted to leave these notes for myself and anyone else in the meantime.

EDIT: For reproducing/troubleshooting these two issues, the little TLS example can be modified to also use the counter on the main thread, which will immediately return/print the wrong value.

Then in my own libGimbal tests, I was flexing random alignments with the following, which were winding up all kinds of screwed up:

static _Thread_local              uint16_t      tlsUint16_      = '\0';
static _Thread_local _Alignas(16) int32_t       tlsInt32_       = -346;
static _Thread_local              GblStringRef* tlsStringRef_   = NULL;
static _Thread_local _Alignas(32) char          tlsCharArray_[] = { "abcdefghijklmnopqrstuvwxyz012345" };
static _Thread_local _Alignas(64) GblObject     tlsObject_;
static _Thread_local _Alignas(8)  double        tlsDouble_      = 12345.6789;

My WIP unit tests for flexing this are located here: https://github.com/gyrovorbis/libgimbal/blob/master/test/source/core/gimbal_thread_test_suite.c

- Added coverage for main thread's TLS data getting flexed
- Added arbitary alignments for flexing tdata and tbss alignment
@gyrovorbis
Copy link
Member Author

gyrovorbis commented Apr 27, 2023

I have fantastic news to report on the quest for TLS... I CANNOT REPRODUCE THE PROBLEMS I FOUND ON GCC13! I'm not sure what the heck is going on or if 12 had issues or what the deal was, but I've modified your example to flex the two things I was mentioning 1) main thread using TLS data 2) weirdly aligned TLS data, and it all works fine.

I am, however, still facing one last issue with my libGimbal codebase using TLS... Every time I try to link to a statically linked library that uses TLS variables that get referenced from the main program, the compiler emits __emu_tls_() symbols, which seem to totally break the code in the final binary... I'm trying to play with toolchain configurations and global build flags (stuff like -fno-pic and -fno-pie) to see if I can get it to always emit GBR offset accesses to TLS variables.

Btw, is there any way you can give me permission to push back to this branch? I can add my changes to your example flexing the other two issues so you can verify they're working for you too.

cepawiel added 5 commits May 8, 2023 18:20
Added an include causing the toolchain to fail to build
Fixed a copy/paste error in the example Makefile
Changed example to use printf
Added cast to GBR pointer to remove warning
GBR register needed to be initialized after setting up the thread
control block for the main thread.
@cepawiel
Copy link
Contributor

cepawiel commented May 9, 2023

I was able to replicate the issue with using TLS on the main thread. I just wasn't setting the GBR register for the main thread so it was not accessing the correct memory region when attempting to access TLS variables. This was only an issue for the main thread as it's the only thread that sets up its own thread structure, all others will have the GBR loaded during a context switch.

@gyrovorbis gyrovorbis added this to the v2.1.0 milestone Jun 5, 2023
- added TLS exec model (local) to CMake toolchain
- added .tdata and .tbss alignment variables to .rodata in DC linker
  script
- made KOS threads *attempt* to create TLS storage with proper alignment
  (still WIP)
- updated compiler_tls example to reproduce issues with aligned TLS that
  still need to be resolved
- Finally figured out how to align the segments properly based on
  ELF TLS data segment information given through the DC linker script
- Finished and fleshed out the compiler_tls test to reproduce all of the
  issues and serve as a potentially automatable test suite.
- Added the same update to the NAOMI linker script that the DC got for
  .TDATA and .TBSS alignment
- Updated CHANGELOG with Colton and my TLS work
@gyrovorbis
Copy link
Member Author

gyrovorbis commented Aug 20, 2023

It has been a long long long time coming, but we got TLS working!!!

It has been validated with both the new compiler_tls example from Colton and my MASSIVE libGimbal unit test suite. Here's the threading tests I put KOS through that all passed:
https://github.com/gyrovorbis/libgimbal/blob/master/test/source/core/gimbal_thread_test_suite.c

Between the two, I feel like we've thrown everything at the TLS implementation that I can think of... she's finally ready! Really excited about this one!

@gyrovorbis gyrovorbis marked this pull request as ready for review August 20, 2023 07:58
Holy crap, the optimizer in non-release builds will totally optimize away
zero-size checks, because the source of the check is the memory
address of a linker-script exported variable... and why would the
compiler think an address could be NULL? Had to make sizes volatile to
circumvent this in release builds.
@gyrovorbis
Copy link
Member Author

Screenshot from 2023-09-15 20-53-04
Review feedback addressed, rebuilt KOS along with the example and nothing has broken (thank god). :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement Enhancing an existing feature toolchain Toolchain or Compiler-related Issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants