Skip to content

Thread Safety Analysis: Fix crash with wide string literals#180349

Merged
melver merged 1 commit intollvm:mainfrom
melver:for-pull2
Feb 10, 2026
Merged

Thread Safety Analysis: Fix crash with wide string literals#180349
melver merged 1 commit intollvm:mainfrom
melver:for-pull2

Conversation

@melver
Copy link
Contributor

@melver melver commented Feb 7, 2026

The SExprBuilder was previously using StringLiteral::getString() to extract the value of string literals. This method asserts that the string is a narrow string (char width == 1):

clang/include/clang/AST/Expr.h:1872: StringRef clang::StringLiteral::getString() const: Assertion `(isUnevaluated() || getCharByteWidth() == 1) && "This function is used in places that assume strings use char"' failed.
[...]
 #9 0x0000556247fcfe3e clang::threadSafety::SExprBuilder::translate(clang::Stmt const*, clang::threadSafety::SExprBuilder::CallingContext*)
[...]

This fails when using wide string literals as in the new test case added.

Switch to using StringLiteral::getBytes(), which provides the raw byte representation of the string. This is sufficient to compare expressions for identity.

Fixes: #148551

The SExprBuilder was previously using StringLiteral::getString() to
extract the value of string literals. This method asserts that the
string is a narrow string (char width == 1):

```
clang/include/clang/AST/Expr.h:1872: StringRef clang::StringLiteral::getString() const: Assertion `(isUnevaluated() || getCharByteWidth() == 1) && "This function is used in places that assume strings use char"' failed.
[...]
 llvm#9 0x0000556247fcfe3e clang::threadSafety::SExprBuilder::translate(clang::Stmt const*, clang::threadSafety::SExprBuilder::CallingContext*)
[...]
```

This fails when using wide string literals as in the new test case
added.

Switch to using StringLiteral::getBytes(), which provides the raw byte
representation of the string. This is sufficient to compare expressions
for identity.

Fixes: llvm#148551
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:analysis labels Feb 7, 2026
@llvmbot
Copy link
Member

llvmbot commented Feb 7, 2026

@llvm/pr-subscribers-clang-analysis

@llvm/pr-subscribers-clang

Author: Marco Elver (melver)

Changes

The SExprBuilder was previously using StringLiteral::getString() to extract the value of string literals. This method asserts that the string is a narrow string (char width == 1):

clang/include/clang/AST/Expr.h:1872: StringRef clang::StringLiteral::getString() const: Assertion `(isUnevaluated() || getCharByteWidth() == 1) && "This function is used in places that assume strings use char"' failed.
[...]
 #<!-- -->9 0x0000556247fcfe3e clang::threadSafety::SExprBuilder::translate(clang::Stmt const*, clang::threadSafety::SExprBuilder::CallingContext*)
[...]

This fails when using wide string literals as in the new test case added.

Switch to using StringLiteral::getBytes(), which provides the raw byte representation of the string. This is sufficient to compare expressions for identity.

Fixes: #148551


Full diff: https://github.com/llvm/llvm-project/pull/180349.diff

2 Files Affected:

  • (modified) clang/lib/Analysis/ThreadSafetyCommon.cpp (+2-2)
  • (modified) clang/test/SemaCXX/warn-thread-safety-analysis.cpp (+36)
diff --git a/clang/lib/Analysis/ThreadSafetyCommon.cpp b/clang/lib/Analysis/ThreadSafetyCommon.cpp
index 8180265cc5ed0..b43a986521f99 100644
--- a/clang/lib/Analysis/ThreadSafetyCommon.cpp
+++ b/clang/lib/Analysis/ThreadSafetyCommon.cpp
@@ -356,10 +356,10 @@ til::SExpr *SExprBuilder::translate(const Stmt *S, CallingContext *Ctx) {
       llvm_unreachable("Invalid integer type");
   }
   case Stmt::StringLiteralClass:
-    return new (Arena) til::LiteralT(cast<StringLiteral>(S)->getString());
+    return new (Arena) til::LiteralT(cast<StringLiteral>(S)->getBytes());
   case Stmt::ObjCStringLiteralClass:
     return new (Arena)
-        til::LiteralT(cast<ObjCStringLiteral>(S)->getString()->getString());
+        til::LiteralT(cast<ObjCStringLiteral>(S)->getString()->getBytes());
 
   case Stmt::DeclStmtClass:
     return translateDeclStmt(cast<DeclStmt>(S), Ctx);
diff --git a/clang/test/SemaCXX/warn-thread-safety-analysis.cpp b/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
index e57299e93aa48..30d18517c9a19 100644
--- a/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
+++ b/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
@@ -7770,3 +7770,39 @@ void testLoopConditionalReassignment(Foo *f1, Foo *f2, bool cond) {
   ptr->mu.Unlock(); // expected-warning{{releasing mutex 'ptr->mu' that was not held}}
 } // expected-warning{{mutex 'f1->mu' is still held at the end of function}}
 }  // namespace CapabilityAliases
+
+namespace WideStringLiteral {
+
+class Foo {
+ public:
+  Mutex mu;
+  Mutex* getMu(const wchar_t* s) { return &mu; }
+  Mutex* getMu2(const char16_t* s) { return &mu; }
+  Mutex* getMu3(const char32_t* s) { return &mu; }
+
+  int a GUARDED_BY(getMu(L"abc"));
+  int b GUARDED_BY(getMu2(u"abc"));
+  int c GUARDED_BY(getMu3(U"abc"));
+};
+
+Foo g_foo;
+
+void test() {
+  g_foo.getMu(L"abc")->Lock();
+  g_foo.a = 0;
+  g_foo.getMu(L"abc")->Unlock();
+}
+
+void test2() {
+  g_foo.getMu2(u"abc")->Lock();
+  g_foo.b = 0;
+  g_foo.getMu2(u"abc")->Unlock();
+}
+
+void test3() {
+  g_foo.getMu3(U"abc")->Lock();
+  g_foo.c = 0;
+  g_foo.getMu3(U"abc")->Unlock();
+}
+
+} // namespace WideStringLiteral

@melver
Copy link
Contributor Author

melver commented Feb 10, 2026

ping

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have a release note so users know about the fix.

One thing I'm concerned by is that we have compareString functions which I don't think will expect this because those bytes may not be comparable. e.g., we do this:

  case ValueType::BT_String:
    return Cmp.compareStrings(as<StringRef>().value(),
                              E->as<StringRef>().value());

and is that actually going to work when we don't have a StringRef but instead have a bucket of bytes?

@melver
Copy link
Contributor Author

melver commented Feb 10, 2026

This should have a release note so users know about the fix.

We don't have a release with #148551 yet, which was landed last week. We might need a separate release note for #148551 itself though, but that's a separate PR.

One thing I'm concerned by is that we have compareString functions which I don't think will expect this because those bytes may not be comparable. e.g., we do this:

  case ValueType::BT_String:
    return Cmp.compareStrings(as<StringRef>().value(),
                              E->as<StringRef>().value());

and is that actually going to work when we don't have a StringRef but instead have a bucket of bytes?

getBytes() returns a StringRef, and we have:

  bool compareStrings (StringRef s, StringRef r) { return s == r; }

So it will just compare these bytes for equality (StringRef::operator== does memcmp). So if we have 2 wide-strings that are equal, they will match. I think that's the intent.

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thank you for the explanations, then this LGTM and no release note needed.

@melver melver merged commit e318fc8 into llvm:main Feb 10, 2026
13 checks passed
@aaronpuchert
Copy link
Member

Thanks, and I agree that this seems to be the right fix!

manasij7479 pushed a commit to manasij7479/llvm-project that referenced this pull request Feb 18, 2026
)

The SExprBuilder was previously using StringLiteral::getString() to
extract the value of string literals. This method asserts that the
string is a narrow string (char width == 1):

```
clang/include/clang/AST/Expr.h:1872: StringRef clang::StringLiteral::getString() const: Assertion `(isUnevaluated() || getCharByteWidth() == 1) && "This function is used in places that assume strings use char"' failed.
[...]
 llvm#9 0x0000556247fcfe3e clang::threadSafety::SExprBuilder::translate(clang::Stmt const*, clang::threadSafety::SExprBuilder::CallingContext*)
[...]
```

This fails when using wide string literals as in the new test case
added.

Switch to using StringLiteral::getBytes(), which provides the raw byte
representation of the string. This is sufficient to compare expressions
for identity.

Fixes: llvm#148551
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:analysis clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants