Thread Safety Analysis: Fix crash with wide string literals#180349
Thread Safety Analysis: Fix crash with wide string literals#180349
Conversation
The SExprBuilder was previously using StringLiteral::getString() to extract the value of string literals. This method asserts that the string is a narrow string (char width == 1): ``` clang/include/clang/AST/Expr.h:1872: StringRef clang::StringLiteral::getString() const: Assertion `(isUnevaluated() || getCharByteWidth() == 1) && "This function is used in places that assume strings use char"' failed. [...] llvm#9 0x0000556247fcfe3e clang::threadSafety::SExprBuilder::translate(clang::Stmt const*, clang::threadSafety::SExprBuilder::CallingContext*) [...] ``` This fails when using wide string literals as in the new test case added. Switch to using StringLiteral::getBytes(), which provides the raw byte representation of the string. This is sufficient to compare expressions for identity. Fixes: llvm#148551
|
@llvm/pr-subscribers-clang-analysis @llvm/pr-subscribers-clang Author: Marco Elver (melver) ChangesThe SExprBuilder was previously using StringLiteral::getString() to extract the value of string literals. This method asserts that the string is a narrow string (char width == 1): This fails when using wide string literals as in the new test case added. Switch to using StringLiteral::getBytes(), which provides the raw byte representation of the string. This is sufficient to compare expressions for identity. Fixes: #148551 Full diff: https://github.com/llvm/llvm-project/pull/180349.diff 2 Files Affected:
diff --git a/clang/lib/Analysis/ThreadSafetyCommon.cpp b/clang/lib/Analysis/ThreadSafetyCommon.cpp
index 8180265cc5ed0..b43a986521f99 100644
--- a/clang/lib/Analysis/ThreadSafetyCommon.cpp
+++ b/clang/lib/Analysis/ThreadSafetyCommon.cpp
@@ -356,10 +356,10 @@ til::SExpr *SExprBuilder::translate(const Stmt *S, CallingContext *Ctx) {
llvm_unreachable("Invalid integer type");
}
case Stmt::StringLiteralClass:
- return new (Arena) til::LiteralT(cast<StringLiteral>(S)->getString());
+ return new (Arena) til::LiteralT(cast<StringLiteral>(S)->getBytes());
case Stmt::ObjCStringLiteralClass:
return new (Arena)
- til::LiteralT(cast<ObjCStringLiteral>(S)->getString()->getString());
+ til::LiteralT(cast<ObjCStringLiteral>(S)->getString()->getBytes());
case Stmt::DeclStmtClass:
return translateDeclStmt(cast<DeclStmt>(S), Ctx);
diff --git a/clang/test/SemaCXX/warn-thread-safety-analysis.cpp b/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
index e57299e93aa48..30d18517c9a19 100644
--- a/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
+++ b/clang/test/SemaCXX/warn-thread-safety-analysis.cpp
@@ -7770,3 +7770,39 @@ void testLoopConditionalReassignment(Foo *f1, Foo *f2, bool cond) {
ptr->mu.Unlock(); // expected-warning{{releasing mutex 'ptr->mu' that was not held}}
} // expected-warning{{mutex 'f1->mu' is still held at the end of function}}
} // namespace CapabilityAliases
+
+namespace WideStringLiteral {
+
+class Foo {
+ public:
+ Mutex mu;
+ Mutex* getMu(const wchar_t* s) { return μ }
+ Mutex* getMu2(const char16_t* s) { return μ }
+ Mutex* getMu3(const char32_t* s) { return μ }
+
+ int a GUARDED_BY(getMu(L"abc"));
+ int b GUARDED_BY(getMu2(u"abc"));
+ int c GUARDED_BY(getMu3(U"abc"));
+};
+
+Foo g_foo;
+
+void test() {
+ g_foo.getMu(L"abc")->Lock();
+ g_foo.a = 0;
+ g_foo.getMu(L"abc")->Unlock();
+}
+
+void test2() {
+ g_foo.getMu2(u"abc")->Lock();
+ g_foo.b = 0;
+ g_foo.getMu2(u"abc")->Unlock();
+}
+
+void test3() {
+ g_foo.getMu3(U"abc")->Lock();
+ g_foo.c = 0;
+ g_foo.getMu3(U"abc")->Unlock();
+}
+
+} // namespace WideStringLiteral
|
|
ping |
AaronBallman
left a comment
There was a problem hiding this comment.
This should have a release note so users know about the fix.
One thing I'm concerned by is that we have compareString functions which I don't think will expect this because those bytes may not be comparable. e.g., we do this:
case ValueType::BT_String:
return Cmp.compareStrings(as<StringRef>().value(),
E->as<StringRef>().value());
and is that actually going to work when we don't have a StringRef but instead have a bucket of bytes?
We don't have a release with #148551 yet, which was landed last week. We might need a separate release note for #148551 itself though, but that's a separate PR.
getBytes() returns a StringRef, and we have: So it will just compare these bytes for equality (StringRef::operator== does memcmp). So if we have 2 wide-strings that are equal, they will match. I think that's the intent. |
AaronBallman
left a comment
There was a problem hiding this comment.
Ah thank you for the explanations, then this LGTM and no release note needed.
|
Thanks, and I agree that this seems to be the right fix! |
) The SExprBuilder was previously using StringLiteral::getString() to extract the value of string literals. This method asserts that the string is a narrow string (char width == 1): ``` clang/include/clang/AST/Expr.h:1872: StringRef clang::StringLiteral::getString() const: Assertion `(isUnevaluated() || getCharByteWidth() == 1) && "This function is used in places that assume strings use char"' failed. [...] llvm#9 0x0000556247fcfe3e clang::threadSafety::SExprBuilder::translate(clang::Stmt const*, clang::threadSafety::SExprBuilder::CallingContext*) [...] ``` This fails when using wide string literals as in the new test case added. Switch to using StringLiteral::getBytes(), which provides the raw byte representation of the string. This is sufficient to compare expressions for identity. Fixes: llvm#148551
The SExprBuilder was previously using StringLiteral::getString() to extract the value of string literals. This method asserts that the string is a narrow string (char width == 1):
This fails when using wide string literals as in the new test case added.
Switch to using StringLiteral::getBytes(), which provides the raw byte representation of the string. This is sufficient to compare expressions for identity.
Fixes: #148551