Skip to content

C and C++ should not be labelled as GNU dialects#178

Closed
TheShermanTanker wants to merge 4 commits intoeclipse-cdt:mainfrom
TheShermanTanker:patch-1
Closed

C and C++ should not be labelled as GNU dialects#178
TheShermanTanker wants to merge 4 commits intoeclipse-cdt:mainfrom
TheShermanTanker:patch-1

Conversation

@TheShermanTanker
Copy link
Copy Markdown
Contributor

@TheShermanTanker TheShermanTanker commented Nov 24, 2022

C and C++ are currently labelled as "GNU C" and "GNU C++" in the explorer when choosing what options to set for any specified language (For example, in the Preprocessor Include Paths, Macros etc screen). Not only is this confusing since it implies that Eclipse by default uses the GNU dialects of the languages (which is not true), this is also flat out wrong in other cases, such as when Workspaces configured with the Microsoft Visual C++ compiler (one that couldn't possibly be more incompatible with the actual GNU dialect) are also labelled as "GNU C". This was likely due to Eclipse only really supporting gcc when it started out, but this is no longer the case with the IDE becoming more and more flexible over the years, and it's probably time to ditch the GNU labels in the hardcoded language names.

@jonahgraham
Copy link
Copy Markdown
Member

Thanks @TheShermanTanker for the Pull Request.

I am not sure of all places that is used, so I will need to review it a little first.

In the meantime, can you sign the ECA please: https://github.com/eclipse-cdt/cdt/blob/main/CONTRIBUTING.md#eclipse-contributor-agreement

@TheShermanTanker
Copy link
Copy Markdown
Contributor Author

Thanks @TheShermanTanker for the Pull Request.

I am not sure of all places that is used, so I will need to review it a little first.

In the meantime, can you sign the ECA please: https://github.com/eclipse-cdt/cdt/blob/main/CONTRIBUTING.md#eclipse-contributor-agreement

Thanks, I did sign it earlier actually, but the bot doesn't seem to be able to resolve the ECA. I'm guessing there's a bit of a delay involved?

@github-actions
Copy link
Copy Markdown

github-actions bot commented Nov 24, 2022

Test Results

     594 files       594 suites   16m 27s ⏱️
10 127 tests 10 105 ✔️ 22 💤 0
10 143 runs  10 121 ✔️ 22 💤 0

Results for commit df9c432.

♻️ This comment has been updated with latest results.

@jonahgraham
Copy link
Copy Markdown
Member

Test Results

     594 files     594 suites   15m 50s stopwatch 10 127 tests 8 588 heavy_check_mark 22 zzz 1 517 x 10 143 runs  8 599 heavy_check_mark 22 zzz 1 522 x

For more details on these failures, see this check.

Results for commit 1d8cdc4.

Hmm - is there really something that depends on that string in the code/tests? I have retriggered tests to make sure we aren't getting a false failure.

@jonahgraham
Copy link
Copy Markdown
Member

Thanks, I did sign it earlier actually, but the bot doesn't seem to be able to resolve the ECA. I'm guessing there's a bit of a delay involved?

Seems that way - revalidating worked.

@jonahgraham
Copy link
Copy Markdown
Member

This was likely due to Eclipse only really supporting gcc when it started out

FWIW non-GCC has always been an important part of the CDT ecosystem, from the very early days. Has no one ever noticed/commented or bothered to fix this before? Strange.

@jonahgraham
Copy link
Copy Markdown
Member

There is indeed tests that depend on the name as is, e.g.:

The test output is this:

 <cdtprojectproperties>
 <section name="org.eclipse.cdt.internal.ui.wizards.settingswizards.IncludePaths">
 <language name="GNU C++">
 <includepath>C:\WINDOWS</includepath><invalid></invalid>
 </language>
 </section>
 </cdtprojectproperties>
junit.framework.AssertionFailedError: 
 <cdtprojectproperties>
 <section name="org.eclipse.cdt.internal.ui.wizards.settingswizards.IncludePaths">
 <language name="GNU C++">
 <includepath>C:\WINDOWS</includepath><invalid></invalid>
 </language>
 </section>
 </cdtprojectproperties>

	at org.eclipse.cdt.ui.tests.wizards.settingswizards.SettingsImportExportTest.vaidateCorrectErrorHandling(SettingsImportExportTest.java:191)
	at org.eclipse.cdt.ui.tests.wizards.settingswizards.SettingsImportExportTest.testNotValid(SettingsImportExportTest.java:284)

which doesn't really help much.

The test is really old, and I suspect not relevant as it refers to the old format for CDT projects.

The other tests all fail because the above test doesn't clean up after itself when it fails and leaves the workspace in a bad state.

@TheShermanTanker
Copy link
Copy Markdown
Contributor Author

TheShermanTanker commented Nov 25, 2022

The test might be old, but I don't feel confident enough to do anything with it. I've instead just refactored it to expect the new language name to minimize the amount of changes I'm making

@jonahgraham
Copy link
Copy Markdown
Member

As it turns out CDT 11 is a good time to make this change because in Bug 577074 commit fa8a627 I changed the import and export wizard to use the language ID instead of the name of the language. This change will mean people who have exported with CDT < 10.5 won't be able to import anymore. Therefore an entry in the N&N is needed to guide them.

It would be great if you updated the New and Noteworthy, but if you aren't able to by Monday I will do it before I submit the final build for CDT 11.

That said, I haven't actually completed my review to make sure there are no other side-effects of this name change. (The joys of working with old code bases, even simple thing like changing a display string is a non-trivial effort.)

@jonahgraham jonahgraham added this to the 11.0.0 milestone Nov 25, 2022
@jonahgraham jonahgraham added the noteworthy Pull requests and fixed issues that should be highlighted to users label Nov 25, 2022
@TheShermanTanker
Copy link
Copy Markdown
Contributor Author

As it turns out CDT 11 is a good time to make this change because in Bug 577074 commit fa8a627 I changed the import and export wizard to use the language ID instead of the name of the language. This change will mean people who have exported with CDT < 10.5 won't be able to import anymore. Therefore an entry in the N&N is needed to guide them.

It would be great if you updated the New and Noteworthy, but if you aren't able to by Monday I will do it before I submit the final build for CDT 11.

That said, I haven't actually completed my review to make sure there are no other side-effects of this name change. (The joys of working with old code bases, even simple thing like changing a display string is a non-trivial effort.)

Something like this?
C and C++ have been internally renamed from "GNU C" and "GNU C++" to just "C" and "C++". You may need to take note of this change when using the import/export wizard
Which section of "Noteworthy" would you like this to go under?

@jonahgraham
Copy link
Copy Markdown
Member

Something like this?
C and C++ have been internally renamed from "GNU C" and "GNU C++" to just "C" and "C++". You may need to take note of this change when using the import/export wizard

That looks good, some small mods, plus lets be explicit about versions.

C and C++ have been renamed from "GNU C" and "GNU C++" to just "C" and "C++". If you have exported settings with CDT prior to CDT 10.5 they may not import successfully with CDT 11. See #178

Which section of "Noteworthy" would you like this to go under?

It can go in the "release notes" section with heading "GNU removed from display name language for C/C++"

@jonahgraham
Copy link
Copy Markdown
Member

I have taken some time to look at this in the code. It is not so straightforward to rename this. This is because the name really is the name for GNU C and GNU C++. It is just typical that only GNU C and GNU C++ is installed, especially if you get plain CDT. Those names assign languages to the specific language implementations.

this is also flat out wrong in other cases, such as when Workspaces configured with the Microsoft Visual C++ compiler (one that couldn't possibly be more incompatible with the actual GNU dialect) are also labelled as "GNU C".

However CDT is treating those C and C++ files as GNU C and GNU C++ files. i.e. standard CDT won't handle MSVC extensions (generally) because no one has contributed a subclass of AbstractCLikeLanguage for MSVC. Therefore CDT will use the GNU version.

Vendors (like ARM, Renesas, probably IAR too) do contribute their own language definitions with their own names. When you create one of those vendor C/C++ projects you get a project where the language mappings are from org.eclipse.cdt.core.cSource -> vendor.c.language id/name.

So, I don't think this change can be accepted as is.

Which toolset are you using? If you are using the MSVC integration that comes with CDT, then it needs that additional work to mark the language mappings as MSVC C and C++.

Am I understanding the problem correctly? Please feel free to share your thoughts on this.

BTW are you an end user or a vendor integrating CDT?

@TheShermanTanker
Copy link
Copy Markdown
Contributor Author

I have taken some time to look at this in the code. It is not so straightforward to rename this. This is because the name really is the name for GNU C and GNU C++. It is just typical that only GNU C and GNU C++ is installed, especially if you get plain CDT. Those names assign languages to the specific language implementations.

Vendors (like ARM, Renesas, probably IAR too) do contribute their own language definitions with their own names. When you create one of those vendor C/C++ projects you get a project where the language mappings are from org.eclipse.cdt.core.cSource -> vendor.c.language id/name.

So, I don't think this change can be accepted as is.

Which toolset are you using? If you are using the MSVC integration that comes with CDT, then it needs that additional work to mark the language mappings as MSVC C and C++.

Am I understanding the problem correctly? Please feel free to share your thoughts on this.

Ah, seems like it's been hardcoded from the start. That's unfortunate.

I typically use the default CDT setup, which was where I first noticed that while switching to the Visual C++ integration (as a test for something unrelated) that the language name would erroneously remain as "GNU C/C++", and assumed it was a visual error that could be rather easily fixed with a quick config switch. Seems like this isn't the case (I didn't realize that the language name was actually used as the marker for choosing the language implementation), so I apologize for the noise. That aside, seems a little strange to me that the setting for the language name is language.name.gcc in what I assume is the core part of CDT however.

this is also flat out wrong in other cases, such as when Workspaces configured with the Microsoft Visual C++ compiler (one that couldn't possibly be more incompatible with the actual GNU dialect) are also labelled as "GNU C".

However CDT is treating those C and C++ files as GNU C and GNU C++ files. i.e. standard CDT won't handle MSVC extensions (generally) because no one has contributed a subclass of AbstractCLikeLanguage for MSVC. Therefore CDT will use the GNU version.

Speaking of Visual C++, CDT has handled quite a number of its extensions quite well actually, so I'm surprised to hear that support for it is spotty at best

Independently of this issue though, it might be a good idea to also move assigning of language backends to also use the language id instead of the display name as has already been done with fa8a627, and I'd be happy to help do that, as well as fixing the display name for Visual C++ as well. Out of curiosity, where does this actually happen? All I can find is ATTR_NAME in LanguageSettingsExtensionManager and a few related classes, but nothing really concrete about how the language provider is selected.

BTW are you an end user or a vendor integrating CDT?

I'm a user of CDT, currently working on introducing basic support for CDT and its counterpart Eclipse JDT into OpenJDK

@jonahgraham
Copy link
Copy Markdown
Member

Am I understanding the problem correctly? Please feel free to share your thoughts on this.

Ah, seems like it's been hardcoded from the start. That's unfortunate.

Not to get too pedantic, but it isn't hardcoded. Its just that by default with Eclipse CDT you only get projects for the GNU toolchains. The complication is that how llvm/visual C++ has been supported is by making them look like GNU tool chains, rather than by making them first class citizens (like ARM, Renesas, IAR etc do).

I typically use the default CDT setup, which was where I first noticed that while switching to the Visual C++ integration (as a test for something unrelated) that the language name would erroneously remain as "GNU C/C++", and assumed it was a visual error that could be rather easily fixed with a quick config switch.

I did too at first, I had hoped it was a place that someone had overly said "GNU" when it didn't apply.

Seems like this isn't the case (I didn't realize that the language name was actually used as the marker for choosing the language implementation), so I apologize for the noise.

No worries. There is much about CDT I don't know and it was good for me to learn the finer details on this. For now I am going to close this PR though.

That aside, seems a little strange to me that the setting for the language name is language.name.gcc in what I assume is the core part of CDT however.

Not sure I follow this part. Do you mean it is strange that GNU C/C++ is included in org.eclipse.cdt.core? If so, I agree, it means that it is impossible to ship a CDT install that doesn't have GNU C/C++ as one of the available languages. If I was a tool vendor, I may have liked the idea of having only my variant of the C/C++ language show up in my install.

Speaking of Visual C++, CDT has handled quite a number of its extensions quite well actually, so I'm surprised to hear that support for it is spotty at best

Sorry, didn't mean to imply it was worse than you observe. But IIRC you can write GNU C in the editor, CDT will happily accept it, but Visual C++ will error on it. And the inverse can happen too (see 549299 and 69633 for a couple of examples).

Independently of this issue though, it might be a good idea to also move assigning of language backends to also use the language id instead of the display name as has already been done with fa8a627, and I'd be happy to help do that, as well as fixing the display name for Visual C++ as well. Out of curiosity, where does this actually happen?

AFAIK that is how it already happens with IDs. The .cproject will store a language mapping that maps from a content type (e.g. C Source file) to a language implementation (e.g. GNU C language). For example, if I can map C source files (.c extension) to be analysed as a C++ file by changing the language mappings. This creates the following snippet of XML in the .cproject file: (This can be useful with projects like GDB, which is now written in C++ but kept file names as .c)

	<storageModule moduleId="org.eclipse.cdt.core.language.mapping">
		<project-mappings>
			<content-type-mapping
				configuration=""
				content-type="org.eclipse.cdt.core.cSource"
				language="org.eclipse.cdt.core.g++"/>
		</project-mappings>
	</storageModule>

All I can find is ATTR_NAME in LanguageSettingsExtensionManager and a few related classes, but nothing really concrete about how the language provider is selected.

Have a look at:

* Returns the effective language for the file specified by the given path.
* If <code>fetchAll</code> is <code>true</code> all inherited language
* mappings will be returned in order of precedence. Otherwise, only the
* effective language will be returned.
*
* This method will always return at least one mapping.
*
* @param project the project that contains the given file
* @param filePath the path to the file
* @param contentTypeId the content type of the file (optional)
* @param fetchAll if <code>true</code>, returns all inherited language mappings.
* Otherwise, returns only the effective language.
* @return the effective language for the file specified by the given path.
* @throws CoreException
*/
public static LanguageMapping[] computeLanguage(IProject project, String filePath,
ICConfigurationDescription configuration, String contentTypeId, boolean fetchAll) throws CoreException {

BTW are you an end user or a vendor integrating CDT?

I'm a user of CDT, currently working on introducing basic support for CDT and its counterpart Eclipse JDT into OpenJDK

Thanks, I have subscribed to notifications.

@TheShermanTanker TheShermanTanker deleted the patch-1 branch November 29, 2022 02:35
@TheShermanTanker
Copy link
Copy Markdown
Contributor Author

TheShermanTanker commented Nov 29, 2022

The complication is that how llvm/visual C++ has been supported is by making them look like GNU tool chains, rather than by making them first class citizens (like ARM, Renesas, IAR etc do).

I think I know what my next commit will be ;)

AFAIK that is how it already happens with IDs. The .cproject will store a language mapping that maps from a content type (e.g. C Source file) to a language implementation (e.g. GNU C language). For example, if I can map C source files (.c extension) to be analysed as a C++ file by changing the language mappings. This creates the following snippet of XML in the .cproject file: (This can be useful with projects like GDB, which is now written in C++ but kept file names as .c)

That raises an interesting question: Can one (in theory) assign the language to a specific implementation by tampering with the xml in .cproject irrespective of what toolchain is being used? (say, something ridiculous like marking a workspace compiled by gcc as "MSVC C")

That ability, if not a thing yet, may or may not come in handy in particular when using the "No ToolChain" option

@jonahgraham
Copy link
Copy Markdown
Member

That raises an interesting question:

Yes. But you don't need to hack the .cproject, just change the language mappings in the project properties.

@TheShermanTanker
Copy link
Copy Markdown
Contributor Author

That raises an interesting question:

Yes. But you don't need to hack the .cproject, just change the language mappings in the project properties.

Thanks. Can I also define custom languages this way, by any chance? (Eg copying the language definition block in plugin.xml and putting it into the .cproject or language.settings.xml)

@jonahgraham
Copy link
Copy Markdown
Member

Can I also define custom languages this way, by any chance?

I don't think so, I think only the ID can be in the .cproject and that ID needs to be defined in the plugin.xml.

@jonahgraham jonahgraham removed this from the 11.0.0 milestone Dec 1, 2022
@jonahgraham jonahgraham removed the noteworthy Pull requests and fixed issues that should be highlighted to users label Dec 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants