Skip to content

Fixes issue #196#197

Merged
wmiler merged 2 commits into
JS8Call-improved:masterfrom
wmiler:wmiler-issue196
Mar 2, 2026
Merged

Fixes issue #196#197
wmiler merged 2 commits into
JS8Call-improved:masterfrom
wmiler:wmiler-issue196

Conversation

@wmiler

@wmiler wmiler commented Feb 23, 2026

Copy link
Copy Markdown
Collaborator

Fixes a null reference issue found in #196.

@wmiler wmiler added this to the 2.5.3 milestone Feb 23, 2026
@wmiler wmiler self-assigned this Feb 23, 2026
@wmiler wmiler added the bug Something isn't working label Feb 23, 2026
@Chris-AC9KH

Copy link
Copy Markdown
Collaborator

So here's what happens. With this code I can symlink /dev/MyGreatSerialPort to /dev/cu.usbserial-2110 on MacOS and it will accept it vs the enumerated serial ports.

Screenshot 2026-02-22 at 20 07 19

It skips over validation of the port because I'm using CAT, but the flag that was set validates it anyway as it skips over and only checks DTR and RTS, or if the port is empty. Finds the port in /dev, everything is good to go. Press Test CAT and it turns green even though it doesn't get CAT control. Click OK to close the dialog and then the Tune button on the UI and the kernel instantly kills it by sending a SIGSEGV because we got a serial bus conflict with an invalid address.

Screenshot 2026-02-22 at 19 57 55

Now, is anybody stupid enough to do this on Mac? No. Everybody uses the enumerated ports. Will it ever cause an issue? Not likely. But the old code doesn't do this because it uses a dynamic cast to convert to classes by inheritance and returns a null pointer if it fails. Which doesn't cause anything bad to happen on Mac other than a Hamlib dialog that says the serial port is no good. Nor did it happen on Debian 13 until I tried different software and got different results. Bad serial ports, every time, depend on hardware, motherboard chipsets, drivers, etc..

So this is based on what happened on ONE system. But if this fixes your crash issue on linux due to a bad USB configuration, it is likely it won't hurt anything on the other two platforms. Even though I can verify a bad CAT configuration with this code on MacOS, it is very remote that anybody would ever discover that.

@Chris-AC9KH

Copy link
Copy Markdown
Collaborator

So, actually, I don't like this. Using the unmodified original code, and creating a udev rule using the methodology being used here, the system refuses to enumerate an invalid serial port. So you have to force it by manually entering it into the combo box. This causes the same thing to happen on Debian as it it does MacOS - Test CAT fails.

Screenshot 2026-02-23 at 07 39 35

The UI Configuration dialog is designed to not allow an invalid serial port configuration and it works perfectly as it was. Attempting to FORCE an invalid serial port configuration can cause a kernel panic because this is kernel space stuff.

  • so now we attempt to force the UI Configuration to accept an invalid serial port because on a different system we got a kernel panic and it sent a SIGSEGV. There must be some kind of problem.
  • this was due to a bad udev rule. It is bad because the system does not enumerate the "fake" serial port
  • So we bypass the UI Configuration checks to force acceptance of this bad serial port
  • This creates a condition with conflicting addresses on the serial bus and panics the kernel from the main UI. Tested with wsjt-x you can crash the entire system attempting to do this.

So the question here is, why are we not fixing the bad serial port? No system that I'm aware of will enumerate an invalid serial port and there's a reason the kernel sends a SIGSEGV if you try to do such a thing. It's protecting itself from an invalid configuration. But that doesn't mean you got bad code. It means you're trying to force a bad base system configuration that is not allowed by creating an invalid serial port that will not enumerate.

Tested with unmodified v2.5.2 AppImage running on Debian 13, as well as MacOS. I don't know how to create an invalid serial port on Windows, but I'm sure it can be done there too.

So you can merge this if you want. I don't THINK it will hurt anything on the other two systems? But I'm not touching this one with a 10ft pole because further testing reveals that once you bypass the original UI Configuration check where it only enumerates valid serial interfaces in the first place, you can cause a kernel panic from the main UI when Hamlib attempts to execute PTT and it finds conflicting addresses on the bus that bypassed the Configuration checks.

I'd say this is maybe diagnostic code to figure out a problem with udev rules on linux. It is not production code.

@Chris-AC9KH

Copy link
Copy Markdown
Collaborator

@wmiler why don't you title this something like "patch for linux udev rules" and go ahead and merge it. Because I don't see it being an issue on Mac or Windows unless somebody really tries to play with serial ports. Then use it to get your udev rule issue sorted out. Test a properly functioning udev setup with the old code. And if no issue exists then this can be easily reverted if desired. Simply because if somebody DOES play with serial ports this can create a downstream issue in Hamlib.

I think this was a misunderstanding that there's a problem in our code with the dynamic_cast when there's really not. The issue was caused by invalid configuration that you were trying to force, in which case the kernel is going to protect itself from that. With any valid configuration, even if the wrong serial port is selected, the kernel is not going to kill it with the old code and it will just display the error, which it does on Debian 13 in my testing (with the old code) although it takes it awhile to decide that Test CAT failed.

I have no clue why Test CAT passed on your system, because it actually initially turns green, then wait a bit and it turns red and throws the error (with the old code). But, as usual, not all linux systems are created equal.

@wmiler

wmiler commented Feb 23, 2026

Copy link
Copy Markdown
Collaborator Author

I'm up for changing it, it doesn't skip the validate call at all, you enter your own /dev/mySquirrellyPort and it will give you the "invalid PTT port". Maybe I missed a check in regards to the Test CAT mode. I can rework it so it only handles the null reference issue.

@Chris-AC9KH

Chris-AC9KH commented Feb 23, 2026

Copy link
Copy Markdown
Collaborator

The null reference is not actually an issue because the UI effectively limits your choices to valid serial ports. This is done in the TransceiverFactory PTT enum method, and then set_rig_invariants() dynamically enables/disables PTT options.

If hardware handshake is enabled and CAT is using the same serial port, RTS will be disabled. If the PTT port is set to CAT but indirect serial PTT isn’t available for the rig, CAT will be disabled. So if you choose a PTT method that requires its own serial line you MUST choose a valid, enabled port from the list - you can't type in your own.

If findText(ptt_port) returns -1 you get the null pointer to prevent access to an object that has not been properly initialized and dereference it. This should not crash on any modern operating system. But not all compilers are created equal, different architectures are not created equal, and not even all linux systems are created equal. So it may crash on your system by calling ->isEnabled() on a null pointer or invalid object, but it doesn't on mine.

So the way you "fixed it" is by bypassing that check to allow entry of an invalid serial configuration that has not been initialized with use of a boolean and if statements.

I would say the proper way to "fix it" for your system is to check for index >= 0 && != nullptr before isEnabled(). Don't allow attempted validation of an invalid object by setting it to valid by default, then run if statements on it to invalidate it because this will pass for a CAT check and invalid symlinked serial port. All you need to do is check to make sure the object !=nullptr if your system can't properly handle that, then the object is dereferenced.

@wmiler

wmiler commented Feb 23, 2026

Copy link
Copy Markdown
Collaborator Author

Ok, thank you. That helps a lot!

@Chris-AC9KH

Chris-AC9KH commented Feb 24, 2026

Copy link
Copy Markdown
Collaborator

@wmiler I had a terribly busy day here on John Deere stuff. But here is a fairly simple modification that should fix your crash issue on linux, and it retains the null reference to dereference the object if it's invalid.

    auto ptt_method = static_cast<TransceiverFactory::PTTMethod>(
        ui_->PTT_method_button_group->checkedId());
    if (ptt_method == TransceiverFactory::PTT_method_DTR ||
        ptt_method == TransceiverFactory::PTT_method_RTS) {
        const auto ptt_port = ui_->PTT_port_combo_box->currentText();
        auto *model = dynamic_cast<QStandardItemModel *>(ui_->PTT_port_combo_box->model());
        const int index = ui_->PTT_port_combo_box->findText(ptt_port);
        QStandardItem *item = (index >= 0 && model) ? model->item(index) : nullptr;

        const bool enabled = item && item->isEnabled();
        const bool invalid = ptt_port.isEmpty() || !enabled;

        if (invalid) {
            JS8MessageBox::critical_message(this, tr("Invalid PTT port"));
            return false;
        }
    }

Basically, all that was needed was to change around the logic a bit. We can use the standard models that have already been defined and construct an index that we can increment if either a valid or invalid port is found. Then check it and if it's invalid don't use a pointer to ->isEnabled(). This prevents access to protected memory when the nullptr is dereferenced, and will also protect against declaring an invalid serial port as valid.

In the constraints of this check, it should allow you type anything you want into the combo box for either DTR or RTS, attempt a Test CAT and immediately throw the "invalid PTT port". OTOH, if you select a valid serial port from the dropdown list that is not correct, attempt Test CAT, it should then give you the Hamlib dialog instead that says it's no good.

There are outside instances where if you are trying to create a serial interface with udev rules, you may get a successful Test CAT. But if you test PTT or Tune from the UI, it should then return the Hamlib dialog that says it's no good. The reason for this is that you can make it look valid for the check so it passes by using a symlinked port. But if the port configuration is not valid (bad udev rule) Hamlib will fault instead.

Either way it prevents passing nullptr to ->isEnabled().

@Chris-AC9KH

Chris-AC9KH commented Feb 26, 2026

Copy link
Copy Markdown
Collaborator

@wmiler did you have time to test that patch I posted for linux? I have not personally had time to test it with a linux system, but verified it works ok on MacOS. I'm hoping it works on linux, as based on your description with the traceback in the issues category it looks like linux (or the gcc compiler) doesn't like nullptr being passed to ->isEnabled().

It will not, however, fix any issues with a virtual serial port created with udev rules. I don't know of any way we can force acceptance of a serial interface that is not valid, and do it cleanly for memory management. I only patched what I believe is causing the issue with the kernel sending a SIGSEGV to JS8Call on your system.

@wmiler

wmiler commented Feb 26, 2026

Copy link
Copy Markdown
Collaborator Author

@Chris-AC9KH Chris, sorry, I haven't had a chance. Should have time this weekend, having a broken shoulder plus all the backlog at work because of it has been horrible. 😩

@Chris-AC9KH

Copy link
Copy Markdown
Collaborator

@wmiler not a big deal. I've been very busy with John Deere software on iPads for spring, also. If it does't work on linux, let me know and I'll fix it. If it does, just use it to modify the source and merge it.

I'm pretty sure it will work, and if I get a free moment to set up a development environment on linux I can try it as well. I think you said you were using Ubuntu 24, and that's one that I don't have installed to try it with.

@Chris-AC9KH

Copy link
Copy Markdown
Collaborator

@Joe-K0OG if you are available, maybe you could help test this too? You can read about the original issue here to see if you can verify that:
#196

Being I'm on a VM with arm64 linux I may not have a test environment that properly duplicates the issue on actual x86_64 hardware, being this is hardware/kernel level stuff.

I don't think you have to use any udev rules, just try typing in any old serial port that doesn't exist (which should be the same as one created by a udev rule that is invalid) and see what it does when you try a Test CAT on a DTR or RTS line with an invalid serial port.

Then, using the block of code I posted above, patch Configuration.cpp starting at line 2805 thru 2816 in the current version of that file, and see if it fixes it. Hopefully you are able to duplicate Wyatt's crash issue and trying the modified code will give us a double check on if what I wrote there actually fixes it.

@Chris-AC9KH Chris-AC9KH requested a review from Joe-K0OG February 26, 2026 19:24
@Joe-K0OG

Copy link
Copy Markdown
Collaborator

@Joe-K0OG if you are available, maybe you could help test this too? You can read about the original issue here to see if you can verify that:

@Chris-AC9KH and @wmiler If I try to point to a port that does not exist (e.g., /dev/ttyUSB1) it does not crash anything, but JS8Call just pops up a "Rig failure" error window when I try to Test CAT saying that it can't access that port and I need to go into settings or use a different configuration. If I symlink my actual port /dev/ttyUSB0 to something like /dev/FT817, then in JS8Call access /dev/FT817, it all works fine, no errors, CAT and PTT work fine.

73,
-Joe-
K0OG

@Chris-AC9KH

Chris-AC9KH commented Feb 26, 2026

Copy link
Copy Markdown
Collaborator

@Joe-K0OG is that with the unmodified code? Or with the patch above?

Just be clear, don't use the code in this branch. I don't particularly like how that was done. I posted an alternative fix above in the comments.

For initial testing, use current master. Then patch master with the code block I posted above and see what happens.

@Joe-K0OG

Copy link
Copy Markdown
Collaborator

@Joe-K0OG is that with the unmodified code? Or with the patch above?

I'm sorry, I forgot to mention: This is with unmodified code. I'm in the process of testing modified code, using js8call-improved/main as the source, testing on both Linux and Windows. I'll get back to you in a little while after testing.

73,
-Joe-
K0OG

@Chris-AC9KH

Copy link
Copy Markdown
Collaborator

I'm sorry, I forgot to mention: This is with unmodified code.

Interesting. So not all linux systems are affected. I was hoping you could verify Wyatt's original issue but it's looking like that might be only applicable to Ubuntu 24 (so far). I wonder if Ubuntu builds their own kernel, or applies their own patches to it?

@Joe-K0OG

Copy link
Copy Markdown
Collaborator

Then, using the block of code I posted above, patch Configuration.cpp starting at line 2805 thru 2816 in the current version of that file, and see if it fixes it. Hopefully you are able to duplicate Wyatt's crash issue and trying the modified code will give us a double check on if what I wrote there actually fixes it.

@Chris-AC9KH @wmiler I pasted the block of code as you specified, built JS8Call, and of course since I didn't have a problem before, it didn't change the result any, so no harm done by that block of code on Linux. This was testing based on js8call-improved/main, on Linux Mint 22.3 and Win11, and I generated clean scratch builds (not incremental).

73,
-Joe-
K0OG

@Joe-K0OG

Copy link
Copy Markdown
Collaborator

Interesting. So not all linux systems are affected. I was hoping you could verify Wyatt's original issue but it's looking like that might be only applicable to Ubuntu 24 (so far). I wonder if Ubuntu builds their own kernel, or applies their own patches to it?

@Chris-AC9KH @wmiler I just booted Ubuntu 24.04 and tested it with 2.5.2 AppImage and a fresh build of the current main branch, and there is no hang if I try to access a non-existent port in rig setup. It behaves nicely, displaying the "Rig failure" window as it should, allowing me to enter the settings and fix it, or choose a different configuration -- no crashing.

73,
-Joe-
K0OG

@Chris-AC9KH

Copy link
Copy Markdown
Collaborator

@Joe-K0OG thanks! That verifies the code works ok on Linux and Windows. I tested it on Mac and it worked fine, but same results - no crashes with the original code either. Nor did I get a crash on Debian 13 with the original code.

So if it fixes Wyatt's crash issue on Ubuntu 24, I think I'd go with what I wrote and pasted in the comments vs the change that was made the PR. The PR change does not use the nullptr so elsewhere in the code we end up with a conflict with CAT handshake and it never dereferences a bad port. I'm convinced the real issue on Ubuntu was passing the nullptr to ->isEnabled(), which I don't know why it would fail. But Wyatt's crash issue is real so I think that will be a good patch for it.

@wmiler

wmiler commented Mar 1, 2026

Copy link
Copy Markdown
Collaborator Author

@Chris-AC9KH Chris, yes, your change does fix the null ref issue. Thank you!

@wmiler

wmiler commented Mar 1, 2026

Copy link
Copy Markdown
Collaborator Author

Interesting. So not all linux systems are affected. I was hoping you could verify Wyatt's original issue but it's looking like that might be only applicable to Ubuntu 24 (so far). I wonder if Ubuntu builds their own kernel, or applies their own patches to it?

Yes, Ubuntu patches the mainline kernel. Not sure what all they do there...

@Chris-AC9KH @wmiler I just booted Ubuntu 24.04 and tested it with 2.5.2 AppImage and a fresh build of the current main branch, and there is no hang if I try to access a non-existent port in rig setup. It behaves nicely, displaying the "Rig failure" window as it should, allowing me to enter the settings and fix it, or choose a different configuration -- no crashing.

I'm thinking there may be a "strangeness" in the way qserialport does things. We do build with libudev for linux, an qt is supposed to look at the udev first, and then failback to /sys/class if it can't enumerate for some reason.

Not sure why we fail, but sdrangel succeeds at this. I'll stick it on my backlog list to poke at.

PS @Joe-K0OG If you can grab the AppImage from this last run and give it a whirl, I'd appreciate it.

@Chris-AC9KH

Copy link
Copy Markdown
Collaborator

Ubuntu has a reputation of supporting more varied and different hardware out of the box. Since all linux "drivers" are in the kernel as "modules", they probably patch the kernel for that.

This change will not affect any other systems, I was actually surprised that it affected Ubuntu. But for whatever reason the kernel didn't like it the way it was, and all I did was prevent the nullptr from being passed to ->isEnabled(). So if that fixes it, just go ahead and merge it in.

Comment thread JS8_UI/Configuration.cpp Outdated
@Joe-K0OG

Joe-K0OG commented Mar 1, 2026

Copy link
Copy Markdown
Collaborator

PS @Joe-K0OG If you can grab the AppImage from this last run and give it a whirl, I'd appreciate it.

@wmiler Wyatt, I downloaded the AppImage, and just built from the latest js8call-improved master source on Windows and Linux, and all works fine. Good to go!

73,
-Joe-
K0OG

@wmiler wmiler merged commit 03a6f16 into JS8Call-improved:master Mar 2, 2026
@wmiler wmiler deleted the wmiler-issue196 branch March 2, 2026 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants