Fixes issue #196#197
Conversation
|
@wmiler why don't you title this something like "patch for linux udev rules" and go ahead and merge it. Because I don't see it being an issue on Mac or Windows unless somebody really tries to play with serial ports. Then use it to get your udev rule issue sorted out. Test a properly functioning udev setup with the old code. And if no issue exists then this can be easily reverted if desired. Simply because if somebody DOES play with serial ports this can create a downstream issue in Hamlib. I think this was a misunderstanding that there's a problem in our code with the dynamic_cast when there's really not. The issue was caused by invalid configuration that you were trying to force, in which case the kernel is going to protect itself from that. With any valid configuration, even if the wrong serial port is selected, the kernel is not going to kill it with the old code and it will just display the error, which it does on Debian 13 in my testing (with the old code) although it takes it awhile to decide that Test CAT failed. I have no clue why Test CAT passed on your system, because it actually initially turns green, then wait a bit and it turns red and throws the error (with the old code). But, as usual, not all linux systems are created equal. |
|
I'm up for changing it, it doesn't skip the validate call at all, you enter your own /dev/mySquirrellyPort and it will give you the "invalid PTT port". Maybe I missed a check in regards to the Test CAT mode. I can rework it so it only handles the null reference issue. |
|
The null reference is not actually an issue because the UI effectively limits your choices to valid serial ports. This is done in the TransceiverFactory PTT enum method, and then set_rig_invariants() dynamically enables/disables PTT options. If hardware handshake is enabled and CAT is using the same serial port, RTS will be disabled. If the PTT port is set to CAT but indirect serial PTT isn’t available for the rig, CAT will be disabled. So if you choose a PTT method that requires its own serial line you MUST choose a valid, enabled port from the list - you can't type in your own. If findText(ptt_port) returns -1 you get the null pointer to prevent access to an object that has not been properly initialized and dereference it. This should not crash on any modern operating system. But not all compilers are created equal, different architectures are not created equal, and not even all linux systems are created equal. So it may crash on your system by calling ->isEnabled() on a null pointer or invalid object, but it doesn't on mine. So the way you "fixed it" is by bypassing that check to allow entry of an invalid serial configuration that has not been initialized with use of a boolean and if statements. I would say the proper way to "fix it" for your system is to check for index >= 0 && != nullptr before isEnabled(). Don't allow attempted validation of an invalid object by setting it to valid by default, then run if statements on it to invalidate it because this will pass for a CAT check and invalid symlinked serial port. All you need to do is check to make sure the object !=nullptr if your system can't properly handle that, then the object is dereferenced. |
|
Ok, thank you. That helps a lot! |
|
@wmiler I had a terribly busy day here on John Deere stuff. But here is a fairly simple modification that should fix your crash issue on linux, and it retains the null reference to dereference the object if it's invalid. Basically, all that was needed was to change around the logic a bit. We can use the standard models that have already been defined and construct an index that we can increment if either a valid or invalid port is found. Then check it and if it's invalid don't use a pointer to ->isEnabled(). This prevents access to protected memory when the nullptr is dereferenced, and will also protect against declaring an invalid serial port as valid. In the constraints of this check, it should allow you type anything you want into the combo box for either DTR or RTS, attempt a Test CAT and immediately throw the "invalid PTT port". OTOH, if you select a valid serial port from the dropdown list that is not correct, attempt Test CAT, it should then give you the Hamlib dialog instead that says it's no good. There are outside instances where if you are trying to create a serial interface with udev rules, you may get a successful Test CAT. But if you test PTT or Tune from the UI, it should then return the Hamlib dialog that says it's no good. The reason for this is that you can make it look valid for the check so it passes by using a symlinked port. But if the port configuration is not valid (bad udev rule) Hamlib will fault instead. Either way it prevents passing nullptr to ->isEnabled(). |
|
@wmiler did you have time to test that patch I posted for linux? I have not personally had time to test it with a linux system, but verified it works ok on MacOS. I'm hoping it works on linux, as based on your description with the traceback in the issues category it looks like linux (or the gcc compiler) doesn't like nullptr being passed to ->isEnabled(). It will not, however, fix any issues with a virtual serial port created with udev rules. I don't know of any way we can force acceptance of a serial interface that is not valid, and do it cleanly for memory management. I only patched what I believe is causing the issue with the kernel sending a SIGSEGV to JS8Call on your system. |
|
@Chris-AC9KH Chris, sorry, I haven't had a chance. Should have time this weekend, having a broken shoulder plus all the backlog at work because of it has been horrible. 😩 |
|
@wmiler not a big deal. I've been very busy with John Deere software on iPads for spring, also. If it does't work on linux, let me know and I'll fix it. If it does, just use it to modify the source and merge it. I'm pretty sure it will work, and if I get a free moment to set up a development environment on linux I can try it as well. I think you said you were using Ubuntu 24, and that's one that I don't have installed to try it with. |
|
@Joe-K0OG if you are available, maybe you could help test this too? You can read about the original issue here to see if you can verify that: Being I'm on a VM with arm64 linux I may not have a test environment that properly duplicates the issue on actual x86_64 hardware, being this is hardware/kernel level stuff. I don't think you have to use any udev rules, just try typing in any old serial port that doesn't exist (which should be the same as one created by a udev rule that is invalid) and see what it does when you try a Test CAT on a DTR or RTS line with an invalid serial port. Then, using the block of code I posted above, patch Configuration.cpp starting at line 2805 thru 2816 in the current version of that file, and see if it fixes it. Hopefully you are able to duplicate Wyatt's crash issue and trying the modified code will give us a double check on if what I wrote there actually fixes it. |
@Chris-AC9KH and @wmiler If I try to point to a port that does not exist (e.g., /dev/ttyUSB1) it does not crash anything, but JS8Call just pops up a "Rig failure" error window when I try to Test CAT saying that it can't access that port and I need to go into settings or use a different configuration. If I symlink my actual port /dev/ttyUSB0 to something like /dev/FT817, then in JS8Call access /dev/FT817, it all works fine, no errors, CAT and PTT work fine. 73, |
|
@Joe-K0OG is that with the unmodified code? Or with the patch above? Just be clear, don't use the code in this branch. I don't particularly like how that was done. I posted an alternative fix above in the comments. For initial testing, use current master. Then patch master with the code block I posted above and see what happens. |
I'm sorry, I forgot to mention: This is with unmodified code. I'm in the process of testing modified code, using js8call-improved/main as the source, testing on both Linux and Windows. I'll get back to you in a little while after testing. 73, |
Interesting. So not all linux systems are affected. I was hoping you could verify Wyatt's original issue but it's looking like that might be only applicable to Ubuntu 24 (so far). I wonder if Ubuntu builds their own kernel, or applies their own patches to it? |
@Chris-AC9KH @wmiler I pasted the block of code as you specified, built JS8Call, and of course since I didn't have a problem before, it didn't change the result any, so no harm done by that block of code on Linux. This was testing based on js8call-improved/main, on Linux Mint 22.3 and Win11, and I generated clean scratch builds (not incremental). 73, |
@Chris-AC9KH @wmiler I just booted Ubuntu 24.04 and tested it with 2.5.2 AppImage and a fresh build of the current main branch, and there is no hang if I try to access a non-existent port in rig setup. It behaves nicely, displaying the "Rig failure" window as it should, allowing me to enter the settings and fix it, or choose a different configuration -- no crashing. 73, |
|
@Joe-K0OG thanks! That verifies the code works ok on Linux and Windows. I tested it on Mac and it worked fine, but same results - no crashes with the original code either. Nor did I get a crash on Debian 13 with the original code. So if it fixes Wyatt's crash issue on Ubuntu 24, I think I'd go with what I wrote and pasted in the comments vs the change that was made the PR. The PR change does not use the nullptr so elsewhere in the code we end up with a conflict with CAT handshake and it never dereferences a bad port. I'm convinced the real issue on Ubuntu was passing the nullptr to ->isEnabled(), which I don't know why it would fail. But Wyatt's crash issue is real so I think that will be a good patch for it. |
|
@Chris-AC9KH Chris, yes, your change does fix the null ref issue. Thank you! |
Yes, Ubuntu patches the mainline kernel. Not sure what all they do there...
I'm thinking there may be a "strangeness" in the way qserialport does things. We do build with libudev for linux, an qt is supposed to look at the udev first, and then failback to /sys/class if it can't enumerate for some reason. Not sure why we fail, but sdrangel succeeds at this. I'll stick it on my backlog list to poke at. PS @Joe-K0OG If you can grab the AppImage from this last run and give it a whirl, I'd appreciate it. |
|
Ubuntu has a reputation of supporting more varied and different hardware out of the box. Since all linux "drivers" are in the kernel as "modules", they probably patch the kernel for that. This change will not affect any other systems, I was actually surprised that it affected Ubuntu. But for whatever reason the kernel didn't like it the way it was, and all I did was prevent the nullptr from being passed to ->isEnabled(). So if that fixes it, just go ahead and merge it in. |



Fixes a null reference issue found in #196.