Fix intermittent busyRx on Portduino SX1262 (stale preamble IRQ) #9939

Closed
NearlCrews wants to merge 1 commit into meshtastic:master from NearlCrews:fix/portduino-busyrx-false-preamble
@NearlCrews

@NearlCrews NearlCrews commented Mar 18, 2026

Summary

On Linux-native (Portduino) builds, the SX1262 radio gets stuck in RX with "Can not send yet, busyRx" / "Ignore false preamble detection" errors, blocking TX until the existing stale-flag fallback in RadioLibInterface::receiveDetected times out (2 * preambleTimeMsec). Under high radio activity I measured this on 60-80% of TX attempts; at low activity it was sporadic.

Fixes #9933. Related: #9580 (same symptoms on RPi Zero 2W), #4298 (GPIO pin issues with SX1262 on RPi 5).

Root cause

startReceiveDutyCycleAuto puts the SX1262 into a sleep/CAD/sleep cycle. Every CAD pass can latch a PREAMBLE_DETECTED IRQ flag even when no real packet is arriving. On bare-metal MCUs those stale flags get cleared/overwritten within a few symbol times before anything sees them, because IRQ polling happens at nanosecond scale. On Linux, gpiod reads are microseconds each, so getIrqFlags() regularly catches a stale CAD-induced flag. isActivelyReceiving() then returns true, TX is gated, and we wait out the 2 * preambleTimeMsec stale-flag fallback before unblocking — by which time setTransmitDelay() has already pushed the TX attempt into a new random window where the same race can recur.

Fix

On Portduino only, replace startReceiveDutyCycleAuto(preambleLength, 8, ...) with startReceive(RADIOLIB_SX126X_RX_TIMEOUT_INF, ...). Continuous receive has no sleep/CAD cycles, so the stale-flag source is eliminated at the root. Power saving from duty cycling is irrelevant on mains-powered Linux-class devices (Raspberry Pi, x86 gateways, femtofox).

isActivelyReceiving() is unchanged — the existing receiveDetected logic with its 2 * preambleTimeMsec fallback remains the correct place to handle any residual stale flags on any platform.
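
The fallback described above can be modelled in a few lines. This is a simplified sketch, not the firmware's receiveDetected: the RxGate struct, its fields, and the pure-function shape are assumptions made for illustration, and the real code also has other timeouts that are left out here.

```cpp
#include <cstdint>

// Hypothetical state carried between polls (not the firmware's actual fields).
struct RxGate {
    bool timing;      // true once a detection has been seen and the timer runs
    uint32_t startMs; // time of the first detection
    bool busy;        // result: should TX be held off?
};

// Simplified model of the stale-flag fallback: a PREAMBLE_DETECTED flag that
// has not been followed by HEADER_VALID within 2 * preambleTimeMsec is
// treated as a false detection and TX is released.
constexpr RxGate receiveDetected(RxGate prev, bool preambleDetected, bool headerValid,
                                 uint32_t nowMs, uint32_t preambleTimeMsec)
{
    if (!preambleDetected && !headerValid)
        return {false, 0, false};      // nothing latched: channel looks free
    if (!prev.timing)
        return {true, nowMs, true};    // first sighting: start timer, report busy
    if (!headerValid && (nowMs - prev.startMs) > 2 * preambleTimeMsec)
        return {false, 0, false};      // flag never progressed: treat as stale
    return {true, prev.startMs, true}; // still plausibly a real reception
}
```

With preambleTimeMsec = 131, a lone preamble flag first seen at t = 1000 ms keeps TX blocked at t = 1200 ms and releases it once more than 262 ms have elapsed, which is the delay this PR was trying to avoid.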

What this PR no longer does

An earlier revision also tried to accelerate stale-flag recovery inside isActivelyReceiving() on Portduino by clearing PREAMBLE_DETECTED and re-reading after 5 ms. That was wrong on two counts (thanks to @GUVWAF for catching it):

  1. The SX1262 does not re-assert PREAMBLE_DETECTED later in the same reception — preamble detection is a one-shot IRQ, after which the chip moves to sync word → header → payload. If the flag was cleared mid-reception (e.g. between end-of-preamble and HEADER_VALID), the recheck would see "no preamble," declare the reception stale, and TX could fire into an in-progress RX.
  2. The 5 ms recheck window was a constant, but preamble time scales with LoRa settings (preambleLength * (2^sf) / bw). LongFast is ~131 ms; LongSlow is ~524 ms. 5 ms was 25–100× too short to distinguish "real preamble, still decoding" from "stale flag."
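
The arithmetic in item 2 can be checked with a short sketch. The helper name preambleTimeMs is hypothetical, and the preset parameters (LongFast as SF11 at 250 kHz, LongSlow as SF12 at 125 kHz, both with 16-symbol preambles) are assumptions chosen because they reproduce the figures quoted above.

```cpp
#include <cstdint>

// Hypothetical helper: preamble duration in milliseconds, using the formula
// quoted above: preambleLength * 2^sf / bw, with bw in kHz.
constexpr double preambleTimeMs(uint16_t preambleLength, uint8_t sf, double bwKhz)
{
    return preambleLength * static_cast<double>(1u << sf) / bwKhz;
}

// Assumed preset parameters (not taken from this PR).
constexpr double kLongFastMs = preambleTimeMs(16, 11, 250.0); // 131.072 ms
constexpr double kLongSlowMs = preambleTimeMs(16, 12, 125.0); // 524.288 ms

// How many times longer the preamble is than the removed 5 ms recheck window.
constexpr double kRecheckMs = 5.0;
constexpr double kLongFastRatio = kLongFastMs / kRecheckMs; // about 26
constexpr double kLongSlowRatio = kLongSlowMs / kRecheckMs; // about 105
```

Any fixed recheck window would have the same problem; a window that scales with the preamble time is what the existing 2 * preambleTimeMsec fallback already provides.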

That code is removed in the current commit. The continuous-receive change alone is sufficient on my hardware — I did not observe any residual busyRx after applying it.

Behavior change

| Platform | Before | After |
| --- | --- | --- |
| Portduino SX126x | startReceiveDutyCycleAuto(16-symbol preamble, 8-symbol RX window) | startReceive(RX_TIMEOUT_INF, ...) |
| ESP32 / nRF52 / RP2040 / STM32WL SX126x | startReceiveDutyCycleAuto(...) | unchanged |

No change to isActivelyReceiving on any platform. No change on battery-powered builds.

Test plan

Tested on Raspberry Pi 5 with SX1262 (E22-900M30S) HAT, full config (4 channels, MQTT, boosted RX gain, GPS, neighbor info, telemetry).

  • Before fix: busyRx on 60-80% of TX attempts under heavy traffic (firmware 2.7.15 and 2.7.20)
  • After fix: 0 busyRx across 10 consecutive 45-second test runs
  • After fix: 0 busyRx in a ~24 h sustained soak with mixed traffic
  • "Ignore false preamble detection" log no longer appears in steady state
  • Packet reception unchanged (NodeInfo / HostMetrics / GPS position / DeviceTelemetry all transmit and receive)
  • Radio init still succeeds (SX126x init result 0)

CLA will be signed before merge.

@github-actions

Contributor

@NearlCrews, Welcome to Meshtastic!

Thanks for opening your first pull request. We really appreciate it.

We discuss work as a team in discord, please join us in the #firmware channel.
There's a big backlog of patches at the moment. If you have time,
please help us with some code review and testing of other PRs!

Welcome to the team 😄

@github-actions github-actions Bot added the first-contribution and bugfix (Pull request that fixes bugs) labels Mar 18, 2026
@NearlCrews NearlCrews force-pushed the fix/portduino-busyrx-false-preamble branch from 58dc30a to afce676 Compare March 18, 2026 18:53
@CLAassistant


CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Copilot AI left a comment

Contributor

Pull request overview

This PR addresses intermittent busyRx transmission blocks on Linux-native (Portduino) builds using SX1262 radios by preventing stale PREAMBLE_DETECTED IRQ flags from falsely indicating an active reception, and by avoiding SX1262 duty-cycle auto-receive mode on Portduino.

Changes:

  • Switch Portduino SX1262 receive mode from duty-cycle auto-receive to continuous receive.
  • In isActivelyReceiving() (Portduino only), clear and re-check PREAMBLE_DETECTED when it appears latched without HEADER_VALID to avoid blocking TX on stale IRQ state.

Comment thread src/mesh/SX126xInterface.cpp Outdated
Comment on lines +384 to +389
constexpr uint32_t STALE_PREAMBLE_RECHECK_MS = 5;
uint16_t irq = lora.getIrqFlags();
if ((irq & RADIOLIB_SX126X_IRQ_PREAMBLE_DETECTED) && !(irq & RADIOLIB_SX126X_IRQ_HEADER_VALID)) {
lora.clearIrqFlags(RADIOLIB_SX126X_IRQ_PREAMBLE_DETECTED);
delay(STALE_PREAMBLE_RECHECK_MS);
irq = lora.getIrqFlags();
@thebentern thebentern requested a review from GUVWAF March 19, 2026 15:54
Comment thread src/mesh/SX126xInterface.cpp Outdated
constexpr uint32_t STALE_PREAMBLE_RECHECK_MS = 5;
uint16_t irq = lora.getIrqFlags();
if ((irq & RADIOLIB_SX126X_IRQ_PREAMBLE_DETECTED) && !(irq & RADIOLIB_SX126X_IRQ_HEADER_VALID)) {
lora.clearIrqFlags(RADIOLIB_SX126X_IRQ_PREAMBLE_DETECTED);

Member

You mention "Real receptions re-assert the flag within a few ms.". This may often be true, but I don't think it's guaranteed. What happens if you clear the IRQ just before the preamble ends and it continues decoding the LoRa header?
Even then, STALE_PREAMBLE_RECHECK_MS can not be a constant, it will depend on preambleTimeMsec, which depends on the LoRa settings. For slow presets, 5ms will not be enough to detect (part of) a preamble.

On Linux-native (Portduino) platforms, the SX1262 radio intermittently
gets stuck in RX state with "Can not send yet, busyRx" errors followed
by "Ignore false preamble detection". This prevents packet transmission
and eventually fills the TX queue.

Root cause: the duty-cycle auto-receive mode periodically runs internal
CAD (channel activity detection) checks, and each CAD cycle can latch a
PREAMBLE_DETECTED IRQ flag in the SX1262 even when no real packet is
arriving. On bare-metal MCUs these stale flags are cleared or overwritten
within a few symbol times before anything reads them, because IRQ-line
polling happens at nanosecond scale. On Linux, GPIO reads through gpiod
take microseconds per access, so getIrqFlags() frequently catches those
stale CAD-induced flags. isActivelyReceiving() then falsely reports the
radio as busy and blocks TX until the existing 2 * preambleTimeMsec
fallback in RadioLibInterface::receiveDetected() times out.

Fix: on Portduino only, use startReceive(RX_TIMEOUT_INF, ...) instead of
startReceiveDutyCycleAuto(). Continuous receive has no sleep/CAD cycles,
so the stale-flag source is eliminated at the root. Power saving from
duty cycling is irrelevant on mains-powered Linux-class devices
(Raspberry Pi, x86 gateways, etc.).

No change to isActivelyReceiving() itself — the existing
receiveDetected() logic (with its 2 * preambleTimeMsec stale-flag
fallback) handles any residual cases correctly on all platforms.

Tested on Raspberry Pi 5 with SX1262/E22-900M30S (PiMesh 1W) across
multiple configs including multi-channel setups with MQTT, boosted RX
gain, and full module config. Before fix: busyRx on 60-80% of TX
attempts under high radio activity, occasional hangs at low activity.
After fix: no spurious busyRx observed; packet flow is stable.

No behavior change on non-Portduino (ESP32 / nRF52 / RP2040 / STM32WL)
builds.
@NearlCrews NearlCrews force-pushed the fix/portduino-busyrx-false-preamble branch from afce676 to 797601f Compare April 21, 2026 16:25
@NearlCrews

Author

@GUVWAF you are right on both counts — thanks for the careful read. I've pushed an update (797601f, rebased onto current master at 68383c8) that drops the isActivelyReceiving() recheck entirely and keeps only the startReceive(RX_TIMEOUT_INF, ...) change on Portduino.

On your in-flight-reception concern: the SX1262 fires PREAMBLE_DETECTED once per reception (when the preamble detector confirms the pattern) and then moves on to sync word / header / payload; it does not re-assert that flag while the same reception continues. So if my recheck cleared the flag during the tail end of a real preamble, HEADER_VALID might not have arrived by the next poll, and the code would follow its own logic faithfully yet wrongly mark the reception stale. That's a silent TX-into-RX collision risk, and it's indefensible.

On the preset-dependent timing: right too — STALE_PREAMBLE_RECHECK_MS as a constant was dimensionally wrong. RadioInterface.cpp already tracks the real preamble duration as preambleTimeMsec = preambleLength * (pow_of_2(sf)) / bw, and it varies from ~131 ms (LongFast) to ~524 ms (LongSlow) with 16-symbol preambles. 5 ms was 25–100× too short to distinguish "preamble decoding in progress" from "stale CAD flag."

Why the remaining change is enough. The root cause is startReceiveDutyCycleAuto's internal CAD producing latched PREAMBLE_DETECTED flags that bare-metal MCUs overwrite within microseconds but Linux gpiod reads catch. Switching Portduino to continuous receive eliminates that CAD source, so the stale flags stop being produced. Any residual stale flags from other sources (boot state, noise) are already handled correctly by the existing 2 * preambleTimeMsec fallback in RadioLibInterface::receiveDetected — that path will log Ignore false preamble detection and release TX. On the hardware I have (RPi 5 + E22-900M30S), I did not observe any busyRx after the continuous-receive change alone, including a ~24 h soak.

@Copilot — the 5 ms vs 10 ms mismatch is moot now, that code is gone.

CLA is still unsigned; I'll sign before merge.

@GUVWAF

GUVWAF commented Apr 21, 2026

Member

I don't see how not using the duty cycle receiving would solve the issue, since this only affects the radio and has nothing to do with handling the IRQ flag. Moreover, with our current preamble length, the startReceiveDutyCycleAuto() function falls back to just startReceive() as the period the radio can sleep is too low, so it's already equivalent to what you propose.

I think the reason it works for you now is that #9895 was merged in the meantime.

@NearlCrews

Author

You're right on both counts. We pass (preambleLength=16, minSymbols=8) at SX126xInterface.cpp:330, so RadioLib 7.6.0 computes sleepSymbols = 16 - 2*8 = 0 (SX126x.cpp:523), sleepPeriod = 0, and trips the early return at SX126x.cpp:546-547. We land in plain startReceive(RX_TIMEOUT_INF, ...) either way, so the Portduino branch was a no-op for our config.
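
The arithmetic above can be sketched as follows. This is an illustrative model, not RadioLib's code: DutyCyclePlan, planDutyCycle, and the minUsefulSleepUs threshold are hypothetical names, and the exact threshold RadioLib applies is not reproduced here.

```cpp
#include <cstdint>

// Hypothetical summary of what a duty-cycle receive request resolves to.
struct DutyCyclePlan {
    uint32_t sleepSymbols;        // symbols the radio could sleep per cycle
    uint32_t sleepPeriodUs;       // the same budget expressed in microseconds
    bool fallsBackToContinuousRx; // true if duty cycling is not worthwhile
};

// The radio must be awake for minSymbols on either side of a sleep, so the
// sleep budget is preambleLength - 2 * minSymbols. If that budget is too
// short (assumed threshold: minUsefulSleepUs), plain continuous RX is used.
constexpr DutyCyclePlan planDutyCycle(uint16_t preambleLength, uint16_t minSymbols,
                                      uint32_t symbolTimeUs, uint32_t minUsefulSleepUs)
{
    if (2u * minSymbols > preambleLength)
        return {0, 0, true}; // no room to sleep at all
    const uint32_t sleepSymbols = preambleLength - 2u * minSymbols;
    const uint32_t sleepPeriodUs = sleepSymbols * symbolTimeUs;
    return {sleepSymbols, sleepPeriodUs, sleepPeriodUs < minUsefulSleepUs};
}
```

For example, planDutyCycle(16, 8, 8192, 1000) gives zero sleep symbols and falls back to continuous receive for any positive threshold, while a 32-symbol preamble with the same arguments would leave 16 symbols of sleep and keep duty cycling.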

Plausible that #9895 explains why my soak stopped reproducing the busyRx. I haven't isolated that. Closing this.

@NearlCrews NearlCrews closed this May 6, 2026
Labels: bugfix (Pull request that fixes bugs), first-contribution
