fix(phone-api): skip manifest scan for node-info-only config requests#10754
Merged
thebentern merged 5 commits intoJun 23, 2026
Merged
Conversation
The recursive filesystem traversal in getFiles() could grow its result
vector without limit, contributing to heap pressure on memory-constrained
devices during client config starts. The implementation also relied on
unbounded strcpy and per-entry String() allocations for dot-suffix
filtering, and leaked the directory handle when the opened path was not
a directory.
Rework the traversal to address all of those at once:
- Add optional maxCount (default 64) and wasLimited out-param. The walk
stops once maxCount entries have been collected and wasLimited is set,
so callers can distinguish a complete listing from a truncated one.
Default arguments preserve backward compatibility for existing callers.
- Extract the recursive walk into a collectFiles() helper in an
anonymous namespace, accumulating into a single caller-owned vector
instead of building and inserting temporary vectors at every recursion
level.
- Replace strcpy with strlcpy so file_name writes are bounded by the
destination buffer.
- Replace String(name).endsWith(".") with a stack-based
pathEndsWithDot() helper to avoid a heap allocation per entry.
- Close the directory handle when the opened path is not a directory,
and close the file in both branches of recursion instead of only the
leaf branch (the previous code leaked the handle when depth was
exhausted).
- Compile-gate the std::bad_alloc reserve probe-down behind
__cpp_exceptions / __EXCEPTIONS so the fallback only compiles where
bad_alloc is actually catchable. No-exception targets (nRF52, RP2040,
STM32, ESP32-C6) skip the probe and rely on the bounded maxCount
alone, which is safe because the default cap is small enough to fit
comfortably in available RAM on every supported platform.
A client can request a node-info-only config response by sending want_config_id with the SPECIAL_NONCE_ONLY_NODES nonce. The manifest walk in handleStartConfig() ran unconditionally even though STATE_SEND_FILEMANIFEST short-circuits past the manifest on that path, so the SPI lock and recursive filesystem traversal were wasted work. On memory-constrained devices this extra heap pressure was enough to trigger allocation aborts and spontaneous reboots before the node-info response could complete. Gate the manifest call behind config_nonce != SPECIAL_NONCE_ONLY_NODES so the node-info-only path no longer touches the filesystem at all, and release any stale manifest storage left over from a previous run. For normal full-config requests, also tighten the scan limits from depth 10 / unbounded to depth 3 / max 64 entries via the new getFiles() parameters. A 10-level recursive listing was never useful for a client file picker and amplified heap pressure for no benefit. Log a warning when the listing was truncated so operators can distinguish a bounded result from a genuinely empty filesystem. A few small cleanups fall out of the above: - Replace raw spiLock->lock()/unlock() with a LockGuard so the lock is released even if getFiles() throws on exception-enabled targets. - Use a releaseFilesManifest() helper (swap-with-empty idiom) in close() and the STATE_SEND_FILEMANIFEST drain path instead of the previous clear()+shrink_to_fit() pair. Equivalent behavior, but the swap idiom is the canonical vector-freeing pattern and centralizing it avoids divergence between call sites. - Use %zu for the size_t log arguments.
Close each child directory entry before recursive descent so bounded manifest scans do not stack file handles on constrained filesystems. Treat null or overlong paths as truncated listings, handle reserve length failures like allocation failures, and keep manifest constants aligned with repository naming.
Contributor
⚡ Try this PR in the Web FlasherWarning This is an automated, unreviewed CI test build. Back up your device configuration Supported boards built by this PR (22)
Build artifacts expire on 2026-07-21. Updated for |
Contributor
There was a problem hiding this comment.
Pull request overview
This PR addresses a reported firmware crash during Android WiFi/TCP handshake recovery by avoiding filesystem manifest enumeration on the node-info-only config path, and by making filesystem manifest enumeration bounded and less allocation-heavy.
Changes:
- Skip filesystem manifest loading for
SPECIAL_NONCE_ONLY_NODESinPhoneAPIand aggressively release any existing manifest buffer on that path. - Add bounded manifest scanning controls (depth + max entries) and log when the scan was limited.
- Extend
getFiles()to support bounded recursive traversal with fewer allocations and safer path copying.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/mesh/PhoneAPI.cpp | Skips manifest scan for node-info-only config requests; bounds normal scans and releases manifest storage more explicitly. |
| src/FSCommon.h | Extends getFiles() API to accept maxCount and an optional wasLimited out-param (with a default max of 64). |
| src/FSCommon.cpp | Reworks getFiles() to use a single accumulator vector, bounded recursion, safer copies, and optional reserve fallback under exceptions. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
8 tasks
Contributor
|
@copilot resolve the merge conflicts in this pull request |
…ip-manifest-scan # Conflicts: # src/FSCommon.cpp
thebentern
added a commit
that referenced
this pull request
Jun 24, 2026
… getFiles (#10778) Forward-port of #10754 and #10757 from master (2.7) into develop, so the develop->master 2.8 promotion (#10777) doesn't drop them. #10754: PhoneAPI no longer walks the filesystem to build the file manifest on node-info-only config requests (SPECIAL_NONCE_ONLY_NODES), which never consume it. getFiles() is now bounded (default 64 entries, depth 3) via collectFiles(), takes an optional wasLimited out-param, and reserves capacity with a bad_alloc/ length_error fallback. The manifest vector is freed via swap (releaseFilesManifest). #10757: getFiles()/collectFiles() now guard against empty file names returned by the Adafruit LittleFS nRF52 glue (issue 4395). Ported by hand rather than cherry-picked: master had reflowed FSCommon.cpp to a different brace style (every line conflicted), #10754 already subsumes #10757, and develop carries a MESHTASTIC_EXCLUDE_FILES_MANIFEST path (nRF54L15) that master lacks. The exclude path is preserved and now also short-circuits + frees the manifest. Verified: native Docker suite 448/448, clang-format clean.
raghumad
pushed a commit
to raghumad/mezulla-firmware
that referenced
this pull request
Jun 25, 2026
… getFiles (meshtastic#10778) Forward-port of meshtastic#10754 and meshtastic#10757 from master (2.7) into develop, so the develop->master 2.8 promotion (meshtastic#10777) doesn't drop them. meshtastic#10754: PhoneAPI no longer walks the filesystem to build the file manifest on node-info-only config requests (SPECIAL_NONCE_ONLY_NODES), which never consume it. getFiles() is now bounded (default 64 entries, depth 3) via collectFiles(), takes an optional wasLimited out-param, and reserves capacity with a bad_alloc/ length_error fallback. The manifest vector is freed via swap (releaseFilesManifest). meshtastic#10757: getFiles()/collectFiles() now guard against empty file names returned by the Adafruit LittleFS nRF52 glue (issue 4395). Ported by hand rather than cherry-picked: master had reflowed FSCommon.cpp to a different brace style (every line conflicted), meshtastic#10754 already subsumes meshtastic#10757, and develop carries a MESHTASTIC_EXCLUDE_FILES_MANIFEST path (nRF54L15) that master lacks. The exclude path is preserved and now also short-circuits + frees the manifest. Verified: native Docker suite 448/448, clang-format clean.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Overview
This change fixes a firmware crash I hit while testing the Android WiFi/TCP handshake recovery work that landed in meshtastic/Meshtastic-Android#5856.
During Android Stage 2, the app sends a node-info-only config request. On my tbeam running
v2.7.25.104df5f, the firmware logged that the client only wanted node info, then crashed/rebooted before completing the node-info response.The main fix is to skip filesystem manifest loading entirely for the
SPECIAL_NONCE_ONLY_NODESpath. That path does not use the file manifest, so avoiding the SPI lock and filesystem walk removes unnecessary memory pressure during node-info-only config starts.This also hardens filesystem manifest enumeration by making it bounded and less allocation-heavy.
Key Changes
PhoneAPI
SPECIAL_NONCE_ONLY_NODES.364Filesystem Enumeration
getFiles()with optionalmaxCountandwasLimitedparameters.getFiles()to at most 64 entries.strlcpyinstead ofstrcpyfor file names.Stringallocation for dot-suffix filtering.std::bad_alloc/std::length_errorreserve fallback so no-exception targets still build.Validation
Firmware built from
86f062657c95a988492d35979e972ed8d3a4a350has been running well for about two days across multiple tbeam, rak4631, and T1000-E devices.With that firmware, Android can connect without triggering the crash/reboot loop I was seeing before.
The latest branch includes a few additional review-suggested filesystem traversal hardening changes, which I am loading up for more testing now.
This was tested together with the latest Android-side WiFi/TCP recovery changes in meshtastic/Meshtastic-Android#5856.
Review Notes
This patch deserves careful maintainer review. I have done the best I can to review the C/C++ side with AI tooling and device testing, but I do not have much experience with C yet.
The main behavior change to call out is that
getFiles()now limits results to 64 entries by default. Existing callers still compile because the new parameters have defaults, but callers that need a larger or complete listing should pass a largermaxCountand can usewasLimitedto detect truncation.Breaking Changes / Migration
getFiles()now limits results by default to at most 64 entries.Existing callers continue to compile due to default arguments, but they may now receive a truncated listing. Callers that require more entries should pass a larger
maxCountand optionally checkwasLimited.Attestations