feat: include special directories that won't be indexed#11163

Merged
jvillafanez merged 4 commits into master from search_prevent_file_indexing on Apr 1, 2025

@jvillafanez
Member

Description

Skip indexing some special files.

  • The . directory (the space root) was indexed with the username as its name. For the admin user, searching for "admin" would return results even though the files in the personal space had nothing to do with that term.
  • The ./.space directory contains space-related data (the space's image, readme file...) that was being indexed. Users are unlikely to search for this type of content.
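
The two exclusions above can be sketched as a small path filter. This is a minimal illustration, not the code from this PR: the function name skipIndexing is hypothetical, and it assumes paths are given relative to the space root in the "./..." form used in the description.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// skipIndexing is a hypothetical helper (not the function added by this PR).
// It reports whether a path, given relative to the space root, should be
// left out of the search index.
func skipIndexing(rel string) bool {
	clean := path.Clean(rel) // "./.space/readme.md" becomes ".space/readme.md"

	// The space root itself: indexing it would expose the space name
	// (for example the username) as a search hit.
	if clean == "." {
		return true
	}

	// The top-level .space directory and everything below it.
	// Other hidden folders (".hidden", ".spaces", ...) are still indexed.
	return clean == ".space" || strings.HasPrefix(clean, ".space/")
}

func main() {
	samples := []string{
		".",
		"./.space",
		"./.space/readme.md",
		"./.hidden/notes.txt",
		"./docs/report.txt",
	}
	for _, p := range samples {
		fmt.Printf("%-22s skip=%v\n", p, skipIndexing(p))
	}
}
```

Matching on the cleaned path with an exact name plus a "/"-terminated prefix keeps look-alike names such as .spaces from being excluded by accident.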

Related Issue

#11028

Motivation and Context

Users may assume something is broken when unexpected results are returned, even though those entries were "correctly" indexed. Search is expected to cover the contents of the space, not the space itself or its related data.

How Has This Been Tested?

Checked manually with the bleve CLI. Those files are no longer indexed and do not appear in the search results.

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Technical debt
  • Tests only (no source changes)

Checklist:

  • Code changes
  • Unit tests added
  • Acceptance tests added
  • Documentation ticket raised:

Notes

This PR prevents those files from being indexed from now on. On upgraded installations, however, the files are expected to be in the index already, because this code wasn't available when they were indexed.
For now, oCIS doesn't provide a way to remove those files from the index, so the entries need to be removed manually.

@jvillafanez jvillafanez self-assigned this Mar 25, 2025
@update-docs

update-docs bot commented Mar 25, 2025

Thanks for opening this pull request! The maintainers of this repository would appreciate it if you would create a changelog item based on your changes.

@jvillafanez
Member Author

Current test failures are "expected".

  • Some tests fail because the name of the space itself is no longer indexed. Those tests assume that it is, which no longer holds with this PR.
  • One test searches for the ".space" folder (as a hidden folder). Hidden folders are still indexed, but ".space" is treated as special because it contains files that the user did not explicitly upload, so it is not indexed with this PR.

Before adjusting the tests, I think we need to agree on whether we're fine with this solution.

@jvillafanez
Member Author

While trying to add new unit tests, I found https://github.com/owncloud/reva/blob/main/pkg/storage/utils/walker/walker.go#L81-L82

I suggest swapping those lines.
This PR wants to skip the .space directory, so it doesn't make sense to list its contents when they're going to be skipped anyway.
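
The ordering concern can be illustrated with Go's standard library rather than the reva walker, whose code is not reproduced here. In filepath.WalkDir the callback is invoked for a directory before that directory is read, so returning fs.SkipDir for .space avoids listing its contents at all. The helper names indexablePaths and runDemo below are hypothetical, and the sample tree is made up for the demonstration.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// indexablePaths walks root and returns the paths (relative to root) that a
// hypothetical indexer would receive. The top-level .space directory is
// pruned before its contents are ever listed.
func indexablePaths(root string) ([]string, error) {
	var out []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		if rel == "." {
			return nil // the space root is not indexed, but its children are
		}
		if d.IsDir() && rel == ".space" {
			return fs.SkipDir // decided before the directory is read
		}
		out = append(out, filepath.ToSlash(rel))
		return nil
	})
	return out, err
}

// runDemo builds a throwaway sample tree, walks it, and cleans up afterwards.
func runDemo() ([]string, error) {
	root, err := os.MkdirTemp("", "space-demo-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	files := []string{
		".space/readme.md",
		".space/image.png",
		".hidden/notes.txt",
		"docs/report.txt",
	}
	for _, f := range files {
		full := filepath.Join(root, filepath.FromSlash(f))
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(full, []byte("demo"), 0o644); err != nil {
			return nil, err
		}
	}
	return indexablePaths(root)
}

func main() {
	paths, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

Neither .space nor anything beneath it shows up in the printed list, while the other hidden folder and the regular folder do.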

@jvillafanez
Member Author

@nirajacharya2 could you adjust the failing tests? #11163 (comment) contains the reasons why the tests are failing. It's probably easier if you add the changes in this branch so we have the tests passing here.

> I suggest swapping those lines.
> This PR wants to skip the .space directory, so it doesn't make sense to list its contents when they're going to be skipped.

I'll leave it for a different PR because those are changes in reva, so we'd need to update the reva version (once those changes are implemented) and then adjust the test.

@nirajacharya2 nirajacharya2 force-pushed the search_prevent_file_indexing branch from 61c4147 to 40ea697 on March 31, 2025 05:54
@nirajacharya2 nirajacharya2 force-pushed the search_prevent_file_indexing branch from 40ea697 to e78caf9 on March 31, 2025 05:55

@jvillafanez jvillafanez merged commit b29dc77 into master Apr 1, 2025
4 checks passed
@jvillafanez jvillafanez deleted the search_prevent_file_indexing branch April 1, 2025 13:32
ownclouders pushed a commit that referenced this pull request Apr 1, 2025
feat: include special directories that won't be indexed
ownclouders pushed a commit that referenced this pull request Apr 2, 2025
feat: include special directories that won't be indexed