Fix importing attachments within processes/assemblies#16202
Fix importing attachments within processes/assemblies#16202
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review infoConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughReplaces direct URI.open downloads in assemblies and participatory process importers with uploader remote_url assignment and set_content_type_and_size calls, wrapped in network-error rescue that logs attachment warnings and skips failed remote files; tests added for successful remote PDF/JPEG imports and edge cases. Changes
Sequence Diagram(s)sequenceDiagram
participant Importer
participant HTTP as Remote HTTP
participant Uploader
participant DB as Attachment Model
Importer->>HTTP: HEAD /file_url (fetch_content_type)
alt HEAD returns Content-Type
HTTP-->>Importer: Content-Type (e.g., application/pdf)
Importer->>Uploader: set remote_url = file_url
alt remote_url assignment succeeds
Uploader-->>Importer: set_content_type_and_size (content_type, size)
Importer->>DB: create/save Attachment (with content_type, size, weight)
DB-->>Importer: saved
else remote_url assignment fails (network error)
Uploader-->>Importer: raises network error
Importer-->>Importer: log attachment_error warning, skip file
end
else HEAD fails or no Content-Type
HTTP-->>Importer: error / nil
Importer->>Uploader: set remote_url = file_url (attempt)
Uploader-->>Importer: may succeed with unknown content_type
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (5)
decidim-assemblies/app/serializers/decidim/assemblies/assembly_importer.rb (2)
199-214: Add timeouts to prevent indefinite blocking.Same concern as flagged on
ParticipatoryProcessImporter#fetch_content_type— theNet::HTTPconnection here lacks explicitopen_timeoutandread_timeout, risking indefinite hangs on unresponsive servers.🛡️ Proposed fix
uri = URI.parse(url) http_connection = Net::HTTP.new(uri.host, uri.port) http_connection.use_ssl = true if uri.scheme == "https" + http_connection.open_timeout = 5 + http_connection.read_timeout = 5 http_connection.start do |http|🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@decidim-assemblies/app/serializers/decidim/assemblies/assembly_importer.rb` around lines 199 - 214, The fetch_content_type method currently creates a Net::HTTP instance without timeouts which can hang; update the Net::HTTP object (in Assemblies::AssemblyImporter#fetch_content_type) to set explicit open_timeout and read_timeout (e.g. open_timeout = 5, read_timeout = 10) before starting the connection, keep the existing use_ssl logic, and preserve the existing response handling and rescue behavior so unresponsive hosts time out instead of blocking indefinitely.
70-108: Same double-HEAD-request concern as inParticipatoryProcessImporter.The
remote_file_error(url)call at line 77 andfetch_content_type(url)at line 93 each issue a separate HTTP HEAD request to the same URL. See the recommendation on the parallelParticipatoryProcessImporterfor a merged approach that avoids the redundant round-trip.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@decidim-assemblies/app/serializers/decidim/assemblies/assembly_importer.rb` around lines 70 - 108, The importer makes two separate HEAD requests for the same URL (remote_file_error and fetch_content_type) causing redundant network calls; change import_folders_and_attachments to perform a single HEAD fetch (e.g. via a new helper fetch_remote_file_head or by updating remote_file_error to return both error and headers) and reuse its result: call that helper once per url, check the returned error and push a warning if present, and pass the retrieved content_type/header info into Attachment.new (content_type) instead of calling fetch_content_type again; adjust error handling around the single HEAD call to preserve the existing rescue behavior used when assigning attached_uploader(:file).remote_url.decidim-participatory_processes/app/serializers/decidim/participatory_processes/participatory_process_importer.rb (3)
227-241: Substantial code duplication withAssemblyImporter.
fetch_content_type,format_error,remote_file_error,attachment_title, and the bulk ofimport_folders_and_attachmentsare near-identical copies between this file anddecidim-assemblies/app/serializers/decidim/assemblies/assembly_importer.rb. This makes future maintenance error-prone — a fix in one importer can easily be forgotten in the other.Consider extracting the shared logic into a common module/concern (e.g.,
Decidim::Importers::RemoteAttachmentHandling) that both importers include. This could be done as a follow-up.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@decidim-participatory_processes/app/serializers/decidim/participatory_processes/participatory_process_importer.rb` around lines 227 - 241, The two importer classes (Decidim::ParticipatoryProcesses::ParticipatoryProcessImporter and Decidim::Assemblies::AssemblyImporter) duplicate methods like fetch_content_type, format_error, remote_file_error, attachment_title and large parts of import_folders_and_attachments; extract these shared behaviors into a single concern/module (suggested name: Decidim::Importers::RemoteAttachmentHandling) that implements those methods and any helper logic, then have both importers include that module and delegate to its methods (replace local method definitions in ParticipatoryProcessImporter and AssemblyImporter with includes/calls to the new module) so future fixes are made in one place.
92-130: Double HTTP HEAD request per attachment — consider mergingremote_file_errorandfetch_content_type.For each attachment,
remote_file_error(url)(line 99) issues an HTTP HEAD request, and thenfetch_content_type(url)(line 115) issues a second HEAD request to the same URL. This doubles the network round-trips for every imported attachment.Consider refactoring
remote_file_errorto return both the error and the content type from a single HEAD request, e.g., returning a struct/hash like{ error: nil, content_type: "application/pdf" }.♻️ Sketch of a merged approach
- error = remote_file_error(url) - if error.present? + file_info = remote_file_info(url) + if file_info[:error].present? `@warnings` << I18n.t( "decidim.participatory_processes.admin.imports.attachment_error", title: attachment_title(file), - error: + error: file_info[:error] ) next end Decidim.traceability.perform_action!("create", Attachment, `@user`) do attachment = Attachment.new( title: file["title"], description: file["description"], attached_to: `@imported_process`, weight: file["weight"], - content_type: fetch_content_type(url) + content_type: file_info[:content_type] )Where
remote_file_infodoes a single HEAD and returns both pieces of data.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@decidim-participatory_processes/app/serializers/decidim/participatory_processes/participatory_process_importer.rb` around lines 92 - 130, The import currently calls remote_file_error(url) and then fetch_content_type(url), causing two HTTP HEADs per attachment; change to perform a single HEAD by adding a combined helper (e.g., remote_file_info or remote_head_for_attachment) that returns both error and content_type (e.g., { error: ..., content_type: ... }) and update import_folders_and_attachments to call that helper once, use the returned :error to push warnings (preserving format_error handling) and the returned :content_type when building the Attachment (replace fetch_content_type(url) with the returned value); keep existing rescue logic around assigning remote_url and the attachment creation flow.
227-241: Missing timeouts on the HTTP connection — risk of indefinite blocking.
Net::HTTP.newwithout explicitopen_timeout/read_timeoutuses Ruby's default (which is effectively no timeout foropen_timeoutin older Rubies, and 60s forread_timeout). If the remote server is slow or unresponsive, this can block the import thread for a long time or indefinitely.The same concern applies to
remote_file_error(lines 175-176), but since that's pre-existing code, flagging it here for the new method.🛡️ Proposed fix
uri = URI.parse(url) http_connection = Net::HTTP.new(uri.host, uri.port) http_connection.use_ssl = true if uri.scheme == "https" + http_connection.open_timeout = 5 + http_connection.read_timeout = 5 http_connection.start do |http|🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@decidim-participatory_processes/app/serializers/decidim/participatory_processes/participatory_process_importer.rb` around lines 227 - 241, The fetch_content_type method creates a Net::HTTP connection without timeouts causing potential indefinite blocking; update fetch_content_type to set explicit open_timeout and read_timeout on the Net::HTTP instance created in URI.parse (e.g., when calling Net::HTTP.new(uri.host, uri.port)) and pass reasonable values (for example a few seconds) before calling http.start and http.head; ensure the same pattern (explicit open_timeout/read_timeout) is applied to the earlier remote_file_error HTTP usage to avoid the same blocking issue.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@decidim-assemblies/app/serializers/decidim/assemblies/assembly_importer.rb`:
- Around line 199-214: The fetch_content_type method currently creates a
Net::HTTP instance without timeouts which can hang; update the Net::HTTP object
(in Assemblies::AssemblyImporter#fetch_content_type) to set explicit
open_timeout and read_timeout (e.g. open_timeout = 5, read_timeout = 10) before
starting the connection, keep the existing use_ssl logic, and preserve the
existing response handling and rescue behavior so unresponsive hosts time out
instead of blocking indefinitely.
- Around line 70-108: The importer makes two separate HEAD requests for the same
URL (remote_file_error and fetch_content_type) causing redundant network calls;
change import_folders_and_attachments to perform a single HEAD fetch (e.g. via a
new helper fetch_remote_file_head or by updating remote_file_error to return
both error and headers) and reuse its result: call that helper once per url,
check the returned error and push a warning if present, and pass the retrieved
content_type/header info into Attachment.new (content_type) instead of calling
fetch_content_type again; adjust error handling around the single HEAD call to
preserve the existing rescue behavior used when assigning
attached_uploader(:file).remote_url.
In
`@decidim-participatory_processes/app/serializers/decidim/participatory_processes/participatory_process_importer.rb`:
- Around line 227-241: The two importer classes
(Decidim::ParticipatoryProcesses::ParticipatoryProcessImporter and
Decidim::Assemblies::AssemblyImporter) duplicate methods like
fetch_content_type, format_error, remote_file_error, attachment_title and large
parts of import_folders_and_attachments; extract these shared behaviors into a
single concern/module (suggested name:
Decidim::Importers::RemoteAttachmentHandling) that implements those methods and
any helper logic, then have both importers include that module and delegate to
its methods (replace local method definitions in ParticipatoryProcessImporter
and AssemblyImporter with includes/calls to the new module) so future fixes are
made in one place.
- Around line 92-130: The import currently calls remote_file_error(url) and then
fetch_content_type(url), causing two HTTP HEADs per attachment; change to
perform a single HEAD by adding a combined helper (e.g., remote_file_info or
remote_head_for_attachment) that returns both error and content_type (e.g., {
error: ..., content_type: ... }) and update import_folders_and_attachments to
call that helper once, use the returned :error to push warnings (preserving
format_error handling) and the returned :content_type when building the
Attachment (replace fetch_content_type(url) with the returned value); keep
existing rescue logic around assigning remote_url and the attachment creation
flow.
- Around line 227-241: The fetch_content_type method creates a Net::HTTP
connection without timeouts causing potential indefinite blocking; update
fetch_content_type to set explicit open_timeout and read_timeout on the
Net::HTTP instance created in URI.parse (e.g., when calling
Net::HTTP.new(uri.host, uri.port)) and pass reasonable values (for example a few
seconds) before calling http.start and http.head; ensure the same pattern
(explicit open_timeout/read_timeout) is applied to the earlier remote_file_error
HTTP usage to avoid the same blocking issue.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
decidim-assemblies/app/serializers/decidim/assemblies/assembly_importer.rbdecidim-assemblies/spec/serializers/decidim/assemblies/assembly_importer_spec.rbdecidim-participatory_processes/app/serializers/decidim/participatory_processes/participatory_process_importer.rb
b748147
acd2bd2 to
b748147
Compare
|
This is ready for the review @alecslupu |
alecslupu
left a comment
There was a problem hiding this comment.
I would have code change, that would reduce the spaghetti code. Tested on my local, and seems to be working fine.
decidim-assemblies/app/serializers/decidim/assemblies/assembly_importer.rb
Outdated
Show resolved
Hide resolved
decidim-assemblies/app/serializers/decidim/assemblies/assembly_importer.rb
Show resolved
Hide resolved
decidim-assemblies/app/serializers/decidim/assemblies/assembly_importer.rb
Outdated
Show resolved
Hide resolved
..._processes/app/serializers/decidim/participatory_processes/participatory_process_importer.rb
Show resolved
Hide resolved
f01a6be
|
This is ready for another round @alecslupu |
🎩 What? Why?
Yet another fix related to the export/import feature in spaces.
This time the problem is when the attachments are actually available (i.e. a status code 200). There's an error with the TMP file that we have as content type, when
ActiveStorageactually needs a Content Type for these files to be imported.📌 Related Issues
Testing
Stacktrace
📷 Screenshots
Summary by CodeRabbit
Bug Fixes
Tests