Is your feature request related to a problem? Please describe.
All Decidim installations include a feature for downloading open data in CSV format, so that researchers without advanced technical skills can study the platform's public content.
Because new features are added constantly, some data is not yet published in this file. Another shortcoming is the lack of documentation: nothing explains what each file corresponds to or what the headers (columns) of the CSVs mean.
Moreover, data is provided with one sub-column per locale. If the platform has 4 languages, each translatable field gets 4 columns, and only one of them is filled: the one for the language in which the object was written (for example, in proposals you have title/ca, title/eng or body/ca, body/eng).
Currently, the Open Data file contains:
- proposal_comments
- proposals
- meeting_comments
- meetings
- projects
- results
- result_comments
This is closely related to #13248: the two issues are similar, but the focus is different.
Metadecidim proposals
Part of these improvements comes from: https://meta.decidim.org/processes/roadmap/f/122/proposals/17207
Describe the solution you'd like
Add new fields to the Open Data file, so it’s updated with all the participation data that the platform generates.
Additionally, these files and fields will be documented so that researchers can understand them when downloaded, without having to study the platform in depth. Concretely, we will do this:
- Add a README file to the export file. This text file should define all the fields that are downloaded.
- Add to the Open Data file:
  - Debates and debates_comments
  - Public survey answers (if we allow answers to be published)
  - Metrics and statistics:
    - Taxonomy metrics (how many resources are linked to each taxonomy)
    - Statistics and metrics of the organization (metrics shown in the admin dashboard)
- If different languages are enabled, all the data will be in my locale's language, avoiding separate columns like title/ca and title/eng and instead having a single column, title. However, we need to show the language in which the participant has answered.
  - In components: only one column should be shown, with the participant's text input.
  - In spaces: only download the information in my locale's language.
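To illustrate the single-column behaviour described for components, here is a minimal Python sketch of the transformation a researcher currently has to do by hand. It is not Decidim code: the function name `collapse_locale_columns`, the `<field>_locale` output column, and the sample rows are assumptions made for this example, and the `field/locale` header pattern is taken from the title/ca, title/eng example above.

```python
import csv
import io


def collapse_locale_columns(rows, fields, sep="/"):
    """Collapse `<field>/<locale>` columns into `<field>` plus `<field>_locale`.

    Hypothetical helper: the `<field>_locale` column is one possible way to
    keep track of the language in which the participant wrote the content.
    """
    collapsed = []
    for row in rows:
        out = {}
        for key, value in row.items():
            base, _, locale = key.partition(sep)
            if base in fields and locale:
                # Keep the first non-empty translation found for this field.
                if value and base not in out:
                    out[base] = value
                    out[f"{base}_locale"] = locale
            else:
                # Columns that are not per-locale are copied unchanged.
                out[key] = value
        for field in fields:
            # Rows with no filled translation still get the columns, left empty.
            out.setdefault(field, "")
            out.setdefault(f"{field}_locale", "")
        collapsed.append(out)
    return collapsed


# Made-up sample mimicking the current export layout.
sample = (
    "id,title/ca,title/eng,body/ca,body/eng\n"
    "1,,Bike lanes,,More bike lanes please\n"
    "2,Més arbres,,Volem més arbres,\n"
)
rows = list(csv.DictReader(io.StringIO(sample)))
result = collapse_locale_columns(rows, fields=("title", "body"))
for row in result:
    print(row)
```

Each output row then carries one title and one body value, with the source language recorded next to it, which is the shape this proposal asks the export to provide directly.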
Describe alternatives you've considered
We could add other formats to the ZIP file (such as JSON), but we prefer to keep it easy for researchers without advanced computer science proficiency to work with the information in these files. For people with programming skills and advanced usage needs, we already have the GraphQL API (/api).
Additional context
As mentioned in the description, this is complementary to #13248.
Could this issue impact users' private data?
Public survey answers could have an impact, as some answers may contain personal data.
Acceptance criteria
When I download the Open Data file
Then, I get all the CSV files and a README file with the definition of all the concepts downloaded, so I can easily understand the data. -> Add README with explanation in Open Data zip #13435
When I open the file
Then I get a CSV file for:
When I download the Open Data file
Then I should see only one column for each multi-language field (like title, body, location, etc.)