Replies: 8 comments 10 replies
-
|
Could you clarify what exactly about the Analytics dashboard needs to be hidden, and what specific GDPR obligation you believe it triggers? Looking at the arguments listed:
Data minimization doesn't apply here. The data already exists. Open WebUI already logs it. From a legal standpoint, hiding the Analytics dashboard doesn't change anything. You are already processing the data, and are already under GDPR as-is. The dashboard itself doesn't change that. And "Data minimization" doesn't fit for a couple of reasons:
Least privilege is also not the right framing here. It is well established in the docs and our security guidelines that an admin can do absolutely everything. If you don't want the admin to see chats, turn off the env var that allows access to the user's chats. If you don't want the admin to be able to export the database, there's an env var for that too. Same for the BYPASS ADMIN ACCESS CONTROL env var for models, prompts, and knowledge bases. As an admin, you can already tinker with models, connections, and RAG settings, export all configuration options, edit a user's password, and see the user's email, profile picture, OAuth, name, and their full profile.
To clarify what the Analytics dashboard actually shows: you do not get access to chat content, contrary to what your issue claims. And you don't get access to per-chat metadata either. It is aggregated metadata over all chats of a user, and separately, data on how much a model is being used across the instance. So there's no single-chat metadata and no chat content, and for chat content access specifically, there's already an env var to disable that.
Defense in depth
It's not clear how adding a toggle to disable the Analytics dashboard, where nothing is configurable, lowers misconfiguration risk. And even if it did, defense in depth is about layered security controls, not about hiding a read-only analytics view.
As a side note: 2 out of the 3 reasons listed aren't actually GDPR-related arguments; least privilege and defense in depth are general security principles.
-
|
I'm genuinely curious whether there's a concrete reason the dashboard needs to be hidden, because if there isn't, it would be preferable to avoid adding yet another configuration option for something that isn't required for either security or legal compliance. The data is there either way.
-
This is already the case if you disabled https://docs.openwebui.com/reference/env-configuration#enable_admin_chat_access (which I assume you did?). If this is disabled, no chats, no chat titles, and no message excerpts are visible.
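To check quickly which admin-related switches a deployment has set, something like the following could be run in the container's environment. It is only a sketch: ENABLE_ADMIN_CHAT_ACCESS is the variable linked above, while ENABLE_ADMIN_EXPORT and BYPASS_ADMIN_ACCESS_CONTROL are written from memory of the env-configuration docs and should be verified there. The helper name `audit_admin_flags` is made up for this example, and treating only "true" (case-insensitive) as enabled is an assumption about how the values are parsed.

```python
import os
from typing import Mapping

# ENABLE_ADMIN_CHAT_ACCESS is the variable linked in the comment above.
# The other two names are recalled from the docs; verify them before relying on this.
ADMIN_FLAGS = (
    "ENABLE_ADMIN_CHAT_ACCESS",
    "ENABLE_ADMIN_EXPORT",
    "BYPASS_ADMIN_ACCESS_CONTROL",
)


def audit_admin_flags(env: Mapping[str, str]) -> dict[str, str]:
    """Report each flag as 'enabled', 'disabled', or 'unset' (default applies)."""
    report = {}
    for name in ADMIN_FLAGS:
        raw = env.get(name)
        if raw is None:
            report[name] = "unset"
        elif raw.strip().lower() == "true":  # assumption: only "true" counts as enabled
            report[name] = "enabled"
        else:
            report[name] = "disabled"
    return report


if __name__ == "__main__":
    for name, state in audit_admin_flags(os.environ).items():
        print(f"{name}: {state}")
```

A flag reported as "unset" falls back to whatever default the running version ships with, so it is worth setting the value explicitly if it matters for a compliance review.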
-
|
I have the use case of a German works council (Betriebsrat) asking for the option to disable per-user analytics data as a condition for approving corporate usage. Works councils in German companies have legal means to block the use of tooling, and they generally oppose any possibility of tracking per-user data.
-
|
My use case adds to the concerns expressed above. Might be a Germany (or EU) thing as a GDPR-sensitive environment. While I understand that the underlying data exists regardless of visibility, there's a key difference between data being accessible on request (e.g., via API/query) and being permanently displayed in the UI. Unfortunately the
Two straightforward solutions would help:
|
-
|
Hi, I know about works councils, and I know that it may sometimes be difficult to push things through. But I am genuinely confused about how the current metrics, as they are, could present an issue for "monitoring employees". As we have already established, you do not see what model an employee used. You do not see what the employee used the model for. You do not see when the employee used which model (though all of these things are technically easily retrievable via simple database queries).
The only thing you see is how many messages an employee sent and how many tokens exist in their account right now. And this number isn't even accurate: if an employee deletes their chats after using the AI, the statistics get cascade deleted. The numbers you see in the admin panel are based on whatever is in the database at the time of viewing. If the employee deletes some or all of their chats, the statistics change accordingly. This is worth emphasizing: no monitoring system gives the monitored party a delete button. The fact that users can unilaterally make these numbers disappear means this is definitionally not a surveillance tool.
It's like viewing the storage an employee consumes in their email inbox. Each employee has a fixed amount of storage. Does the amount of storage consumed equate to productivity or usage? Not at all. If the employee deletes older emails, storage goes down. If they never delete, it goes up. But storage is not a marker of productivity or behavior, nor is it a meaningful measure of how someone uses the service. Someone consuming 50MB might send hundreds of emails a day. You don't anonymize the username in the email storage view either. You know exactly that Hans consumes 23 gigabytes for his email inbox. There's no good reason to anonymize that - and nobody asks for it. So why anonymize the number of messages and tokens that currently exist in a user's account?
Because that's all the analytics show: the current number of messages and tokens across a user's chats at the time of the query. It doesn't equate to productivity. It doesn't equate to behavior. And if it did, then you would also have to anonymize the amount of storage consumed by an employee in their email inbox.
The threshold under German labor law (§87(1) Nr. 6 BetrVG) is whether a technical system is designed to monitor employee behavior or performance. A volatile, user-deletable message count can justifiably be treated as not being such a system. The values displayed by the system can change any second, and because the employee can actively change the number shown, you cannot infer any behavior or performance metrics from it (nor can you from consumed storage in the inbox). And in terms of GDPR, no additional obligation arises from displaying data that is already collected and processed for the purpose of providing the service (as we have already established).
To add to this, I've verified the code path: when a user deletes a chat, the chat message table (which is what the analytics are based on) gets deleted first, before the chat record itself is even removed. And even if that operation were to fail, the chat has a CASCADE deletion with a foreign key constraint as a fallback. The system is explicitly designed to ensure analytics data does not survive chat deletion. When a chat is deleted (via Chats.delete_chat_by_id at backend/open_webui/models/chats.py:1494), the code does:
Even if the explicit deletion were removed, the foreign key constraint on chat_message.chat_id at backend/open_webui/models/chat_messages.py:54-56 is defined with ondelete="CASCADE". This means the database itself would cascade-delete all chat_message rows when the parent chat row is removed.
If I am missing anything here, since I am from Austria, not Germany, which might have slightly different laws (though extremely similar in this area), then please let me know.
But I cannot see how this would be legally troublesome, neither in terms of GDPR nor workers' rights.
Edit: As for the legal justification needed for such metrics: admins need access to per-user storage consumption to plan storage purchases, backup servers, and hard drives, and to adjust the storage quota of individual users (if Hans reaches 90/100GB, you might want to increase his storage specifically). These are all legitimate administrator tasks that require seeing the total and per-user storage consumption. Logically, shouldn't the same hold for the message and token count? Sure, it is volatile, because users delete their chats or use temporary chats exclusively, or because admins force them to use only temporary chats (a feature that exists). But why shouldn't an admin be able to see it? If you use custom filters to apply rate limits on how many tokens a user can use or how many messages they can send (an upper limit, like the email storage limit), then you need to see how close specific users are to that limit in order to adjust the rate limits for those power users. There are many ways admins may justify having access to this data under §87.
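The deletion behavior described above can be illustrated with a small standalone sketch. It uses Python's built-in sqlite3 with simplified, made-up table layouts (`chat`, `chat_message`, a `tokens` column) rather than Open WebUI's actual models, and the `usage_by_user` helper is hypothetical. It only demonstrates the general mechanism: per-user counts are computed from whatever rows exist at query time, and an ON DELETE CASCADE foreign key removes the message rows together with the chat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and therefore cascades) when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE chat (id TEXT PRIMARY KEY, user_id TEXT NOT NULL);
    CREATE TABLE chat_message (
        id INTEGER PRIMARY KEY,
        chat_id TEXT NOT NULL REFERENCES chat(id) ON DELETE CASCADE,
        tokens INTEGER NOT NULL
    );
    """
)


def usage_by_user(db: sqlite3.Connection) -> dict:
    """Aggregate message and token counts per user from the rows that exist right now."""
    rows = db.execute(
        """
        SELECT c.user_id, COUNT(m.id), COALESCE(SUM(m.tokens), 0)
        FROM chat c LEFT JOIN chat_message m ON m.chat_id = c.id
        GROUP BY c.user_id
        """
    )
    return {user: (messages, tokens) for user, messages, tokens in rows}


conn.executemany(
    "INSERT INTO chat (id, user_id) VALUES (?, ?)",
    [("c1", "hans"), ("c2", "hans"), ("c3", "erika")],
)
conn.executemany(
    "INSERT INTO chat_message (chat_id, tokens) VALUES (?, ?)",
    [("c1", 100), ("c1", 250), ("c2", 50), ("c3", 75)],
)

before = usage_by_user(conn)

# Deleting the parent chat row cascades to its chat_message rows.
conn.execute("DELETE FROM chat WHERE id = ?", ("c1",))

after = usage_by_user(conn)
orphans = conn.execute(
    "SELECT COUNT(*) FROM chat_message WHERE chat_id = ?", ("c1",)
).fetchone()[0]

print(before)
print(after)
print(orphans)
```

In this toy setup, deleting one of Hans's chats lowers his totals while Erika's stay unchanged, and no message rows for the deleted chat remain.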
-
|
@bk-lg @fabianh-rz @sct-hm Vouching for #21651 under the PR would help if you want it merged.
-
-
Check Existing Issues
Verify Feature Scope
Problem Description
Problem / Background
After upgrading to newer OpenWebUI versions (e.g. v0.8.3), the new Analytics (Analyse) feature provides a dashboard that shows all models and associated chats in one place.
In environments with multiple users and/or regulated requirements (e.g. EU), this is problematic from a GDPR / privacy-by-default perspective. Even if the page is intended to be admin-only, many organizations:
Actual behavior
Analytics dashboard is present and (from the UI) appears to aggregate and display model usage and related chats across the instance, which can conflict with GDPR/compliance expectations.
Why this matters (GDPR / compliance)
Version
Additional context
If there is already a way to disable this feature (env var, config flag, permission), please document it clearly in the docs and release notes.
Thanks!
Desired Solution you'd like
Expected behavior
Provide a documented, explicit configuration to disable Analytics entirely and/or limit it to privacy-safe aggregates.
Requested options (any of the following would help)
Hard disable switch
DISABLE_ANALYTICS=true or ENABLE_ANALYTICS=false
Scope Analytics by user / tenant
Aggregation-only mode
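For illustration, the requested switch and aggregation-only mode could look roughly like the sketch below. Everything here is hypothetical: DISABLE_ANALYTICS and ENABLE_ANALYTICS are the names proposed in this request, not existing Open WebUI settings, and the helper names, the accepted truthy spellings, the "disable wins" precedence, and the default of leaving analytics on are all assumptions.

```python
from collections import Counter
from typing import Iterable, Mapping, Tuple

_TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings, not an existing convention


def analytics_enabled(env: Mapping[str, str]) -> bool:
    """Resolve the two proposed flags; an explicit DISABLE_ANALYTICS takes precedence."""
    if env.get("DISABLE_ANALYTICS", "").strip().lower() in _TRUTHY:
        return False
    raw = env.get("ENABLE_ANALYTICS")
    if raw is None:
        return True  # assumed default: the feature stays on unless switched off
    return raw.strip().lower() in _TRUTHY


def aggregate_only(rows: Iterable[Tuple[str, str, int]]) -> dict:
    """Collapse (user, model, message_count) rows into per-model totals, dropping the user."""
    totals = Counter()
    for _user, model, messages in rows:
        totals[model] += messages
    return dict(totals)


if __name__ == "__main__":
    print(analytics_enabled({"ENABLE_ANALYTICS": "false"}))
    print(aggregate_only([("hans", "model-a", 3), ("erika", "model-a", 2)]))
```

An aggregation-only mode along these lines would keep instance-wide model usage visible while removing the per-user dimension that works councils object to.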
Alternatives Considered
No response
Additional Context
No response