|
Delete doesn't fully reclaim the disk space unless a vacuum is triggered, and even after a vacuum not all of the disk space is reclaimed. That's why it's better to truncate the videos table: truncating reclaims the disk space instantly. You should also make it possible to turn these jobs on or off, because running invidious multiple times could lead to some funky behavior if 10 invidious processes are deleting the same thing. |
The problem of
Afaik, a simple I considered adding such an option, but I'm not sure it's really required: none of the commands locks the DB, and they're not run very often (even with 20 instances running, they'd be run 20 times per hour). This will indeed require some large-scale testing. |
The issue on big instances is that autovacuum hammers the CPU like crazy, so doing a TRUNCATE avoids any CPU hammering by autovacuum, as the disk space is instantly reclaimed. I do agree with the cache purpose, but IMO the cache table is only mildly useful: it grows like crazy on big instances (6.7GB in 9 hours for me), and the video streams are often replaced with new ones, so the cached rows are often updated. The current cache system needs to be revisited. I currently run a TRUNCATE every 12 hours; I can replace it with your query to see whether the DB grows too much.
People may prefer to run these queries outside of invidious, so I think an option is a good idea. |
Damn D:
Yup, I'm definitely looking closely at solutions like Redis.
I'm wondering if a more regular deletion would have the same "hammering" impact... If
Okay. I'll add that.
Ah, that's quite a lot of videos D: |
I'm okay with adding the following functions:
- Invidious::Database::Videos.delete_expired: has been tested on my instance for 15 days and seems to work fine. The database size stays at around 5GB while about a hundred new videos are added to the table per second.
- Invidious::Database::Nonces.delete_expired: The table is empty on my side, but why not.
Not okay with the following ones:
- Invidious::Database::Channels.delete_not_subscribed: It adds too much load on the database, as explained in https://github.com/iv-org/invidious/pull/3294/files#r961975096, unless we only execute the query on instances with fewer than about 100 users
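For readers following along, the kind of cleanup a `delete_expired` function performs can be sketched roughly as below. This is a Python illustration against an in-memory SQLite table, not the Crystal code from this PR: the `videos` schema, the `updated` column, and the 6-hour cutoff are all assumptions made for the example.

```python
import sqlite3

# Hypothetical cutoff: rows older than this are treated as expired.
MAX_AGE_SECONDS = 6 * 3600


def delete_expired(conn: sqlite3.Connection, now: float) -> int:
    """Delete rows whose `updated` timestamp is older than the cutoff.

    Returns the number of rows removed.
    """
    cutoff = now - MAX_AGE_SECONDS
    cur = conn.execute("DELETE FROM videos WHERE updated < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


def make_demo_db(now: float) -> sqlite3.Connection:
    """Build an in-memory table with two stale rows and one fresh row."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema, for illustration only.
    conn.execute("CREATE TABLE videos (id TEXT PRIMARY KEY, updated REAL)")
    conn.executemany(
        "INSERT INTO videos VALUES (?, ?)",
        [
            ("stale-1", now - 7 * 3600),
            ("stale-2", now - 12 * 3600),
            ("fresh-1", now - 60),
        ],
    )
    conn.commit()
    return conn
```

Unlike a TRUNCATE, a targeted delete like this keeps recently updated rows, which is the trade-off discussed above: the cache stays warm, but the freed space still depends on vacuuming.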
Great! Did you notice any CPU/RAM usage?
I double checked: it only fills up if captchas are enabled on login/sign-up. I don't think yours has them enabled?
I'll entirely remove the code. The channels cache is also used for RSS iirc, and tbh, we don't keep much data anyway (the |
ea0aafa to d6a7df3
Difficult to say because my servers are often overloaded during the night in EU time (something I plan to fix) but I haven't seen any spike in CPU usage from postgres autovacuum which is nice. So I think it's ok.
Indeed I deactivated the captcha for logins a while ago. Before merging the PR I still would like the ability to disable this job in the config. |
unixfox left a comment
Add the ability to disable the job in the config.
Sure, will do that tomorrow! |
aefd143 to 1580bab
Each job can define its own config options, which are automatically added to the global YAML config structure. By default, every job gets an "enable" property that can prevent that specific job from running, e.g. if the instance admin wants to manage the database cleaning themselves.
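A rough sketch of that idea, written in Python rather than the project's Crystal and using made-up names (`JobConfig`, `Job`, `apply_config`, `runnable_jobs`, and the `clear_expired_items` key are all hypothetical): each job carries a config object whose `enable` flag defaults to true, and the runner skips any job the admin switched off.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class JobConfig:
    # Every job gets this flag by default; a job could add more options.
    enable: bool = True


@dataclass
class Job:
    name: str
    run: Callable[[], None]
    config: JobConfig = field(default_factory=JobConfig)


def apply_config(jobs: List[Job], raw: Dict[str, Dict[str, Any]]) -> None:
    """Copy per-job settings from the parsed config onto each job.

    `raw` stands in for a parsed `jobs:` section of a YAML file, e.g.
        jobs:
          clear_expired_items:   # hypothetical job name
            enable: false
    Jobs missing from `raw` keep the default (enabled).
    """
    for job in jobs:
        options = raw.get(job.name, {})
        job.config.enable = bool(options.get("enable", True))


def runnable_jobs(jobs: List[Job]) -> List[Job]:
    """Return only the jobs the admin has not disabled."""
    return [job for job in jobs if job.config.enable]
```

Defaulting `enable` to true means existing configs keep working unchanged; only admins who want to handle the cleanup themselves need to add anything.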
1580bab to b4674ff
|
I should have fixed the CI now. |

These few code additions do the same thing as what we recommend in the documentation, but without the admin needing to add a cron job. The only cron job left to the admin is the DB vacuuming (the interval depends on the instance's size).
Note: if said admin followed the "1 hour restart" recommendation, the job will only run once per instance start. Like many other jobs, these are meant for small instances that don't have scheduled restarts.
Closes #2835
@FireMasterK This will probably help with the recent problems your instance has been facing.