Skip to content

Database Index Patch#2585

Closed
joesenova wants to merge 2 commits intoYOURLS:masterfrom
joesenova:col_url_index
Closed

Database Index Patch#2585
joesenova wants to merge 2 commits intoYOURLS:masterfrom
joesenova:col_url_index

Conversation

@joesenova
Copy link
Copy Markdown

We ran into heavy load issues on our databases(AuroraSQL), we had a look and changed the Data Type of url column to VARCHAR(1024) from TEXT

TEXT is not a performance firndly data type.
I changed the version(s) app and db - to 1.7.4 and 483
The upgrade applies a patch on the database table yourls_url on column url
We haev notice a drastric decrease in db load.

Johan Kasselman added 2 commits November 28, 2019 11:24
…ook and changed the Data Type of url column to VARCHAR(1024) from TEXT

TEXT is not a performance firndly data type.
I changed the version(s) app and db - to 1.7.4 and 483
The upgrade applies a patch on the database table yourls_url on column url
We haev notice a drastric decrease in db load.
@dgw dgw added the database label Nov 28, 2019
@dgw
Copy link
Copy Markdown
Member

dgw commented Nov 28, 2019

This might not be such a good idea. URLs have no maximum length, so any upper bound on the size of that column would be asking for "bug" reports that YOURLS truncates links.

Even if this is something we want:

  • the maximum should be higher (2048 at least)
  • the schema change probably should wait until 1.8/2.0 (and it's not this PR's job to bump the version number anyway)

@ozh
Copy link
Copy Markdown
Member

ozh commented Nov 28, 2019

The reason why you see a reduced load is that you added an index on the URL field I think.

I chose TEXT because of the absence of practical limitation on URL length, knowing that this would be a trade-off with speed. I wouldn't recommend indeed going lower than 2048 as @dgw said, but this won't be possible as a general patch for YOURLS : 90% of people don't care about reducing load on their shorteners, and we've already had issues with too long URLs. We've had people for instance storing base64 encoded data URL.

This said, I think we can leave this discussion open because people better than me in SQL may have ideas on how to improve performance while keeping arbitrary long URLs

@joesenova
Copy link
Copy Markdown
Author

joesenova commented Nov 28, 2019

Hi yes, we see a reduction of load on the COU from average 60% to 6% on normal day to day activities.

The TEXT datatype is a very expensive datatype to do queries on. You can not add index on TEXT as it has no length ro index on. Everytime you do a query it scans all the text fileds. Therefore using a big varchar data type helps with indexing and query speed/performance.

What we can do is add a field in the admin part of the UI to customize thr size of the Varchar datatype. Size wise.

This will enable users to change the size on the fly.
If we can change the query ti not search in the urk fild bu rather based in an id or keyword that wiuld work better if you don't want to use varchar.
We currently have 4.8mil rows in the yourls_url table.

Once you start doing these queries you see a read total of 290K rows/s.

Anyway let see if we can come to a consolidated solution that will help with this load issue.

Then for the version update, I just did what I thought was necessary for the upgrade to work. I should only have changed the database version.

@joesenova
Copy link
Copy Markdown
Author

@stefan-kamsker

This pull request does indexing on the correct col to improve the DB performance and load on the server(s).

We have been running this code on our production environment for the past 5 months and working like a charm.

We have more than 10mil keys, I can also now use one ui for multiple DB's(SB's on the same server). And all is well now.

@stefan-kamsker
Copy link
Copy Markdown

@joesenova
thank you, will try it
BR

@ozh
Copy link
Copy Markdown
Member

ozh commented Apr 9, 2020

I'm closing this PR because, due to the limitation of URL length, it won't go into core.

This said, I think there may be a different approach: it's possible to index TEXT if you specify the key length. It won't give a unique index but I'm curious to see the benefits. See https://stackoverflow.com/a/2889835/36850

Is this something you've considered @joesenova ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants