-
Notifications
You must be signed in to change notification settings - Fork 130
Description
Description of the problem
This is a very large site with over 600,000 rows in the exp_files table.. So, these performance things probably aren't noticed by most people/most sites.. But, at this scale, just opening an entry to edit in the CP or saving it if there are any file fileds takes much longer than needed.
The issues seems to be related to the (long join) query that EE does with the exp_files.file_name in the where clause.. EE does a BINARY type on that query. Example:
WHERE ( File_files.model_type= 'File' AND BINARYFile_files.file_name= 'Day_two_-_1066.jpg' ANDFile_files.upload_location_id= '6' ANDFile_files.site_id = 1 ) LIMIT 1
That "BINARY" in the SQL where clause basically sets the comparison on the file_name to be case-sensitive. We did some testing, and ulimately, we changed the exp_files.file_name column from varchar 255 to varbinary 2040. (db string type fields: varchar, text, etc.) have a storage size in characters.. varbinary uses bytes.. So, I converted 255 characters to 2040 bytes, to ensure no existing data was truncated).
We also indexed that column since EE is using that field in "where" clauses of queries. Then we edited the following file for this:
/admin/ee/ExpressionEngine/Model/File/FileSystemEntity.php
commented out line 38 which sets the "file_name" column for the binary_comparison to remove the addition of the "BINARY" in there and we are seeing SUBSTANTIAL performance improvement.
A channel entry with a single file first party EE file grid field with 10 images..
Before changing:
opening to edit took about 6 seconds
saving took about 10 seconds.
After changes:
opening to edit to about 2 seconds
saving took about 3 seconds.
The BINARY sql statement forces a case-sensitive comparison.. BUT, for a file name column in the DB, there should never be a case-insensitive comparison needed to be made. AND since (at least on unix/linux) the file system is case-sensitive, it makes sense to just make that db column a varbinary.. then index it.. Then you can do just a standard SQL query/comparison, without forcing the BINARY type... since the DB already takes care of that..
To follow up with this.. After testing through our dev site.. We implemented these changes on our live site. A very robust server with 16 processors and 32 GB RAM..
A single channel entry with only 2 native EE grid file fields in it..
1 has 2 images, the other as 36..
Before these changes it took 24 seconds to just load the entry in the CP.. Now loads in 1.2 seconds.. It took 36 seconds to save the entry.. now saves in less than 3 seconds..
I dont' beleive the performance issues to be related to file size.. we tested with even small images do the same.. It's related to the exp_files table size, and how long the query takes to execute.
All this time, I thought it was just Ansel fields that were the bog-down in the Control Panel.. BUt, we've removed Ansel.. So, these are all native EE file grid fields and still saw the performance problems... which led us to this deep-dive testing and, seemingly, straight-forward solution.