<godarklight[m]>
I should bench the one where weigh result is removed
<godarklight[m]>
I can't imagine it would make a big difference unless sqlalchemy is lazy loading
<DasSkelett[m]>
Well okay, `weigh_result()` does access `Mod.created` once, which was one of the newly indexed columns. But that's only a part of why `search_mods()` is so slow.
Darklight[m] has quit [Quit: Client limit exceeded: 6]
HebaruSan[m] has joined #spacedock
<HebaruSan[m]>
Yeah separate tests of the score PR would be nice
<HebaruSan[m]>
Yeah separate tests of the score PR would be nicve
<DasSkelett[m]>
sqlalchemy is lazy-loading, as in it only requests the rows when it finally needs them. I spent some time confirming this myself.
<godarklight[m]>
I mean in terms of "each mod in weigh results is another trip to the database" sense
<DasSkelett[m]>
The current implementation does fetch all rows with all columns in one go, saves them in one big array and then iterates over it to calculate the scores.
<godarklight[m]>
I thought I confirmed that wasn't the case, let me check ;)
<HebaruSan[m]>
Doesn't the sqlalchemy stuff do some on-demand loading? I thought that was why #276 failed
<HebaruSan[m]>
I.e., I was storing lazily loaded thunks in an array and trying to use them out of their proper context
<DasSkelett[m]>
Well okay, maybe sqlalchemy doesn't load all rows at once in this case, but they're still stored in one big Python array/list in the end. Otherwise `sorted()` couldn't work.
<HebaruSan[m]>
Yeah, sorting the whole array by definition requires loading all the data
<HebaruSan[m]>
Your PR does make it so we don't need to load every row in the db tho
<HebaruSan[m]>
So on-demand loading can still make sense
<DasSkelett[m]>
That's acceessing every single mod version there is, just to get the count of them, because we are using `len()` instead of `.count()`:
<godarklight[m]>
Popular has to load all the mods and then uses the weighted_results thing
<HebaruSan[m]>
Hmm, I may be thinking of your PR as already having `index=True`
<HebaruSan[m]>
Are we planning to go with that?
<godarklight[m]>
That's up to vitas
<HebaruSan[m]>
No, I mean the score PR, separate from mine
<godarklight[m]>
I think I want to test the score pr
<HebaruSan[m]>
Currently I think it's not indexed, so it gets a performance boost by moving the sort from Python to postgres
<HebaruSan[m]>
With an index the sort would be free
<godarklight[m]>
The "not calling the database 10000 times per page" also would help, if it does that :P
<DasSkelett[m]>
Okay guys, slowly. <span class="d-mention d-user">HebaruSan</span> yes, I would go with `index=True`.
<DasSkelett[m]>
@godarklight, look at the log, that's not fetching every mod one for one (which indeed happens in a single call, right at the top), the spammy part is loading all mod **versions** of one mod after another, just to get the count (see my linked code lines).
<godarklight[m]>
Ah ok
<DasSkelett[m]>
*edit:* ~~Okay guys, slowly. <span class="d-mention d-user">HebaruSan</span> yes, I would go with `index=True`.
<DasSkelett[m]>
@godarklight, look at the log, that's not fetching every mod one for one (which indeed happens in a single call, right at the top), the spammy part is loading all mod **versions** of one mod after another, just to get the count (see my linked code lines).~~ -> Okay guys, slowly. <span class="d-mention d-user">HebaruSan</span> yes, I would go with `index=True`.
<DasSkelett[m]>
@godarklight, look at the log, that's not fetching every mod one for one (which indeed happens in a single call, right at the top), the spammy part is loading all mod **versions** and **media** of one mod after another, just to get the count (see my linked code lines).
<godarklight[m]>
I don't know if celery did the precalculate though, but the order shouldn't matter for speed
<DasSkelett[m]>
You have to wait 10 Minutes as of the current commit
<godarklight[m]>
That's like a 1000x increase right? :P
<DasSkelett[m]>
Pffft
<godarklight[m]>
Heh, maybe not that much, but yeah, I like this hehe
<DasSkelett[m]>
When we want to use 5 days (or preferably 1 day as Rocky suggested), we have to find a way to trigger it once after startup.
<HebaruSan[m]>
Alembic?
<DasSkelett[m]>
After restarts too
<HebaruSan[m]>
It's *kind of* a data migration thing
<godarklight[m]>
I'd do it every at least hourly, it doesn't matter because I assume the function will run quickly enough (I assume 20 seconds?)
<godarklight[m]>
And you can trigger it from a oneshot in the flask router probably
<DasSkelett[m]>
It has also be triggered after normal server restarts, not only after database migrations (which it does anyway).
<DasSkelett[m]>
But I'd like to only trigger it once, not 6 times for every gunicorn instance. Is this possible in Flask?
<godarklight[m]>
Does flask have somewhere obvious to put it, is there something like after_request? where we can call the celery function
<HebaruSan[m]>
Why after a restart? Wouldn't the previous values still be in the db?
<DasSkelett[m]>
*edit:* ~~But I'd like to only trigger it once, not 6 times for every gunicorn instance. Is this possible in Flask?~~ -> But I'd like to trigger it only once, not 6 times for every gunicorn instance. Is this possible in Flask?
<godarklight[m]>
Maybe check a redis key
<DasSkelett[m]>
> Why after a restart? Wouldn't the previous values still be in the db?
<DasSkelett[m]>
Actually... You might have a point.
<HebaruSan[m]>
I think it would be a good fit for alembic if we can make it work. Get it done as part of the upgrade, before the server goes live
<godarklight[m]>
Also I forget how celery tasks are called but I think it is something to do with the .delay, my memory is hazy >.<
DasSkelett[m] has quit [Quit: Client limit exceeded: 6]
HebaruSan[m] has quit [Quit: Client limit exceeded: 6]
DasSkelett[m] has joined #spacedock
<DasSkelett[m]>
Mhh, looks like you can't do `.count()` on backrefs, so the ModVersion spam has to stay for now :/ At least it will happen in the background, so yeah.
godarklight[m] has quit [Quit: Client limit exceeded: 6]
HebaruSan[m] has joined #spacedock
<HebaruSan[m]>
I think we may be able to do something else about that
<Darklight[m]>
Did the eager load thing help ^, I am not nearly familiar enough with postgres
<Darklight[m]>
Oh
<Darklight[m]>
Hey I can test that, I will go back to master and add it in
<HebaruSan[m]>
Hmm, it does have `.join(Mod.versions)`, I must have been thinking of something else
<DasSkelett[m]>
Ouch, the alembic upgrade script will be quite big, since I not only have to provide a skeleton for the `Mod` table, but also for all the tables that jave any backrefs that we access during `get_mod_score()`.
<HebaruSan[m]>
Can't just call the existing code?
<DasSkelett[m]>
Hmm, maybe importing the classes of the other tables works, since they don't change during the upgrade.
Darklight[m] has quit [Quit: Client limit exceeded: 6]
HebaruSan[m] has quit [Quit: Client limit exceeded: 6]
<DasSkelett[m]>
Mmpf, now I can calculate the scores in alembic, however for whatever reason they don't get commited to the db (I do call `session.commit()`.
<DasSkelett[m]>
Mmpf, now I can calculate the scores in alembic, however for whatever reason they don't get commited to the db (I do call `session.commit()`).
HebaruSan[m] has joined #spacedock
<HebaruSan[m]>
I think maybe you have to do it through the `op` object
DasSkelett[m] has quit [Quit: Client limit exceeded: 6]
HebaruSan[m] has joined #spacedock
<HebaruSan[m]>
*edit:* ~~I think that's basically it, assigning to `mod.score` isn't going to do anything SQL-related if it's not defined as as column~~ -> I think that's basically it, assigning to `mod.score` isn't going to do anything SQL-related if it's not defined as a column
DasSkelett[m] has joined #spacedock
<DasSkelett[m]>
Now I just have to update the score everywhere we change something that affects it.
<HebaruSan[m]>
Mostly download
<HebaruSan[m]>
Want me to do that?
<DasSkelett[m]>
Oh no, shouldn't be take that long, thanks for the offer though.
<DasSkelett[m]>
But I'd welcome if you can check that I didn't miss anything after I pushed it ^^
<HebaruSan[m]>
The class def in #274 has the new column defined
<HebaruSan[m]>
Maybe that's what you meant
<DasSkelett[m]>
Uhm, hmm..
<DasSkelett[m]>
Tested moving the column declaration in to the class definition, and it does indeed work. Why do I remember getting an error there?
<DasSkelett[m]>
*edit:* ~~Oh no, shouldn't be take that long, thanks for the offer though.~~ -> Oh no, shouldn't take that long, thanks for the offer though.
HebaruSan[m] has quit [Quit: Client limit exceeded: 6]
Darklight[m] has joined #spacedock
<Darklight[m]>
I am a little confused, why is this not a column?
<DasSkelett[m]>
Why is what not a column?
godarklight[m] has joined #spacedock
<godarklight[m]>
Nevermind, I am only loosely reading, distracted elsewhere now >.<
<DasSkelett[m]>
Yeah it's probably easier to look at the changes in the PR after my next push.
Darklight[m] has quit [Quit: Client limit exceeded: 6]
HebaruSan[m] has joined #spacedock
<HebaruSan[m]>
Well on the upside this is definitely helping me to understand how sqlalchemy works
godarklight[m] has quit [Quit: Client limit exceeded: 6]