Nightbird, on 21 September 2021 - 06:17 AM, said:
I offered evidence that my simulation is built correctly by simulating Jay Z's PSR and determined what the performance was a year later. The prediction was checked a year later and was proven right. I used WLR instead of Jay Z's PSR in the same simulation, and this showed WLR performing much better. There's nothing cyclical about this argument.
On the other hand, you say the current match maker is good because you say it is good. There is no evidence for your statements.
In most cases you need to prove that the "success" of your MM isn't self-induced. What does that mean? When you build a simulator, you usually don't have the results of all real matches, so you have to simulate them. And this is where you make some assumptions, like "X difference in PSR between teams = K * X chance to win the match". These assumptions are usually about how the MM SHOULD WORK, not about how it REALLY WORKS. They make the simulation work properly, but does that mean it will work properly in a real situation?
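To show what "self-induced" means here, a minimal Python sketch follows. Everything in it is hypothetical: the linear form `0.5 + K * X`, the values of `k`, and the function names are assumptions for illustration, not anything taken from the actual simulator or from the real MM. It only shows that a prediction checked against matches generated by the same assumption will always look right, while the same prediction can be far off if real matches follow a different rule.

```python
import random

def assumed_win_chance(psr_diff, k=0.002):
    """Hypothetical model: 50% plus K * X, clamped to the range [0, 1]."""
    return min(1.0, max(0.0, 0.5 + k * psr_diff))

def weaker_real_world(psr_diff):
    """Hypothetical 'reality' in which PSR matters four times less."""
    return assumed_win_chance(psr_diff, k=0.0005)

def simulate_win_rate(psr_diff, true_win_chance, n_matches=20000, seed=1):
    """Play n_matches whose outcomes are drawn from true_win_chance."""
    rng = random.Random(seed)
    wins = sum(rng.random() < true_win_chance(psr_diff) for _ in range(n_matches))
    return wins / n_matches

# The simulator's prediction for a 100-point PSR advantage.
predicted = assumed_win_chance(100)

# Checked against matches generated by the very same assumption:
# the prediction is "confirmed", but only by construction.
self_induced = simulate_win_rate(100, assumed_win_chance)

# Checked against matches generated by a different rule:
# the same prediction now misses by a wide margin.
realistic = simulate_win_rate(100, weaker_real_world)

print(f"predicted={predicted:.2f} self_induced={self_induced:.2f} realistic={realistic:.2f}")
```

The first check can never fail, so it says nothing about whether the assumption holds for real matches. Only a comparison against real match results can tell you that.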
For example, a simple question: do you take into account that there is a constant flow of players in this game? As far as I can see, you assume that exactly the same players play this game together for 200k matches. Do you understand that in reality it doesn't work this way? Or another example: do you understand that different players can be in different time zones, so they just can't play together?
I'm also not saying that matches wouldn't be balanced with a WLR-based MM. Maybe they would. The problem is that a WLR-based MM measures the team average, and team balance isn't the only thing we need to achieve. We also need to achieve 1vs1 balance, because the essential goal of MM isn't just eliminating match imbalance and stomps. Its goal is to provide a fun gaming experience for all players. And a WLR-based MM can't do that.
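The gap between team-average balance and 1vs1 balance can be shown with a small Python sketch. The skill numbers are made up, and reading "1vs1 balance" as comparing the two teams rank by rank (best vs best, second vs second, and so on) is an assumption made for this example, not a definition used by the real MM.

```python
def team_average_gap(team_a, team_b):
    """Absolute difference between the two team averages."""
    return abs(sum(team_a) / len(team_a) - sum(team_b) / len(team_b))

def mean_pairwise_gap(team_a, team_b):
    """Average skill gap when players are paired rank by rank.

    Assumes both teams have the same number of players.
    """
    ranked_a = sorted(team_a, reverse=True)
    ranked_b = sorted(team_b, reverse=True)
    return sum(abs(a - b) for a, b in zip(ranked_a, ranked_b)) / len(ranked_a)

# Hypothetical skill values on a 0-100 scale.
even_team = [50, 50, 50, 50]
mixed_team = [95, 95, 5, 5]

print(team_average_gap(even_team, mixed_team))   # 0.0
print(mean_pairwise_gap(even_team, mixed_team))  # 45.0
```

On averages this match looks perfectly balanced, yet every individual duel in it is lopsided.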
What WLR MM adepts don't understand is that MM should measure and balance PERSONAL skill. And simulating that is a little bit tricky, because we need to assume that every player has some hidden "skill" variable, which we need to measure and then balance. This is where problems with self-induced results start to appear. Your core assumption is that WLR is correlated with skill, so we need to measure WLR to measure skill. And of course an MM that makes sure WLR ~ 1 would look successful in this case. But what if this assumption is faulty? There are many ways to achieve WLR ~ 1, including alternating win and loss stomps. Maybe we should measure personal skill directly? Yes, but that's hard to do, because it's hard to define skill criteria.
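The point about alternating stomps can be checked with a few lines of Python. The match scores below are invented for illustration (team kills for each side), and `wlr` and `mean_margin` are hypothetical helper names. Both histories end with the same WLR, although one consists of close games and the other of nothing but stomps.

```python
def wlr(matches):
    """Win/loss ratio over a list of (own_score, enemy_score) results."""
    wins = sum(1 for own, enemy in matches if own > enemy)
    losses = sum(1 for own, enemy in matches if own < enemy)
    return wins / losses if losses else float("inf")

def mean_margin(matches):
    """Average absolute score difference, a rough measure of how one-sided games were."""
    return sum(abs(own - enemy) for own, enemy in matches) / len(matches)

# Hypothetical histories: 10 wins and 10 losses each.
close_games = [(12, 10), (10, 12)] * 10
stomps = [(12, 1), (1, 12)] * 10

print(wlr(close_games), mean_margin(close_games))  # 1.0 2.0
print(wlr(stomps), mean_margin(stomps))            # 1.0 11.0
```

An MM that is judged only by WLR ~ 1 cannot tell these two histories apart.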
And MS is a good attempt to do it. It's not 100% accurate, which is why it's mixed with WLR a little bit. But the MS part is still required to ensure that 1vs1 balance is achieved, not just team average vs team average, which actually means nothing.
Edited by MrMadguy, 21 September 2021 - 11:30 PM.