Stats Study: Matchmaker Is Unfair

Balance

344 replies to this topic

#1 Too Much Love

    Member

  • Ace Of Spades
  • 787 posts

Posted 19 April 2017 - 09:06 AM

Since the very beginning of MWO there has been an ongoing discussion about the matchmaker. Is it good or bad, is it balanced or biased, does it assemble equal or unequal teams?

Not long ago, player data became public via the Leaderboard. This means we can now examine how the matchmaker really works: all we need to do is compare the composition of each team with the players' performance recorded in the Leaderboard.

I made such an attempt and want to share the results and my conclusions.

The method of the study.

1) I took a screenshot of the results screen at the end of each match.

2) Using the Leaderboard, I checked the stats of every player who participated in the match (the stats for the current gaming season).

3) Then I calculated the average kill/death ratio, win/loss ratio and average match score (MS) for the players of the victorious and defeated teams (of course, this was based not on their performance in that match, but on the whole season).
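
Step 3 can be sketched in a few lines of Python. This is only an illustration, assuming each player's season stats have already been copied by hand from the Leaderboard into a dict; the keys `kd`, `wl` and `ms` and the sample numbers are made up for the example and are not taken from the spreadsheet.

```python
from statistics import mean

def team_averages(players):
    """Average season K/D, W/L and match score over one team's players."""
    return {
        "kd": mean(p["kd"] for p in players),
        "wl": mean(p["wl"] for p in players),
        "ms": mean(p["ms"] for p in players),
    }

# Hypothetical season stats for a three-player team (a real team has 12).
team = [
    {"kd": 1.5, "wl": 1.25, "ms": 250},
    {"kd": 1.0, "wl": 1.0, "ms": 200},
    {"kd": 0.5, "wl": 0.75, "ms": 150},
]
averages = team_averages(team)
```

Note that this averages the per-player ratios, as the steps above describe, rather than dividing total kills by total deaths; the two approaches can give different numbers.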

The scope of the study

I analyzed 12 matches played in solo queue on 18-19 April 2017.

The results

The study showed that in the overwhelming majority of cases the victorious team had an initial advantage: it consisted of players with higher season W/L, K/D and average MS than the opposing team.

In 6 matches the winning team's players had higher W/L, K/D and MS.
In 5 matches the winning team's players were ahead on 2 of the 3 stats (e.g. they had higher W/L and MS, but lower K/D).
In only 1 match did the winners have lower average W/L, K/D and MS than the defeated team.

Posted Image

The conclusions

This has reinforced my impression that the outcome of a match is determined by the matchmaker. In fact, the matchmaker doesn't assemble equal teams. It makes the teams unequal: one team is set up to win, the other to lose.

Among the 12 analyzed matches there was only one exception to this rule. In 11 of the 12 matches (about 92%), the result could easily have been predicted by examining the players' stats on the Leaderboard.

The question is why the matchmaker is programmed that way. Nobody expects the teams to be 100% equal. But the differences between the teams are sometimes striking.

For example, in match №1 the winners' average K/D was 1.6 and the losers' 0.92; W/L was 1.3 and 1.06, and MS 260 and 208 respectively.

In match №2 the winners' average K/D was 1.31 and the losers' 0.91; W/L was 1.29 and 1.01, and MS 238 and 198 respectively.

This difference is really huge. Those 24 people could have been mixed differently to smooth it out, but instead the matchmaker formed one "strong" and one "weak" team.
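
To make "mixed differently" concrete, here is a rough sketch of one way a pool of players could be split more evenly. It is a simple greedy heuristic chosen for illustration, not anything PGI is known to use, and the scores are invented: players are sorted by average match score, and each one joins whichever team currently has the lower total, until a team is full.

```python
def split_evenly(scores, team_size):
    """Greedy split: strongest first, each player joins the team with the lower total so far."""
    team_a, team_b = [], []
    for score in sorted(scores, reverse=True):
        a_full = len(team_a) == team_size
        b_full = len(team_b) == team_size
        if b_full or (not a_full and sum(team_a) <= sum(team_b)):
            team_a.append(score)
        else:
            team_b.append(score)
    return team_a, team_b

# Hypothetical average match scores for four players (a real match has 24).
team_a, team_b = split_evenly([300, 260, 200, 160], team_size=2)
```

With these four made-up scores the strongest and weakest players end up together and both teams total 460. A greedy split like this is not guaranteed to find the best possible balance, but it never stacks all the strong players on one side.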

We can imagine some fantastic machine (that could be built by someone with programming skills) that predicts the result of the battle at the beginning of the match. The person takes a screenshot of the participants, the screenshot is scanned, the program looks the names up on the Leaderboard, calculates each team's performance and gives the result. In 90% of cases (if not more) it would be correct.
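
The last step of such a machine is easy to sketch. The snippet below assumes the team averages are already known (the screenshot scanning and the Leaderboard lookup are left out entirely) and uses a "best 2 out of 3 stats" rule, which is one reading of the results above rather than a rule spelled out in the study. The function name and dict keys are made up for the example.

```python
def predict_winner(team_a, team_b):
    """Return "A" or "B": the team leading in at least 2 of the 3 averaged stats."""
    a_leads = sum(team_a[stat] > team_b[stat] for stat in ("kd", "wl", "ms"))
    return "A" if a_leads >= 2 else "B"

# Team averages quoted in the post for match №1 (A = actual winners, B = actual losers).
winners = {"kd": 1.6, "wl": 1.3, "ms": 260}
losers = {"kd": 0.92, "wl": 1.06, "ms": 208}
prediction = predict_winner(winners, losers)
```

For match №1 the rule picks team A, the side that actually won. Exact ties on a stat count in favour of team B here, which is an arbitrary choice.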

I understand that 12 matches are not enough to make a really representative sample and reach firm conclusions. But I believe it shows the trend. I encourage other players who are willing to spend the time and effort to make their own examination of the matchmaker.
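
One rough way to judge how much 12 matches can show is to ask how often 11 or more correct calls out of 12 would happen by pure luck. The sketch below assumes the stats had no predictive value at all, so each match is an independent 50/50 guess; that is a simplifying assumption made for illustration, not something established in the study.

```python
from math import comb

def chance_of_at_least(hits, matches, p=0.5):
    """Probability of `hits` or more correct calls in `matches` independent tries."""
    return sum(
        comb(matches, k) * p**k * (1 - p) ** (matches - k)
        for k in range(hits, matches + 1)
    )

p_value = chance_of_at_least(11, 12)
```

Under that assumption the chance is 13/4096, or roughly 0.3%. This says nothing about why the teams differ, only that 11 of 12 would be an unusual result if the Leaderboard averages were unrelated to who wins.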

The data that I used can be found here:
https://docs.google....xejU/edit#gid=0

P.S. I found out that Tarogato, with the help of his teammates, made a similar study. He used data from over 100 matches and his results were quite similar.
Tarogato's study is here:
https://mwomercs.com...is-of-the-12-0/

Edited by drunkblackstar, 20 April 2017 - 01:09 AM.


#2 Bud Crue

    Member

  • Rage
  • 9,943 posts
  • Location: On the farm in central Minnesota

Posted 19 April 2017 - 09:17 AM

I guess the results are not exactly a shock. I mean it would have been a shock to me if the comparative stats on each team were remotely similar. But what you found? It's depressing but not surprising.

#3 Too Much Love

    Member

  • Ace Of Spades
  • 787 posts

Posted 19 April 2017 - 09:21 AM

Bud Crue, on 19 April 2017 - 09:17 AM, said:

I guess the results are not exactly a shock. I mean it would have been a shock to me if the comparative stats on each team were remotely similar. But what you found? It's depressing but not surprising.

The value of this study is that for the first time it shows what a lot of us felt but couldn't prove.

I could create 100 posts on the forum crying that the matchmaker makes unequal teams, and it would be in vain, because the other guy would say that it's OK and the teams are balanced. It's all impressions, nothing more.

But this is not emotion; it is stats and data.

Edited by drunkblackstar, 19 April 2017 - 09:22 AM.


#4 Appogee

    Member

  • Ace Of Spades
  • 10,966 posts
  • Location: On planet Tukayyid, celebrating victory

Posted 19 April 2017 - 09:27 AM

I know it's weird, but it feels to me like win:loss ratio is still being used in the background to allocate people to teams.

I say that because my win:loss ratio cycles up to 1.1 then down to 1.01. Up to the top of that range, then down to the bottom again.

And it's been doing that for two years. It's as if when I get to 1.1 some kind of switch kicks in and I get a run of awful teams that not even Proton could carry.

I have no proof for this, other than that my win:loss stat seems like it hasn't behaved in a random fashion over the course of thousands of matches.

#5 Too Much Love

    Member

  • Ace Of Spades
  • 787 posts

Posted 19 April 2017 - 09:28 AM

Appogee, on 19 April 2017 - 09:27 AM, said:

I know it's weird, but it feels to me like win:loss ratio is still being used in the background to allocate people to teams.

I say that because my win:loss ratio cycles up to 1.1 then down to 1.01. Up to the top of that range, then down to the bottom again.

And it's been doing that for two years. It's as if when I get to 1.1 some kind of switch kicks in and I get a run of awful teams that not even Proton could carry.

I have no proof for this, other than that my win:loss stat seems like it hasn't behaved in a random fashion over the course of thousands of matches.

This is exactly the thing I'm talking about.

#6 Mystere

    Member

  • Bad Company
  • 22,783 posts
  • Location: Classified

Posted 19 April 2017 - 09:28 AM

Sigh! So much time, money, effort, and other very limited resources have already been spent on finding the "perfect" matchmaker, when random selection still seems to be the best and simplest option.

#7 Tibbnak

    Member

  • Survivor
  • 379 posts

Posted 19 April 2017 - 09:30 AM

It's possible the MM is designed on purpose to create intentional one-sided stomps, going along the idea that in a battle there needs to be a winner and a loser, PGI just wants to set that up ahead of time.

#8 Too Much Love

    Member

  • Ace Of Spades
  • 787 posts

Posted 19 April 2017 - 09:32 AM

Tibbnak, on 19 April 2017 - 09:30 AM, said:

It's possible the MM is designed on purpose to create intentional one-sided stomps, going along the idea that in a battle there needs to be a winner and a loser, PGI just wants to set that up ahead of time.

And that creates a terrible gaming experience.

Edited by drunkblackstar, 19 April 2017 - 09:33 AM.


#9 Bud Crue

    Member

  • Rage
  • 9,943 posts
  • Location: On the farm in central Minnesota

Posted 19 April 2017 - 09:43 AM

Mystere, on 19 April 2017 - 09:28 AM, said:

Sigh! So much time, money, effort, and other very limited resources have already been spent on finding the "perfect" matchmaker, when random selection still seems to be the best and simplest option.


Yeah. Just play in group queue. That way you know how the rest of your evening's matches are going to go after 2 or 3 initial rounds. No matchmaker or comparative analysis needed. Once you see which other teams are on, you can pretty much guess the outcome of every match you will play for the next few hours. Example: Oh look, it's 228 or SA or SIG...again. After the third time of running into such groups you know how it's going to end. :)

#10 Savage Wolf

    Member

  • The Wolf
  • 1,323 posts
  • Location: Århus, Denmark

Posted 19 April 2017 - 09:43 AM

Appogee, on 19 April 2017 - 09:27 AM, said:

I know it's weird, but it feels to me like win:loss ratio is still being used in the background to allocate people to teams.

I say that because my win:loss ratio cycles up to 1.1 then down to 1.01. Up to the top of that range, then down to the bottom again.

And it's been doing that for two years. It's as if when I get to 1.1 some kind of switch kicks in and I get a run of awful teams that not even Proton could carry.

I have no proof for this, other than that my win:loss stat seems like it hasn't behaved in a random fashion over the course of thousands of matches.

Uhm... what you describe there is close to perfect. With a perfect matchmaker everyone should average about 1 in W/L ratio.
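
For scale, a W/L ratio converts to a win rate as ratio / (1 + ratio). A quick sketch, assuming every match ends in either a win or a loss (draws are ignored for simplicity):

```python
def win_rate(wl_ratio):
    """Share of matches won, given wins divided by losses (draws ignored)."""
    return wl_ratio / (1 + wl_ratio)

low = win_rate(1.01)   # bottom of the range quoted above
high = win_rate(1.1)   # top of the range quoted above
```

So a W/L that swings between 1.01 and 1.1 corresponds to winning roughly 50.2% to 52.4% of matches.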

#11 Mcgral18

    Member

  • CS 2019 Top 8 Qualifier
  • 17,987 posts
  • Location: Snow

Posted 19 April 2017 - 09:47 AM

Savage Wolf, on 19 April 2017 - 09:43 AM, said:

Uhm... what you describe there is close to perfect. With a perfect matchmaker everyone should average about 1 in W/L ratio.


Sure...but not by throwing 6 Terribads on one side, and 2 on the other
Balancing the teams properly, not choosing a default side to lose (or massive carry)

#12 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 19 April 2017 - 09:49 AM

More likely it handles populations in batches, intentionally or not. A match from T1 ends, 9 players queue up again, and it splits them 5/4 into a new match. A T4 match ends and 18 queue back up; however, they don't do it all at once. The first to queue up gets placed first. So unless they queue up in order of skill, they're not going into matches in order.

I'm not sure the MM reshuffles after it places people in a virtual lobby.

#13 El Bandito

    Member

  • Big Daddy
  • 26,736 posts
  • Location: Still doing ungodly amount of damage, but with more accuracy.

Posted 19 April 2017 - 09:49 AM

I can usually tell the match result by checking unit tags. If I am grouped with guys from 228, or BCMC, and the other team is not, then even if the overall tier rating is the same, our side would be likely to win. I'm no slouch myself.

Tibbnak, on 19 April 2017 - 09:30 AM, said:

It's possible the MM is designed on purpose to create intentional one-sided stomps, going along the idea that in a battle there needs to be a winner and a loser, PGI just wants to set that up ahead of time.


Or maybe they are trying to enforce 1:1 WLR, in which case it is not working well.

Edited by El Bandito, 19 April 2017 - 09:53 AM.


#14 Deathlike

    Member

  • Littlest Helper
  • 29,240 posts
  • Location: #NOToTaterBalance #BadBalanceOverlordIsBad

Posted 19 April 2017 - 09:51 AM

I don't think the MM is actually trying to construct the teams the way they do, but it's just the consequence of really bad metrics and telemetry that causes that to happen.

My basic example is pretty much this...

A Tier 1 max bar Lurm user CAN have the same "PSR rating" as some random player from Emp/SJR. Without even trying to figure out metrics and stuff, you know these two people ARE NOT OF THE SAME CALIBER OF SKILL. YET BY THE MAGIC OF THE MM'S ALGORITHM, they are considered ONE AND THE SAME.

That is all you need to really know about how bad the MM is. It doesn't consider stats... they consider the almighty PSR score as one and the same, even though anyone with a brain can tell you they are not.

#15 Savage Wolf

    Member

  • The Wolf
  • 1,323 posts
  • Location: Århus, Denmark

Posted 19 April 2017 - 09:56 AM

Mcgral18, on 19 April 2017 - 09:47 AM, said:


Sure...but not by throwing 6 Terribads on one side, and 2 on the other
Balancing the teams properly, not choosing a default side to lose (or massive carry)

Apparently in his case it doesn't do that. Mine is also close to 1, so for me the matchmaker is working properly. But I'm also tier 3. Maybe it works less well in tier 1 where there is a greater span in skill. Who knows. All I can say is, the matchmaker is working for me.

El Bandito, on 19 April 2017 - 09:49 AM, said:

Or maybe they are trying to enforce 1:1 WLR, in which case it is not working well.

What!? That's exactly how it should work?

#16 SilentFenris

    Member

  • Bridesmaid
  • 163 posts
  • Location: California

Posted 19 April 2017 - 10:00 AM

First - drunkblackstar, thanks for putting the time into this and posting/sharing. I love that you chose to use 3 indicators (Win/Loss, Kill/Death and Match Score) rather than just one.

Unstated assumptions for the study's conclusion to be valid:

- Each pilot must be using a mech/build in which they perform comparably to their Leaderboard stats. If they are testing a new build or trying something different "just for fun", their Leaderboard stats would be irrelevant to that match.

- Each pilot must be primarily a solo-queue player. Some unit/team players would have an "artificially pumped" Win/Loss stat if they play group queue more often than solo queue.

- Both sides had equally proficient drop callers. Easy to know for your team, hard to know for the enemy. The best strategist isn't always the one the team listens to. The "better" team can lose if an idiotic but charismatic drop caller is running things. One order properly obeyed/executed can turn a whole match for better or worse.

- Both teams had mechs equally suited to the map and game type chosen.

As for the results, others have already said PGI's goal is 50/50, or a Win/Loss ratio of 1.00. Good business model or not, it is sustainable. Unfortunately it is not good for match QUALITY: as others have also stated, the matchmaker seems a little heavy-handed, which results in more 12-0 stomps than quality matches.

*edit in italics for clarification

Edited by SilentFenris, 19 April 2017 - 12:04 PM.


#17 Old-dirty B

    Member

  • 380 posts

Posted 19 April 2017 - 10:01 AM

Can you say something about how decisively the winners won in relation to the advantage they had? With 1.6 vs 0.92 K/D, how many mechs did the losers destroy? Does more advantage translate to bigger stomps?

Btw, tnx for all the effort you put into this!

Edited by B3R3ND, 19 April 2017 - 10:01 AM.


#18 Too Much Love

    Member

  • Ace Of Spades
  • 787 posts

Posted 19 April 2017 - 10:04 AM

Deathlike, on 19 April 2017 - 09:51 AM, said:

I don't think the MM is actually trying to construct the teams the way they do, but it's just the consequence of really bad metrics and telemetry that causes that to happen.

If it were as you propose, then the average stats of team 1 and team 2 would be relatively close.

For example, say everything is decided by the PSR system, which is far from perfect: a real pro who can carry has the same PSR as some LRM noob (in fact, that is the way it works :) ). But that means these players would be picked randomly from within their Tier. If the sample of players is random, then, I guess, the average stats of the 2 teams wouldn't be that different.

B3R3ND, on 19 April 2017 - 10:01 AM, said:

Can you say something about how decisively the winners won in relation to the advantage they had? With 1.6 vs 0.92 K/D, how many mechs did the losers destroy? Does more advantage translate to bigger stomps?

Btw, tnx for all the effort you put into this!

Yes, we can:)

Check the data on Google Docs. The spreadsheet contains the links to the screenshots.
https://docs.google....xejU/edit#gid=0

#19 Deathlike

    Member

  • Littlest Helper
  • 29,240 posts
  • Location: #NOToTaterBalance #BadBalanceOverlordIsBad

Posted 19 April 2017 - 10:09 AM

drunkblackstar, on 19 April 2017 - 10:04 AM, said:

If it were as you propose, then the average stats of team 1 and team 2 would be relatively close.

For example, say everything is decided by the PSR system, which is far from perfect: a real pro who can carry has the same PSR as some LRM noob (in fact, that is the way it works :) ). But that means these players would be picked randomly from within their Tier. If the sample of players is random, then, I guess, the average stats of the 2 teams wouldn't be that different.


That's not how it works though.

You're assuming that say there are two actual high quality T1 players. Ideally, these 2 players would be distributed evenly for each side.


However, based on all of the MM discussions that PGI has been involved in, the teams constructed are borked.

Basically, the MM constructs a random team (let's just use the solo queue in this instance) AND then tries to find equal Tier players on the other side.

This means if those two players were queued up back to back AND the MM decides to pick them consecutively, they will be locked in together on the same side.

In other words, the MM is rigged in a way that it doesn't try to randomly generate/create a new group and try again... it's like "you're already stuck in the queue, so let's just 'try' to keep the order/group/arrangement because you're waiting the longest" instead of "let's try again with a different combination".

That's also what happens queue-wise.
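
A toy version of the behaviour described above, just to show the effect. This is not PGI's actual algorithm; it only assumes that teams are filled in queue order and never reshuffled, and the names and skill numbers are invented.

```python
def fill_in_queue_order(queue, team_size):
    """Team 1 takes the first team_size players in the queue, team 2 the next ones."""
    return queue[:team_size], queue[team_size:2 * team_size]

# Hypothetical queue of (name, hidden skill), all in the same Tier.
# The two strong players happened to queue up back to back.
queue = [("ace_1", 400), ("ace_2", 390), ("avg_1", 200), ("avg_2", 210)]
team_1, team_2 = fill_in_queue_order(queue, team_size=2)
gap = sum(skill for _, skill in team_1) - sum(skill for _, skill in team_2)
```

Both aces land on team 1 and the skill gap is 380, even though all four players look identical to a Tier-only check. Any second pass that swapped players between the two sides would shrink that gap.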

Edited by Deathlike, 19 April 2017 - 10:11 AM.


#20 Too Much Love

    Member

  • Ace Of Spades
  • 787 posts

Posted 19 April 2017 - 10:09 AM

SilentFenris, on 19 April 2017 - 10:00 AM, said:

First - drunkblackstar, thanks for putting the time into this and posting/sharing. I love that you chose to use 3 indicators (Win/Loss, Kill/Death and Match Score) rather than just one.

Unstated Assumptions for the study to be valid:

Yes, this model works primarily for solo queue. The group queue is still waiting for its researcher :) IMHO it works almost the same way.

Deviations in the stats and in the outcome are possible (grinding a mech, pumped stats etc.), but they don't affect the general mechanics of the matchmaker.




