
The Last Match Maker Thread We Need


248 replies to this topic

#141 PhoenixFire55

    Member

  • Legendary Founder
  • 5,725 posts
  • Location: St.Petersburg / Outreach

Posted 13 June 2019 - 03:19 AM

Sjorpha, on 12 June 2019 - 01:51 AM, said:

Predicted contribution to win chance is the only relevant factor for matchmaking, if you define "skill" as something other than being good at winning more than losing, then whatever that is would indeed be completely irrelevant for matchmaking.

It works when you and only you define your chances of winning, i.e. in 1v1s or for teams with fixed rosters. In MWO your W/L depends on the MM itself, because the people who end up on your team are picked by the MM. And we simply can't state that everyone is treated equally and that the resulting W/L is only a factor of your own contribution. Different tiers, different time zones, different amounts of "relaxed" MM criteria and so on; some only play solo, some only in groups, etc. I can only say once again that if that wasn't the case, and if it was really that simple, then the Elo MM would have worked. Which it obviously didn't.

FRAGTAST1C, on 12 June 2019 - 05:12 AM, said:

Would the KDR be a better statistic then?

*insert triple facepalm meme here*

#142 PhoenixFire55

    Member

  • Legendary Founder
  • 5,725 posts
  • Location: St.Petersburg / Outreach

Posted 13 June 2019 - 03:29 AM

Wil McCullough, on 12 June 2019 - 04:58 PM, said:

It doesn't matter because everyone is saddled with the same matchmaker. It's a constant. Obviously there will be short term spikes and slumps but in the longer run, your wlr always determines how much you effect a win compared to everyone else.

Everyone is, but what does the current MM use, for example? ... PSR. And PSR is obtained differently by different people. Since we don't know the exact formula we can't figure out what we need to do specifically to farm/tank it, but there very well might be people who do exactly that by chance. And obviously players of roughly the same skill can end up with very different PSR, and since the MM is based on PSR it treats them differently. A player with farmed PSR is considered to be better than he actually is, so in a more skilled environment he'd have a lower W/L than he is supposed to, and vice versa. And since PSR gain isn't even zero-sum, needless to say it creates a significant discrepancy between what we should have and what we actually have.

The current PSR MM is just an example, though, of how MM algorithms affect W/L. Sadly W/L in MWO isn't just the result of your contribution and your contribution alone. Otherwise the Elo MM we had a few years back would have worked just fine.

#143 UnofficialOperator

    Member

  • Ace Of Spades
  • 1,493 posts
  • Location: In your head

Posted 13 June 2019 - 04:03 AM

@Nightbird
I applaud your effort but why though... itz time 2 stahp

#144 Nightbird

    Member

  • The God of Death
  • 7,518 posts

Posted 13 June 2019 - 05:15 AM

UnofficialOperator, on 13 June 2019 - 04:03 AM, said:

@Nightbird
I applaud your effort but why though... itz time 2 stahp


It's never time to stop criticizing PGI though?

PhoenixFire55, on 13 June 2019 - 03:29 AM, said:

I say words but I don't offer proof


It's OK, most people are like that because there is no proof for what they say

Edited by Nightbird, 13 June 2019 - 05:19 AM.


#145 John Bronco

    Member

  • The Fighter
  • 966 posts

Posted 13 June 2019 - 05:31 PM

good post

#146 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 14 June 2019 - 06:24 AM

You're 100% correct and I've been saying it for years.

HOWEVER

There will always be issues about players available in match creation window. How that then stacks with tonnage is going to be problematic.

In the end though an Elo style W/L based matchmaker will give much better results on average over the same number of matches.

I don't get why the concept is such a struggle for people. There is not and mathematically can not be a better source for predicting your impact on helping a team win a match than your historical precedent for helping a team win (or lose) a match.

You could then improve on that metric by creating an independent Elo for mech variant and loadout but honestly given the issue with players available in any given matchmaking window that's a lot more work for a relatively small return.

#147 Nightbird

    Member

  • The God of Death
  • 7,518 posts

Posted 14 June 2019 - 06:30 AM

MischiefSC, on 14 June 2019 - 06:24 AM, said:

There will always be issues about players available in match creation window. How that then stacks with tonnage is going to be problematic.


Come on Mischief, you should know better than to make a claim without something to back it up... I've run my simulation with 1000 people; other than taking 10x longer to run, there was no difference in the results (as I expected). This is still taking only 24 players at a time. The tonnage of mechs and whether a person is proficient in piloting them is reflected in a player's W/L ratio as well.

MischiefSC, on 14 June 2019 - 06:24 AM, said:

In the end though an Elo style W/L based matchmaker will give much better results on average over the same number of matches.


If the teams at least have fixed rosters (i.e. can switch members out for games but members cannot play for other teams), you can assign an Elo to a team. However you cannot average the Elo for all the players and assign that to a team, because it absolutely will cause skilled players' Elos to go off to infinity. (Turn into Egos)

Edited by Nightbird, 14 June 2019 - 06:34 AM.


#148 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 14 June 2019 - 06:56 AM

Nightbird, on 14 June 2019 - 06:30 AM, said:


Come on Mischief, you should know better than to make a claim without something to back it up... I've run my simulation with 1000 people, other than taking 10x longer to run, no difference in the results (as I expected). This is still taking only 24 players at a time. The tonnage of mechs and whether a person is proficient in piloting them is reflected in a player's W/L ratio as well



If the teams at least have fixed rosters (i.e. can switch members out for games but members cannot play for other teams), you can assign an Elo to a team. However you cannot average the Elo for all the players and assign that to a team, because it absolutely will cause skilled players' Elos to go off to infinity. (Turn into Egos)


Dude you know I work in analytics too. I get the logic of your simulation and the results are consistent, but in the realities of making matches you'll have pools of wildly mismatched player skill and tonnage. For example, a player pool with 7 'good' players (5 of them in assaults, 1 in a light and 1 in a heavy) and a bad mix of terribads in mismatched tonnage will make it very hard for the MM to build a good match in that particular window, until the player population shifts.

You are absolutely correct that over a large enough sample of games a W/L system will give better results. My point was only to remind everyone else that, while you'll get better matches on average (over a sample size that depends on how many players are on when you play), you're still going to get some mismatched experiences. Just not as many.

Elo is a w/l matchmaker. Only thing that makes it an Elo system is using a variable to adjust your point gain/loss based on the caliber of who you play. So if it makes a mismatched game based on limited players available it avoids big changes in their results. So what you're describing in your original post is absolutely an Elo system, just that to avoid what you're talking about you adjust the K factor down sharply as one gets close to 2000.

Because of variance in tonnage and mech performance it's very, very hard to properly replicate what an Elo should do for players wildly deviant from average. Again, you could probably tweak that via some form of modifier based on what you're playing but I don't think we have the depth of population to make that worthwhile. What we really want to do is functionally shrink and cap Elo growth closer to 2000 instead of 2400, so only a handful of players will get over 2000 and the 1 or 2 best players make it to 2400. This helps avoid what you're talking about where once you make it over 1800 you're almost certain to inflate forever.

So, to reiterate, what you're describing is an Elo system, just one with a very low K factor, which I agree with. Players just need to realize that if you play in the Oceanic timeframe you're going to have more mismatches. That doesn't mean the MM is bad, though: on average you'll still get just as accurate a W/L representation, it'll just take more total matches to reflect that than if you play NA primetime.

#149 The6thMessenger

    Member

  • Nova Captain
  • 8,104 posts
  • Location: From a distance in an Urbie with a HAG, delivering righteous fury to heretics.

Posted 14 June 2019 - 07:08 AM

Nightbird, on 14 June 2019 - 06:30 AM, said:

The tonnage of mechs and whether a person is proficient in piloting them is reflected in a player's W/L ratio as well


If it's about the mech class, we can always just refer to the WLR by weight class. I mean, the current leaderboard does have a division between weight classes.

#150 Nightbird

    Member

  • The God of Death
  • 7,518 posts

Posted 14 June 2019 - 07:08 AM

MischiefSC, on 14 June 2019 - 06:56 AM, said:

For example if out of a player pool with 7 'good' players, 5 of them in assaults, 1 light and 1 heavy and a bad mix of terribads in mismatched tonnage will make it very hard for the MM to build a good match in that particular window, until player population shifts.


Would you rather have an assault with 0.5 WLR on your team or a light with 2WLR? *shrug*

MischiefSC, on 14 June 2019 - 06:56 AM, said:

Elo is a w/l matchmaker.


Only if you are tracking Elo by individuals or fixed teams. What in Elo allows the Elo of a team to be calculated by the average Elo of the players?

#151 Nightbird

    Member

  • The God of Death
  • 7,518 posts

Posted 14 June 2019 - 07:12 AM

Suppose you have two teams, one with two Elo 1500 players, another with an Elo 1000 and an Elo 2000 player. Both teams average 1500 Elo; they fight, and win or lose, all players on either team gain or lose 100 pts (based on the K value and the teams' equal Elo values). There are two problems. First, as the simulation shows, the Elo 2000 player will on average win >50% of all matches forever and will never reach an equilibrium. Second, since the point exchange is based on the team's Elo, high skilled players gain or lose an equal number of points per match on average (MM makes teams with equal Elo), so winning more than half the time means their rating keeps climbing. This sort of calculation is not Elo to start with, so I don't like calling it Elo, but basically high skilled players will go off to infinity and low skilled players to 0 (err, negative infinity).
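
A minimal sketch of the two-team example above, assuming the standard Elo expected-score formula and a hypothetical K of 200 (chosen only so that an even match moves 100 points, as in the example). The function names are made up for illustration; this is not PGI's code or the simulation mentioned above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def team_average_update(team_a, team_b, a_won, k=200.0):
    """Apply one shared point change per team, based on the team averages.

    This is the "team-average" scheme criticised above, not individual Elo.
    k=200 is an assumed value so that an even match moves 100 points.
    """
    avg_a = sum(team_a) / len(team_a)
    avg_b = sum(team_b) / len(team_b)
    exp_a = expected_score(avg_a, avg_b)
    delta = k * ((1.0 if a_won else 0.0) - exp_a)
    # Every member of a team gets the same delta, whatever their own rating.
    return [r + delta for r in team_a], [r - delta for r in team_b]


# Team A: two 1500 players. Team B: a 1000 and a 2000 player. B wins.
new_a, new_b = team_average_update([1500, 1500], [1000, 2000], a_won=False)
print(new_a, new_b)  # [1400.0, 1400.0] [1100.0, 2100.0]
```

Because both averages are 1500, the Elo 2000 player moves by the same 100 points whether the team wins or loses, so a player who wins more than half of such matches drifts upward without limit.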

If you don't believe me, give me a K value, I can do a sim this weekend.

Edited by Nightbird, 14 June 2019 - 07:24 AM.


#152 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 14 June 2019 - 07:27 AM

Nightbird, on 14 June 2019 - 07:08 AM, said:


Would you rather have an assault with 0.5 WLR on your team or a light with 2WLR? *shrug*



Only if you are tracking Elo by individuals or fixed teams. What in Elo allows the Elo of a team to be calculated by the average Elo of the players?


You've played group queue enough to know that big tonnage mismatches can bridge a skill gap, especially in a pug environment where people may not be communicating. Again - I am 100% absolutely in agreement that on average a W/L based system will give better results. I'm simply pointing out for everyone else that if you're playing in a smaller player pool (Oceanic timeframe) your results in any given match will be more sensitive to individual player variables, i.e. if there's literally only 3 good players on and 2 of them are drunk AF and leveling Corsairs, you're going to have a very different experience than if all 3 are on practicing with their best mechs with game faces on, prepping for league play. In NA primetime you've got a vastly larger pool, so any individual player's variance is less impactful on the whole in a given sample size.

So Elo is only tracked at a player level. However, to build a match you are inevitably trying to match the teams as closely as possible in 'score' (that's the Elo score; in fact the numbers you're using are literally Elo numbers).

All Elo does is frame W/L on a 0-2800 scale, with 1400 being average. It has a K factor, which is what decides how much your score moves on a win or a loss. Without a massive player base it's going to be impossible to get perfect matches every game, so every game will have a variance between teams in Elo score. Team A may average 1500 and team B 1425. In that instance team A is weighted higher and as such has better odds of winning. If team A wins they gain X points and B loses X points. However, if it's an upset then A loses Y points (a higher number, because it means the prediction was wrong) and B gains Y points.
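
To put rough numbers on X and Y, here is a sketch assuming the standard Elo expected-score formula applied to the two team averages and a hypothetical K of 32. Neither assumption is taken from PGI's old matchmaker.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


K = 32.0  # assumed K factor, for illustration only

exp_a = expected_score(1500, 1425)  # team A is the favourite

x = K * (1.0 - exp_a)  # points exchanged if the favourite (A) wins
y = K * exp_a          # points exchanged if the underdog (B) wins

print(f"P(A wins) = {exp_a:.3f}, X = {x:.1f}, Y = {y:.1f}")
```

With these assumptions X comes out at roughly 12.6 and Y at roughly 19.4, so the upset moves more points than the expected result, and X + Y always equals K.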

The biggest issue we had before is that gain/loss of points was averaged at a team level and didn't taper. So because of low population a really good player (anyone much over 1800, because our population is mostly terribads) would perpetually inflate on average to 2800 because

So what you do is at 1800, Elo gain/loss should be cut in half. Then cut it in half again at 2000, and in half again at 2200, and again at 2400. You probably want to do the same for people under 1000. This prevents people wildly divergent from mean from inflating/deflating to functional end of spectrum. At least not without a truly stupid number of matches. That cut by 1/2 is just an example, it would probably more accurately be a 20/40/60/80 cut. You're trying to artificially create a human skill curve for the top 5-10% of players to help the MM keep a differentiation between a 'good' player and a 'great' player.

Then you reset everyone's score every 12 months or so. Even if you just remove point variance from 1400 by 80% to 'seed' the next season's results.
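
The taper and the seasonal reset described above could be sketched like this. The base K of 32 and the function names are hypothetical; the thresholds, the halving and the 80% figure are the example values from this post, and the taper for players under 1000 is left out because no thresholds were given for it.

```python
def tapered_k(rating: float, base_k: float = 32.0) -> float:
    """Halve K once for every threshold the rating has reached."""
    thresholds = (1800, 2000, 2200, 2400)
    halvings = sum(1 for t in thresholds if rating >= t)
    return base_k / (2 ** halvings)


def seasonal_reset(rating: float, mean: float = 1400.0, keep: float = 0.20) -> float:
    """Pull a rating back toward the mean, keeping 20% of its distance from it."""
    return mean + (rating - mean) * keep


for r in (1500, 1800, 2000, 2200, 2400):
    print(r, tapered_k(r))

print(seasonal_reset(2400))  # a 2400 player is seeded at about 1600 next season
```

Swapping the halving for the 20/40/60/80 cut mentioned above would only change the returned multiplier, not the structure.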

#153 Nightbird

    Member

  • The God of Death
  • 7,518 posts

Posted 14 June 2019 - 07:38 AM

MischiefSC, on 14 June 2019 - 07:27 AM, said:




Which proves my point about fake average random team Elo... that it's not self-stabilizing (I described it as exploding). WLR, by contrast, is stable and doesn't need artificial resets. Team-average Elo needs them because all above average players will be at the point max cap after some games, and all below average players at 0. This is NOT HOW ELO IS SUPPOSED TO WORK. You shouldn't need to put more point-earned adjustments into Elo, since Elo is itself a way to assign points based on skill differences. If you don't like how it is assigned, you're throwing Elo away entirely.

In the short-term change cases (drunk AF, as you put it), Elo is not going to do any better, but my simulation accounts for match-to-match skill variations anyway and includes that in the graph of match quality. See my post on dice.

Edited by Nightbird, 14 June 2019 - 07:52 AM.


#154 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 14 June 2019 - 08:43 AM

Nightbird, on 14 June 2019 - 07:38 AM, said:


Which proves my point about fake average random team Elo... that it's not self stabilizing. (I described it as it explodes) Whereas WLR is stable and you don't need artificial resets because all above average players will be at the point max cap after some games, and all below average players at 0. This is NOT HOW ELO IS SUPPOSED TO WORK. You shouldn't need to put in more point earned adjustments into Elo, since Elo is itself a way to assign points based on skill differences. If you don't like how it is assigned, you're throwing Elo away entirely.

In the short-term change cases (drunk AF, as you put it), Elo is not going to do any better, but my simulation accounts for match-to-match skill variations anyway and includes that in the graph of match quality. See my post on dice.


I absolutely get what you're saying. I'm saying that Elo is exactly what you're describing. You're describing an Elo system. The reason you use an Elo system is just to speed seating and to deal with variance in things like total matches played. You're creating an Elo style system - you're starting new players at 1400 and scaling them up and down based on wins and losses.

All the K factor does is speed up seating players with a low number of matches and try to level out the impact of the inevitable mismatches that happen over time.

So if I make a new account and play 10 matches and win all 10, my w/l is 10.0. I'm in trial mechs, the MM shouldn't take me at a 10 wlr.

The point of a K factor is that it's adjustable. The point of any MM is to build matches that are as closely matched as possible. To do that it has to accurately place players on the human skill curve in order to seat them. This means it needs some help differentiating players on both ends of the curve who are significantly deviant from the mean.

If you don't want to call it Elo that's fine, call it whatever you want. However, to account for differences in total matches played (to deal with wide gaps in sample sizes for each player's relative score) and to create accurate gaps in representation between skill levels for players wildly deviant from the mean in a small population, so the MM can effectively differentiate between a skill 2000 and a 2200, which is hard to do with small populations (especially within a given matchmaking window), you need a tool to accurately adjust each player's relative score.

You do that by adjusting point value each match on a scale (the 0 to 2800 in Elo for example) and you need to tweak it based on who they play against.

#155 Nightbird

    Member

  • The God of Death
  • 7,518 posts

Posted 14 June 2019 - 08:57 AM

MischiefSC, on 14 June 2019 - 08:43 AM, said:

If you don't want to call it Elo that's fine, call it whatever you want. However, to account for differences in total matches played (to deal with wide gaps in sample sizes for each player's relative score) and to create accurate gaps in representation between skill levels for players wildly deviant from the mean in a small population, so the MM can effectively differentiate between a skill 2000 and a 2200, which is hard to do with small populations (especially within a given matchmaking window), you need a tool to accurately adjust each player's relative score.


Yep, and my argument is that the WLR in an environment with random teams will represent different skill levels precisely. (See the detailed player stats with the WLR MM simulation.) If you would like to present 1.0 WLR as 1500 and 1.5 WLR as 2800, that's fine by me.

Nightbird, on 12 June 2019 - 08:15 AM, said:


Now's probably a good time to address this, I didn't want to earlier since I was afraid the conversation would get complex.

The ideal simulation would be, for each player, there is a starting skill rating and a max skill rating and a learning speed. The learning speed would be the number of games needed to go from the starting skill rating to the max, linearly is probably fine. The learning speed could be 1000 for fast learners, 5000 for slow ones.

In addition, as the simulation runs, new players would join the player pool and some players would retire from it.

With this new simulation, what would the MM need to be like to maximize match quality? One can play with the MM and get results.
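
A rough sketch of the per-player skill model described above, with a linear ramp from starting skill to max skill. The class name, field names and the 100-200 skill scale are hypothetical; only the 1000 and 5000 game learning speeds come from the text. Players joining and retiring are not modelled here.

```python
from dataclasses import dataclass


@dataclass
class SimPlayer:
    start_skill: float   # true skill on the first game (hypothetical scale)
    max_skill: float     # true skill once fully learned
    learning_games: int  # games needed to go from start_skill to max_skill

    def true_skill(self, games_played: int) -> float:
        """Linear ramp from start_skill to max_skill, flat afterwards."""
        progress = min(games_played / self.learning_games, 1.0)
        return self.start_skill + (self.max_skill - self.start_skill) * progress


fast = SimPlayer(start_skill=100.0, max_skill=200.0, learning_games=1000)
slow = SimPlayer(start_skill=100.0, max_skill=200.0, learning_games=5000)

for games in (0, 500, 1000, 5000):
    print(games, fast.true_skill(games), slow.true_skill(games))
```

The matchmaker under test would never read `true_skill` directly; it would only be used inside the simulator to decide match outcomes.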

Edited by Nightbird, 14 June 2019 - 09:50 AM.


#156 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 15 June 2019 - 08:10 AM

Nightbird, on 14 June 2019 - 08:57 AM, said:


Yep, and my argument is that the WLR ratio in an environment with random teams will present different skill levels precisely. (See the detailed player stats with the WLR MM simulation) If you would like to present 1.0 WLR as 1500, 1.5 WLR as 2800, that's fine by me.


Functionally, yeah. More to the point you need a scale to put it on and you need some sort of K factor to create the learning speed you're talking about. You need to accelerate new account seeding and slow down advanced accounts to correctly replicate their segment of the population.

Also a 1.5 w/l for someone who plays in Oceanic may not be comparable to a 1.5 w/l for someone who plays in NA primetime. Admittedly population is low enough that it's not really a big factor in the overall averages.

#157 Nightbird

    Member

  • The God of Death
  • 7,518 posts

Posted 15 June 2019 - 08:47 AM

MischiefSC, on 15 June 2019 - 08:10 AM, said:


Functionally, yeah. More to the point you need a scale to put it on and you need some sort of K factor to create the learning speed you're talking about. You need to accelerate new account seeding and slow down advanced accounts to correctly replicate their segment of the population.

Also a 1.5 w/l for someone who plays in Oceanic may not be comparable to a 1.5 w/l for someone who plays in NA primetime. Admittedly population is low enough that it's not really a big factor in the overall averages.


The learning speed I was talking about is just the true skill rating being fed into the simulator. As a person joins MWO, they start at the initial skill level, and learning speed is how many games it takes them to reach max skill level. This is invisible to the MM and not scored. Also, W/L is automatically accelerated: the first games weigh more and the rest weigh less.
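
One way to read "the first games weigh more": if the MM tracks a plain running win fraction (an assumption, the exact statistic isn't spelled out here), then game n can only move it by 1/n of the gap between the result and the current value. The function name below is hypothetical.

```python
def running_win_rate(results):
    """Return the win fraction after each game, updated incrementally."""
    rate = 0.0
    history = []
    for n, won in enumerate(results, start=1):
        outcome = 1.0 if won else 0.0
        # Game n shifts the rate by (outcome - rate) / n, so later games move it less.
        rate += (outcome - rate) / n
        history.append(rate)
    return history


history = running_win_rate([True, False, True, True])
print(history)  # win fraction after games 1 to 4; ends at 0.75 (3 wins in 4)
```

A win fraction p converts to a W/L ratio as p / (1 - p), so the same shrinking step size carries over to WLR.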

Regional differences also exist regardless of the MM in place, but since there are overlaps in zones where they play each other, you can see it as one. If the people you're playing against have played against people on the other side of the world, you're being impacted by them as well in the MM.

Edited by Nightbird, 15 June 2019 - 08:48 AM.


#158 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 15 June 2019 - 12:25 PM

Nightbird, on 15 June 2019 - 08:47 AM, said:


The learning speed I was talking about is just the true skill rating being fed into the simulator. As a person joins MWO, they start at the initial skill level, learning speed is how many games it takes them to reach max skill level. This is invisible to the MM and not scored. Also, W/L is automatically accelerated, first games weigh more and rest weigh less.

Regional differences also exist regardless of the MM in place, but since there are overlaps in zones where they play each other, you can see it as one. If the ones you're playing against, played against people on the other side of the world, you're being impacted by them as well in the MM.


Sorta. Think of it like divisions - if you only ever play in Div B your perfect win rate doesn't count as much vs a Div A team. Even if the teams you're beating have played with and against Div A teams in the past that doesn't directly translate.

However this is really only going to impact a fraction of a percent of players and be almost invisible to average results anyway so fair enough.

The advantage of using a K factor to speed seating and smooth imbalances created by matchmaking errors is largely about getting more precise results in fewer matches. As MWO can see 10+ matches per player per day, these issues will correct with growing sample size anyway. You probably could do without a K factor and just fly it on straight W/L. It's still going to have less reliable results for players with fewer matches, but it's not like we've got a ton of new players anyway, and lack of good mechs and builds available is probably a far bigger factor on performance than position on the skill curve for people with sub 500 matches too.

Plus, on the big plus side, this would be super easy to implement as there's no formula to tweak and it's using data PGI already gathers. FFS maybe we can get them to split pug/group queue stats while we're at it.

#159 Nightbird

    Member

  • The God of Death
  • 7,518 posts

Posted 15 June 2019 - 03:15 PM

MischiefSC, on 15 June 2019 - 12:25 PM, said:

Sorta. Think of it like divisions - if you only ever play in Div B your perfect win rate doesn't count as much vs a Div A team. Even if the teams you're beating have played with and against Div A teams in the past that doesn't directly translate.


Bad example, because regions are not intentionally fixed like divisions are. Some people in region A play against those in B, who play against those in C, who play against those in A. Another reason it's a bad example is that the skill seen in comp play should not be viewed as a systematic difference in skills like the one you are describing. The main difference is in the population: the top 12 players in a region with 10,000 players will average a lot higher than the top 12 in a region with 1,000, simply because with a normal distribution of skills, the top will be more standard deviations away from the mean.
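
The population argument can be checked with a quick Monte Carlo sketch. It assumes skill is drawn from a standard normal distribution in both regions (same mean, same spread); the function name, trial count and seed are arbitrary choices for illustration.

```python
import random
import statistics


def mean_top_12(pool_size: int, trials: int = 20, seed: int = 0) -> float:
    """Average skill of the 12 best players, averaged over several random pools."""
    rng = random.Random(seed)
    per_trial = []
    for _ in range(trials):
        # Every player's skill comes from the same N(0, 1) distribution.
        skills = [rng.gauss(0.0, 1.0) for _ in range(pool_size)]
        top_12 = sorted(skills, reverse=True)[:12]
        per_trial.append(statistics.mean(top_12))
    return statistics.mean(per_trial)


small_region = mean_top_12(1_000)
large_region = mean_top_12(10_000)
print(f"top 12 of 1,000:  {small_region:.2f} SD above the mean")
print(f"top 12 of 10,000: {large_region:.2f} SD above the mean")
```

Both regions have identical underlying skill distributions here, yet the larger pool's top 12 sit further out in the tail purely because there are more draws.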

MischiefSC, on 15 June 2019 - 12:25 PM, said:

The advantages of using a K factor to speed seating and smooth imbalances created by matchmaking errors is largely there to deal with trying to get more precise results in fewer matches. As MWO can see 10+ matches per player per day these issues will correct with growing sample size anyway. You probably could do without a K factor and just fly it on straight W/L. It's still going to have less reliable results for players with fewer matches but it's not like we've got a ton of new players anyway and lack of good mechs and builds available is probably a far bigger factor on performance than position on the skill curve for people with sub 500 matches too.


Elo is only proven to do what you say in a controlled environment with fixed teams or individuals, fixed match schedules, and fixed opponents. In a random team environment, there is no proof for these claims. If you'd like to provide some proof, I look forward to reading it.

My suggestion is still to not bother starting with Elo, the assumptions that lead to it working simply don't exist in MWO, better to start from scratch.

Edited by Nightbird, 15 June 2019 - 04:16 PM.


#160 TheArisen

    Member

  • Ace Of Spades
  • 6,040 posts
  • Location: California

Posted 17 June 2019 - 04:49 PM

Well, every problem will have various nuances and details to be dealt with, but IMO I can't see how PGI at least trying out NB's idea here could hurt the game, and there's a reasonable chance it'd at least make things better even if it's not perfect.

Obviously there's no reason to blindly go for it but just compare NB's idea to what we have now. IMO it's at least worth further/official PGI testing.




