
The Last Match Maker Thread We Need


248 replies to this topic

#121 Wil McCullough

    Member

  • PipPipPipPipPipPipPipPip
  • The 1 Percent
  • The 1 Percent
  • 1,482 posts

Posted 12 June 2019 - 12:59 AM

View PostMrMadguy, on 12 June 2019 - 12:44 AM, said:

You forget about carrying. If you're good but your team is bad, you lose. If you're bad but your team is good, you win. Therefore WLR doesn't fully represent your PERSONAL skill. This isn't just an assumption; it comes from my personal experience. When teams are balanced but have, let's say, 2 good players, 4 mediocre and 4 bad players, it's the 2 good players WHO DEFINE THE RESULT OF THE MATCH. No matter how the 4 bad players play, they will NEVER AFFECT THE RESULT OF THE MATCH. The result? Maybe bad players have WLR = 1, but their MMR rating isn't actually theirs. It's actually the rating of the players who carry them.


I don't think you understand how statistics work.

#122 Sjorpha

    Member

  • PipPipPipPipPipPipPipPipPip
  • Philanthropist
  • Philanthropist
  • 4,478 posts
  • LocationSweden

Posted 12 June 2019 - 01:51 AM

View PostThe6thMessenger, on 11 June 2019 - 07:52 PM, said:

So if WLR is how valuable a player, and the MS is how much he earned, how is skill relevant again? No really, it's not a rhetorical question.

So WLR is where we could assume the potential of contribution, but AVG is the actual result of the contribution.

Can I assume that your MM isn't really about skill, but about potential of contribution? But if that's the case, why couldn't we just get on the actual result of the contribution because it's more indicative of better player?


WLR is average contribution to winning (not potential; you can have a varying max potential with the same average).

Predicted contribution to win chance is the only relevant factor for matchmaking. If you define "skill" as something other than being good at winning more than losing, then whatever that is would indeed be completely irrelevant for matchmaking.

#123 East Indy

    Member

  • PipPipPipPipPipPipPipPip
  • The Hammer
  • The Hammer
  • 1,245 posts
  • LocationPacifica Training School, waiting for BakPhar shares to rise

Posted 12 June 2019 - 03:33 AM

View PostWil McCullough, on 12 June 2019 - 12:59 AM, said:


I don't think you understand how statistics work.

He's stating a little too broadly but has a point. W/L in a 12-man game with cascading effects depends on the matchmaker being somewhat accurate in the first place. Current state of the game can drop an excellent player into badly mismatched teams. Team loses, he does better than both teammates and most opponents, but W/L is suppressed.

#124 FRAGTAST1C

    Member

  • PipPipPipPipPipPipPipPipPip
  • The Fighter
  • The Fighter
  • 2,919 posts
  • LocationIndia

Posted 12 June 2019 - 05:12 AM

Would the KDR be a better statistic then?

#125 Sjorpha

    Member

  • PipPipPipPipPipPipPipPipPip
  • Philanthropist
  • Philanthropist
  • 4,478 posts
  • LocationSweden

Posted 12 June 2019 - 06:05 AM

View PostEast Indy, on 12 June 2019 - 03:33 AM, said:

He's stating a little too broadly but has a point. W/L in a 12-man game with cascading effects depends on the matchmaker being somewhat accurate in the first place. Current state of the game can drop an excellent player into badly mismatched teams. Team loses, he does better than both teammates and most opponents, but W/L is suppressed.

That's not true at all. In fact, a working matchmaker would make everyone's WLR approach 1.0. In a way you can say WLR is more representative of skill under a bad matchmaker than under a good one. W/L is a long-term stat and it isn't "suppressed" by specific mismatched games; over time the mismatching evens out and what's left is your own impact. WLR would actually be more accurate as an indicator of skill if there was no matchmaking at all and teams were completely random.
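To make the "random teams" case concrete, here is a minimal sketch, not anyone's actual simulation from this thread. It assumes each pilot's per-match performance is a single die roll, that the 24-pilot pool of D4/D6/D8 pilots is hypothetical, and that tied matches are simply discarded. The function name `simulate` is made up for illustration.

```python
import random

def simulate(pool, matches, seed=0):
    """Random 12v12 team assignment; returns the win rate per die size."""
    rng = random.Random(seed)
    wins = {sides: 0 for sides in set(pool)}
    games = {sides: 0 for sides in set(pool)}
    players = list(pool)
    for _ in range(matches):
        rng.shuffle(players)                      # no matchmaking at all
        team_a, team_b = players[:12], players[12:]
        score_a = sum(rng.randint(1, s) for s in team_a)
        score_b = sum(rng.randint(1, s) for s in team_b)
        if score_a == score_b:
            continue                              # assumption: ties are discarded
        winners = team_a if score_a > score_b else team_b
        for s in team_a + team_b:
            games[s] += 1
        for s in winners:
            wins[s] += 1
    return {s: wins[s] / games[s] for s in sorted(wins)}

# Hypothetical pool: 8 weak (D4), 8 average (D6), 8 strong (D8) pilots.
pool = [4] * 8 + [6] * 8 + [8] * 8
rates = simulate(pool, matches=5000)
for sides, rate in rates.items():
    print(f"D{sides} pilots: win rate {rate:.3f}")
```

Under these assumptions the long-run win rates should sort by die size even though every individual match is a random mix of teammates, which is the "mismatching evens out" argument in numbers.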

The fact that a working matchmaker based on wlr would push everyone's wlr towards 1.0 (in fact that is the goal of any matchmaker) might become a problem over time though.

View PostFRAGTAST1C, on 12 June 2019 - 05:12 AM, said:

Would the KDR be a better statistic then?


KDR isn't a very useful stat for matchmaking in MWO, as there isn't a strong enough correlation between surviving and winning.

#126 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • The God of Death
  • 7,518 posts

Posted 12 June 2019 - 06:32 AM

View PostEast Indy, on 12 June 2019 - 03:33 AM, said:

He's stating a little too broadly but has a point. W/L in a 12-man game with cascading effects depends on the matchmaker being somewhat accurate in the first place. Current state of the game can drop an excellent player into badly mismatched teams. Team loses, he does better than both teammates and most opponents, but W/L is suppressed.


Nope, no point anywhere... the second simulation shows that more skilled players have higher W/L ratio even when the MM is using WLR to make teams.

#127 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • The God of Death
  • 7,518 posts

Posted 12 June 2019 - 06:53 AM

View PostSjorpha, on 12 June 2019 - 06:05 AM, said:

KDR isn't a very useful stat for matchmaking in MWO, as there isn't a strong enough correlation between surviving and winning.


Actually, it's equally as good as MS. That's not a high level of 'good', but on average higher KDR and higher MS both give higher WLR.

Edited by Nightbird, 12 June 2019 - 06:54 AM.


#128 Sjorpha

    Member

  • PipPipPipPipPipPipPipPipPip
  • Philanthropist
  • Philanthropist
  • 4,478 posts
  • LocationSweden

Posted 12 June 2019 - 08:02 AM

View PostNightbird, on 12 June 2019 - 06:53 AM, said:


Actually, it's equally as good as MS. That's not a high level of 'good', but on average higher KDR and higher MS both give higher WLR.


I would guess that the correlation between MS and winning is stronger than between KDR and winning, but I haven't actually looked at it, so you may be right. Either way, we agree neither of them is a very good basis for matchmaking, so it's kinda whatever.

Edited by Sjorpha, 12 June 2019 - 08:03 AM.


#129 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • The God of Death
  • 7,518 posts

Posted 12 June 2019 - 08:15 AM

View PostKiiyor, on 08 June 2019 - 10:17 PM, said:

I'd be really interested in seeing if you could simulate skill improving and decreasing across your pilots - maybe having your pilots starting out lower skilled, then rising as they get better, and plateauing as they move up tiers and meet more skilled players - as I think that's where a lot of MM complaints come from. Say there's a new-ish player who finds a mech and build that agrees with them, stomps their way from T5 to T3 or T2, and hits an enormous wall as they find players better suited to exploit their weaknesses. Maybe a skill bell curve.


Now's probably a good time to address this, I didn't want to earlier since I was afraid the conversation would get complex.

The ideal simulation would be, for each player, there is a starting skill rating and a max skill rating and a learning speed. The learning speed would be the number of games needed to go from the starting skill rating to the max, linearly is probably fine. The learning speed could be 1000 for fast learners, 5000 for slow ones.

In addition, as the simulation runs, new players would join the player pool and some players would retire from it.
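The per-player model described above could be sketched roughly like this. It is only an illustration: the `Pilot` class, its field names, and the skill values 600 and 1400 are assumptions. The 1000 and 5000 game learning speeds are the ones mentioned in the post, and players joining or retiring are not modelled here.

```python
from dataclasses import dataclass

@dataclass
class Pilot:
    start_skill: float
    max_skill: float
    games_to_max: int      # "learning speed": games needed to reach max_skill
    games_played: int = 0

    def skill(self) -> float:
        """Current skill, rising linearly from start_skill, then flat at max_skill."""
        progress = min(self.games_played / self.games_to_max, 1.0)
        return self.start_skill + (self.max_skill - self.start_skill) * progress

# Hypothetical skill values; only the learning speeds come from the post.
fast = Pilot(start_skill=600, max_skill=1400, games_to_max=1000)
slow = Pilot(start_skill=600, max_skill=1400, games_to_max=5000)

for games in (0, 500, 1000, 5000):
    fast.games_played = slow.games_played = games
    print(games, round(fast.skill()), round(slow.skill()))
```

A matchmaker in such a simulation would only ever see the accumulated stats, never `skill()` itself, which is what makes the question of how quickly it tracks an improving pilot interesting.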

With this new simulation, what would the MM need to be like to maximize match quality? One can play with the MM and get results.

(Don't want to disappoint anyone though, so I'll say now I don't intend to put that much work into this thread. I wasn't joking when I said I charge 25K for this work.)

#130 East Indy

    Member

  • PipPipPipPipPipPipPipPip
  • The Hammer
  • The Hammer
  • 1,245 posts
  • LocationPacifica Training School, waiting for BakPhar shares to rise

Posted 12 June 2019 - 08:41 AM

View PostSjorpha, on 12 June 2019 - 06:05 AM, said:

That's not true at all. In fact, a working matchmaker would make everyone's WLR approach 1.0. In a way you can say WLR is more representative of skill under a bad matchmaker than under a good one. W/L is a long-term stat and it isn't "suppressed" by specific mismatched games; over time the mismatching evens out and what's left is your own impact. WLR would actually be more accurate as an indicator of skill if there was no matchmaking at all and teams were completely random.

The fact that a working matchmaker based on wlr would push everyone's wlr towards 1.0 (in fact that is the goal of any matchmaker) might become a problem over time though.

Maybe overstating things, there. The thing about MWO's W/L is that the curve looks screwy compared to verifiable -- not just statblock -- performance. In other words, I look at guys who I see frequently and know are not good at all as individuals or teammates, then compare to average but competent players, and the W/L difference is either close or in some cases the poor players surpassed the better ones season to season (with high confidence that Group Queue isn't skewing). Maybe the poor players dropped in something other than an LRM Atlas, or the better players stopped working toward wins. Maybe.

I'd want to see a curve plotted for wins to skill; based on observation and referencing, it appears to read well at total extremes but gets weird in between.

#131 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • The God of Death
  • 7,518 posts

Posted 12 June 2019 - 09:11 AM

View PostEast Indy, on 12 June 2019 - 08:41 AM, said:

Maybe overstating things, there. The thing about MWO's W/L is that the curve looks screwy compared to verifiable -- not just statblock -- performance. In other words, I look at guys who I see frequently and know are not good at all as individuals or teammates, then compare to average but competent players, and the W/L difference is either close or in some cases the poor players surpassed the better ones season to season (with high confidence that Group Queue isn't skewing). Maybe the poor players dropped in something other than an LRM Atlas, or the better players stopped working toward wins. Maybe.

I'd want to see a curve plotted for wins to skill; based on observation and referencing, it appears to read well at total extremes but gets weird in between.



On a season to season basis, you'll see a lot of variation due to the number of games played.
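As a rough illustration of how much the number of games matters, the usual binomial standard error can be computed as below. This assumes matches are independent with a fixed win probability, which is a simplification, and the game counts are hypothetical examples rather than real season data.

```python
import math

def win_rate_std_error(p: float, games: int) -> float:
    """Standard error of an observed win rate after `games` independent matches."""
    return math.sqrt(p * (1 - p) / games)

# Hypothetical game counts for a pilot whose true win chance is 50%.
for games in (100, 400, 2500):
    se = win_rate_std_error(0.5, games)
    print(f"{games} games: 50% +/- {se:.1%}")
```

The error shrinks only with the square root of the number of games, so a pilot with a light season can land well away from their long-run win rate by chance alone.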


For people that want to understand roughly how I calculate the win/loss in the simulation:

Suppose the average player is a D6 die. Toss it once for a match and that's his performance. After a lot of throws, his average performance is 3.5, but for each match he can get 1, 2, 3, etc. up to 6 with equal chances for each. Throw 12 D6 dice and you get the performance for a team of average players. The average will be 3.5 * 12 = 42, and the range (smallest possible number to the highest number you can get) is 12 to 72, BUT the numbers in that range no longer have equal chances; the total becomes approximately normally distributed. (Link for the chance distribution of 12 D6: https://anydice.com/program/628)

Put two such average teams against each other, and it is a 12D6 versus 12D6 battle; the larger value wins the match. You can intuitively know the win chance over a large number of matches is 50%, but what happens if you add a high-skilled player, a D20 player?

12D6 versus 11D6+D20. The team with the D20 player is not going to automatically win, but they will win more than 50% of the time. How often is what my simulations calculate, except instead of just 1 player being allowed to be different from a D6, all 24 players are allowed to be.

When the MM is completely random, the skill = 1800 player wins 68% of the time, the skill = 1000 player wins 50% of the time, and the skill = 200 player wins 32% of the time.
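The dice picture can be worked out exactly rather than sampled. The sketch below is only an illustration of the 12D6 versus 11D6+D20 example. It does not reproduce the simulation's skill ratings (how a skill number maps to a die is not given here), the function names are made up, and splitting tied matches evenly is an assumption.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def die(sides):
    """Distribution of one fair die as {face: probability}."""
    return {face: Fraction(1, sides) for face in range(1, sides + 1)}

def add(dist_a, dist_b):
    """Distribution of the sum of two independent rolls."""
    total = Counter()
    for (a, pa), (b, pb) in product(dist_a.items(), dist_b.items()):
        total[a + b] += pa * pb
    return dict(total)

def team(dice):
    """Distribution of a team's total, given one die size per player."""
    dist = {0: Fraction(1)}
    for sides in dice:
        dist = add(dist, die(sides))
    return dist

def win_chance(team_a, team_b):
    """P(team_a beats team_b); a tie counts as half a win (assumption)."""
    chance = Fraction(0)
    for (a, pa), (b, pb) in product(team_a.items(), team_b.items()):
        if a > b:
            chance += pa * pb
        elif a == b:
            chance += pa * pb / 2
    return chance

average = team([6] * 12)           # 12D6
with_ace = team([6] * 11 + [20])   # 11D6 + D20

print("12D6 total ranges from", min(average), "to", max(average))
print("average vs average:", float(win_chance(average, average)))
print("ace team vs average:", float(win_chance(with_ace, average)))
```

Using exact fractions avoids sampling noise, so the mirror match comes out at exactly one half and the single D20 pilot's effect on the team's win chance can be read off directly.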

Edited by Nightbird, 12 June 2019 - 09:16 AM.


#132 Feral Clown

    Member

  • PipPipPipPipPipPipPip
  • 915 posts

Posted 12 June 2019 - 02:19 PM

View PostThe6thMessenger, on 11 June 2019 - 03:19 PM, said:


My concern is that, well Dakka could both farm damage and wins, maybe even the ATMs and the LRMs (with DF). You could also bring an AMS (Anti-Missile-System), to just pad MS anyways. They might be opposite goals, but with certain weapons, they have near same result.

AMS should only have given CBills, not MS. AMS would have been an acceptable measure of skill, if only PGI hadn't made it so easy to pad.


I 100% agree that AMS should in no way inflate match score and was very vocal about it when it was discussed a couple years back.

Also, looking at my mech stats for one particular account, I have a reasonable KDR of 1.50 but, compared to most of my other mechs, a very low WLR of .88 in a Nova build I am fond of running in quick play. In fact, every other mech on that account which has been used in the last four months has a WLR of 1.13 (second lowest) to 2.48.

This anecdotal evidence is good enough for me to conclude for myself that bringing this mech is not helping my team win (either lvom hellbringer, or splat arctic wolfie does) and that its support mech value to the team is situational and negligible. What it is good for is shutting down one atm cancer eagle as I dps him down.

While not definitive or scientific in the least, there is a reason that when my team does CW we bring stuff that kills things well and forgo running these AMS support mechs unless fooling around or doing an event. Overlapping ECM on Hellbringers is of far greater value than Novas or an AMS Kitfox.

#133 Wil McCullough

    Member

  • PipPipPipPipPipPipPipPip
  • The 1 Percent
  • The 1 Percent
  • 1,482 posts

Posted 12 June 2019 - 04:58 PM

View PostEast Indy, on 12 June 2019 - 03:33 AM, said:

He's stating a little too broadly but has a point. W/L in a 12-man game with cascading effects depends on the matchmaker being somewhat accurate in the first place. Current state of the game can drop an excellent player into badly mismatched teams. Team loses, he does better than both teammates and most opponents, but W/L is suppressed.


It doesn't matter because everyone is saddled with the same matchmaker. It's a constant. Obviously there will be short-term spikes and slumps, but in the longer run your WLR always reflects how much you affect a win compared to everyone else.

#134 The6thMessenger

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Nova Captain
  • Nova Captain
  • 8,104 posts
  • LocationFrom a distance in an Urbie with a HAG, delivering righteous fury to heretics.

Posted 12 June 2019 - 05:38 PM

View PostWil McCullough, on 12 June 2019 - 04:58 PM, said:

It doesn't matter because everyone is saddled with the same matchmaker. It's a constant. Obviously there will be short-term spikes and slumps, but in the longer run your WLR always reflects how much you affect a win compared to everyone else.


Yup, it's Systematic Error, it's consistent. It's fine.

That being said, it is better to try and limit the error for more accurate data.

Could we please still get 8v8 on Quick-Play and just limit 12v12 on Faction Play?

#135 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • The God of Death
  • 7,518 posts

Posted 12 June 2019 - 08:48 PM

View PostThe6thMessenger, on 12 June 2019 - 05:38 PM, said:

Yup, it's Systematic Error, it's consistent. It's fine.


Did you just call skill a systemic error? lol

#136 Wil McCullough

    Member

  • PipPipPipPipPipPipPipPip
  • The 1 Percent
  • The 1 Percent
  • 1,482 posts

Posted 12 June 2019 - 10:08 PM

View PostNightbird, on 12 June 2019 - 08:48 PM, said:


Did you just call skill a systemic error? lol


Pretty sure he was talking about the cucked mm that affects everyone

#137 PhoenixFire55

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Legendary Founder
  • Legendary Founder
  • 5,725 posts
  • LocationSt.Petersburg / Outreach

Posted 13 June 2019 - 02:31 AM

View PostFRAGTAST1C, on 11 June 2019 - 06:31 AM, said:

Is there a word to describe skewed thinking, tap dancing and a form of confirmation bias together?

I like how you have nothing to say on the actual subject and start blabbering some nonsense instead.

View PostFRAGTAST1C, on 11 June 2019 - 06:31 AM, said:

Right. I keep forgetting that it isn't PGI who's screwing up the game. It's the vast chunk of this community to blame as well.

In other words you refuse to acknowledge the fact that other people have the right to have fun in a different way compared to what your definition of fun is. Well, good to know.

#138 PhoenixFire55

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Legendary Founder
  • Legendary Founder
  • 5,725 posts
  • LocationSt.Petersburg / Outreach

Posted 13 June 2019 - 02:37 AM

View PostThe6thMessenger, on 11 June 2019 - 05:00 AM, said:

but at least it's a step towards better MM than what we got right now.

It's not, because as pointed out countless times, including by PGI themselves, the development of MWO has stopped. Nobody is gonna bother changing MM, and sure as hell nobody who can actually do that in the first place will read this post.

#139 PhoenixFire55

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Legendary Founder
  • Legendary Founder
  • 5,725 posts
  • LocationSt.Petersburg / Outreach

Posted 13 June 2019 - 02:54 AM

View PostNightbird, on 11 June 2019 - 08:03 AM, said:

You can tweak Elo formulas between seasons and even if it blows up, the damage is contained to a season. For MWO, you cannot say, oh ****, the K value is wrong, pull the plug and start over (which ofc is what PGI realized after trying Elo). Compared to Elo, WLR is robust: there are no parameters to tweak, and it will never blow up. That is why, for a game like MWO where stats are accumulated forever, you want what is robust over what is fragile.

So to change your analogy a little, Elo will always go boom, WLR will never, and that is the difference between the two.

Heh, quite wrong tho. Both Elo and W/L predict an outcome with a certain probability. For different kinds of input parameters, such as player/team distribution, length of season, and the nature of how the outcome of a match is decided, they will both have a certain average prediction error. Now the thing is, you don't know what this error is gonna be for actual MWO when you apply your MM; you haven't estimated that. It might very well be that your W/L MM will have far worse predictions than an Elo one. And thus your "robustness" will become a burden, since unlike Elo you can't change any K-values in it. So it very well might blow up, and your model doesn't address the question of whether it will or won't.
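For reference, the K-value being argued about is the step size in the standard two-player Elo update. The sketch below shows only that textbook form. How it would be applied to 12v12 teams in MWO is not specified in this thread, and the ratings and K values used are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update(rating: float, expected: float, actual: float, k: float = 32) -> float:
    """New rating after one game; `actual` is 1 for a win, 0 for a loss."""
    return rating + k * (actual - expected)

# Hypothetical ratings: a 1600 player loses to a 1400 player.
expected = expected_score(1600, 1400)
for k in (16, 32):
    print(f"K={k}: 1600 -> {update(1600, expected, 0, k=k):.1f}")
```

A larger K makes ratings react faster to upsets but also makes them noisier, which is the tuning knob a pure W/L matchmaker does not have.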

As for not being able to pull the plug ... it's laughable. You can change MM parameters at any point; this doesn't change an already existing player distribution along the score curve. And besides, just read up the forum ... people are more than willing to go through a PSR/Elo reset as well if it helps to actually achieve better quality MM'ing.

You talk about supposed W/L robustness, yet we've already agreed that a quality MM should lead to people having a ~1.0 W/L ratio across the entire playerbase. But how then will your W/L MM produce any kind of quality matches if its sole input parameter is almost exactly the same for every player? It won't. Of course it balances itself out back and forth and W/L won't actually be exactly 1.0 for everyone, but this bouncing around is the exact opposite of being robust, since this isn't a stable state in any way.

#140 PhoenixFire55

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Legendary Founder
  • Legendary Founder
  • 5,725 posts
  • LocationSt.Petersburg / Outreach

Posted 13 June 2019 - 03:04 AM

View PostDer Geisterbaer, on 11 June 2019 - 08:12 AM, said:

~laugh~ Where exactly was my post "rude" in any sense of the word?

No, you're not taking the "high road" there, you're just showing yourself to be the exact same type of person that you accuse others of being. In other words: pot meet kettle.

Every post that doesn't sing praise to his epic work is rude here apparently.




