
Statistical Analysis Of The 12-0


187 replies to this topic

#181 Sjorpha


Posted 23 January 2018 - 01:35 AM

MischiefSC, on 23 January 2018 - 01:08 AM, said:

However, to be fair, even a straight w/l MM would normalize with sufficient sample size. After a few hundred matches anyone with a good w/l ends up in T2 and is therefore playing against T1s, which will normalize them pretty quickly.


The rate of normalisation would differ depending on how long you've played the game. For example, the lifetime w/l of a player with 4000 matches normalises roughly 10 times slower than that of a player with 400 matches, because each new result moves the ratio by about 1/N. That's a problem.

I could possibly see it working if you based the system on the w/l of your last 200 matches or so, so that the rate of normalisation is equal for everyone. It would still start out very chaotic, but it would probably work decently after a while. Maybe. I'm not sure, actually.
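A rough sketch of the difference, in Python. It assumes a lifetime w/l stored as a plain wins/matches ratio and a hypothetical `RollingWinRate` helper for the last-N idea. The window of 200 is just the number floated here, and none of this is meant to describe how PSR actually works.

```python
from collections import deque


def shift_after_streak(prior_matches, prior_rate, streak_wins):
    """How far a lifetime win rate moves after `streak_wins` straight wins."""
    prior_wins = prior_rate * prior_matches
    new_rate = (prior_wins + streak_wins) / (prior_matches + streak_wins)
    return new_rate - prior_rate


class RollingWinRate:
    """Win rate over only the most recent `window` matches (hypothetical helper)."""

    def __init__(self, window=200):
        # deque with maxlen silently drops the oldest result once full
        self.results = deque(maxlen=window)

    def record(self, won):
        self.results.append(1 if won else 0)

    def rate(self):
        if not self.results:
            return 0.0
        return sum(self.results) / len(self.results)


if __name__ == "__main__":
    # Same 20-win streak, very different movement in the lifetime ratio.
    print(shift_after_streak(400, 0.5, 20))
    print(shift_after_streak(4000, 0.5, 20))
```

With these numbers, 20 straight wins move a 400-match player by about 2.4 percentage points and a 4000-match player by about 0.25, while the rolling window reacts the same way for both once it is full.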

Not commenting further on the stuff we seem to agree on, namely that winning is the only relevant prediction and the relative level of opposition is the relevant modifier for ratings. I mean, that was your main point to begin with, I think. I just get a bit hung up on the different uses and misuses of the term "w/l".

Edited by Sjorpha, 23 January 2018 - 01:40 AM.


#182 MischiefSC


Posted 23 January 2018 - 10:28 AM


Over thousands of matches you'd also run into the issue of changes to the matchmaker and the matchmaking system. Yeah, you'd want to go with the last X00 matches. Or you could weight them on a curve toward recent matches. Essentially a confidence factor.
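One way the "weight them on a curve" idea could look, as a minimal Python sketch. The exponential decay and the `half_life` of 100 matches are assumptions picked for illustration, not anything PGI uses; any curve that favors recent matches would fit the description.

```python
def recency_weighted_win_rate(results, half_life=100):
    """Win rate where a match `half_life` games back counts half as much.

    `results` is a list of booleans, oldest match first.
    """
    decay = 0.5 ** (1.0 / half_life)
    weighted_wins = 0.0
    total_weight = 0.0
    weight = 1.0  # the most recent match gets full weight
    for won in reversed(results):
        weighted_wins += weight * (1.0 if won else 0.0)
        total_weight += weight
        weight *= decay
    if total_weight == 0.0:
        return 0.0
    return weighted_wins / total_weight


if __name__ == "__main__":
    # 1000 old losses followed by 100 recent wins.
    history = [False] * 1000 + [True] * 100
    print(sum(history) / len(history))         # plain lifetime rate
    print(recency_weighted_win_rate(history))  # recency-weighted rate
```

In that example the plain lifetime rate is about 9%, while the weighted rate sits just above 50%, because the 100 recent wins carry roughly half of the total weight.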

I'm leaning on the term 'w/l' because it's one people understand, and if people don't get that all that matters about a match is who won (and, accordingly, who they won or lost against), we're not going to get anywhere.

#183 sub2000


Posted 23 January 2018 - 12:08 PM

LOL. It's a 12v12 game. A rating system that doesn't take into account the quality of the players on your side versus the enemy's won't provide good MM, or will work too slowly. Considering the low population, a weighted rating system is a must. Damage score/mech weight, kills, components, assists, scouting, even lance, etc. are all very relevant characteristics.
Most damage actually isn't: there are plenty of 12:0 games where the winning mechs end with 30% health. Shooting to kill is an essential characteristic of skill.

Edited by sub2000, 23 January 2018 - 12:09 PM.


#184 MischiefSC


Posted 23 January 2018 - 12:18 PM

No. You are wrong.

The teammates you get in QP are just like the teammates the other side gets, and often they're the same people. Over enough matches (about 80) they even out, and what's left in your results is your own performance.

It's called the Law of Large Numbers. The simple version is that with a large enough sample, the random variables (teammates, maps, matchups) average out and your observed win rate converges on your real one.

It's why every single matchmaker used by any game or system uses win/loss data as its basis: the other information can help you identify HOW or WHY someone won, but the actual winning part is only accurately represented by who won or lost.

There's a huge range of things that go into winning matches. In the end the HOW and WHY is only relevant to the individual player's development. For the matchmaker all that matters is who wins, which is the best predictor of who will win next time.
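For anyone who wants to put a number on "enough matches", here is a small Python sketch using the binomial standard error. It assumes every match is an independent coin flip with a fixed win probability `p`, which is a simplification (real matches are not independent of tier, mech or time of day), so treat the output as a ballpark only.

```python
import math


def win_rate_standard_error(p, n):
    """Standard error of an observed win rate after n independent matches."""
    return math.sqrt(p * (1.0 - p) / n)


def approx_95_interval(p, n):
    """Rough 95% range for the observed win rate (normal approximation)."""
    half_width = 1.96 * win_rate_standard_error(p, n)
    return (max(0.0, p - half_width), min(1.0, p + half_width))


if __name__ == "__main__":
    for matches in (20, 80, 320, 1280):
        low, high = approx_95_interval(0.5, matches)
        print(matches, round(low, 3), round(high, 3))
```

Under that assumption, a true 50% player's observed win rate after 80 matches still has a standard error of about 5.6 percentage points. The noise shrinks with the square root of the match count, so quadrupling the matches halves it.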

#185 Growlly


Posted 24 January 2018 - 07:22 AM

I disagree with Mischief and favor Tarogato's multivariate approach. Here are some quote snippets:

MischiefSC, on 22 January 2018 - 10:35 PM, said:

@tarogato;
Your w/l is based on the average impact of your performance on w/l.

If you attempt to include anything other than w/l in a matchmaker designed to match teams' ability to win or lose matches, you are designing a flawed matchmaker. If you want to match teams for build types or sniping maps or around some criteria other than their odds of winning vs each other, then great - use other factors.

However if you include any other factors in it then your matchmaker is bad and the data unreliable.

W/L only. Or it's going to be as bad as (potentially worse than) PSR.

MischiefSC, on 23 January 2018 - 12:38 AM, said:

At this point we're just trying to make it clear that win/loss history is all you take into account when building a matchmaker to predict people's odds of winning or losing a match. Adding a k factor and how it's computed can come after we are clear on what a matchmaker is even trying to do.

KDR, damage, match score, kills per match, favorite color and anything else has no place at all in the calculation process for predicting win/loss in this sort of environment. Until we can get even just the smart people (and I count Tarogato high on that list) to understand the how/why of that we're going to end up with stuff like PSR, which may as well split players up by total matches played in the last rolling 365 day cycle.

MischiefSC, on 23 January 2018 - 01:08 AM, said:

Well aware of the difference; the point is that your winning (and losing) is the basis of a matchmaker that doesn't suck. Not even going to say a 'good' matchmaker, just any matchmaker that isn't terrible.

Also that at no point does damage/KMDDs/whatever else come into it as useful for matchmaking.

If we had group/pug queue scores split.


As a counterpoint to the "WLR only" approach, here are examples:
1) Team A is Tier 4, average WLR 1.2. Team B is Tier 1, average WLR 1.2. Team B is inherently advantaged because they have more experience, and they fought tougher opponents to get that 1.2 rating. This is why sports analysts look at strength of opponents in predicting match or fight outcomes.
2) Team A is average WLR 1.2, piloting 12 Vindicators. Team B is average WLR 1.2, piloting 12 Deathstrikes. Team B is at an advantage because they are in superior mechs. This is why there is so much emphasis on tonnage and mech balance in the game.
3) Team A is average WLR 1.2, piloting 12 Deathstrikes that they just bought on sale. They have no skill points, their loadouts aren't optimized (not sure if I need dual NARCs?), and they're used to piloting light mechs. Team B is average WLR 1.2, piloting 12 Deathstrikes that they've used since launch.
4) Team A is average WLR 1.2. They have s o l i t u d e with his 8.5 WLR and 11 potatoes with 0.5 WLR. Team B has 1.2 WLR for each player on the team.
5) Team A is made up entirely of PUGs and Team B has several players sync-dropped as a unit.
6) It's Terra Therma. Team A are laser boats and Team B are dakka.

And so on. I agree that WLR is very important, but it shouldn't be the only predictor.
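Examples 1 and 4 can be made concrete with an Elo-style rating, which is the usual way to fold strength of opposition into a win/loss record. This is a minimal Python sketch, not the MWO matchmaker: the rating numbers, the `k=32` factor and the `team_rating` averaging are all assumptions chosen for illustration.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score (win probability) of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating, expected, actual, k=32):
    """New rating after one result; `actual` is 1 for a win, 0 for a loss."""
    return rating + k * (actual - expected)


def team_rating(player_ratings):
    """Naive team strength: the plain average of its players (an assumption)."""
    return sum(player_ratings) / len(player_ratings)


if __name__ == "__main__":
    # Example 1: a win over tougher opposition is worth more rating.
    print(update_rating(1500, expected_score(1500, 1900), 1))
    print(update_rating(1500, expected_score(1500, 1500), 1))

    # Example 4: one star plus 11 weak players averages the same as 12 equal ones.
    print(team_rating([2200] + [1300] * 11), team_rating([1375] * 12))
```

The first part shows how the same win is rewarded differently depending on who it was against. The second shows the limitation raised in example 4: a plain team average cannot tell a lopsided roster from an even one.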

Edited by Growlly, 24 January 2018 - 07:24 AM.


#186 MischiefSC


Posted 24 January 2018 - 03:33 PM


"This one time in band camp" exceptions are statistically irrelevant. They are as likely to favor you as the other team. They have the exact same potential to happen to anyone and everyone in every match.

That's the point. Everyone is swimming in the same pool. How well you swim plays out over time (at least 80 matches) and washes out the atypical stuff.

If I swim faster than you but I get kicked or grabbed out of the gate, I'll be behind you - for a bit. However, if we're swimming for 1 mile I'm going to pull ahead of you. Or if we swim 80 or 100 short races, you'll get grabbed or kicked early as often as I do, and overall I'll still win more, in proportion to our actual skill disparity.

Again: atypical situations can skew damage, score, kills and anything else. However, all of it filters into winning. Since the matchmaker is trying to balance teams based on equal odds of winning, and not equal damage or score, you only look at who won or lost.
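The swimming analogy can be checked with a short Python sketch. It assumes each race is independent and that the faster swimmer wins any single race with a fixed probability `p`; both are simplifications, and the 0.6 used below is an arbitrary illustrative value.

```python
from math import comb


def prob_wins_majority(p, n):
    """Chance of winning strictly more than half of n independent races."""
    return sum(
        comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    # A modest per-race edge turns into a near-certain lead over many races.
    for races in (1, 11, 81, 101):
        print(races, round(prob_wins_majority(0.6, races), 4))
```

The per-race edge stays the same, but the chance of finishing ahead overall climbs as the number of races grows, which is the point of looking at results over a long run.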

#187 Tahawus


Posted 24 January 2018 - 07:49 PM

Tarogato, I'd be happy to work with you on some of the stats if you continue this type of thing.

#188 MischiefSC


Posted 24 January 2018 - 08:39 PM

I want to reiterate though -

The Jarls List has become, rightly so, the go-to place to get stats. It's an awesome resource, and it makes me very, very happy that these days when people post stats, they come straight out of it. It's got a spot of prime real estate on the top right edge of my Bookmarks toolbar.

The stat mining that Tarogato has done in things like this is also amazing and very, very valuable to the community. Players like Tarogato who put this effort in are an invaluable community resource and should have nice things thrown at them like dolla bills at nerdy but hot strippers.




