
Stats Study: Matchmaker Is Unfair

Balance

344 replies to this topic

#41 Too Much Love

    Member

  • PipPipPipPipPipPipPip
  • Ace Of Spades
  • 787 posts

Posted 19 April 2017 - 12:49 PM

Ghogiel, on 19 April 2017 - 12:27 PM, said:

Not really, because stats have nearly nothing whatsoever to do with tier. The MM doesn't know what players' stats are, and in all likelihood one team out of the 2 will almost certainly have some adv in stats*

It is said to be like you are saying, I heard that too. But in fact it's not like that. It seems like we are told only part of the truth: the matchmaker uses tiers. But what does it do besides that? Is it that simple? Are you sure? The differences between the teams are so big that it can't be a "coincidence" (it's a plot!).

Ghogiel, on 19 April 2017 - 12:27 PM, said:

And let's face it, the advantage you are talking about is like a 10-25 point difference in match score between the teams and a ~0.2 difference in W/L, which isn't anything to write home about and is about what everyone should expect.

Not true. I specifically stated that this difference is huge.

drunkblackstar, on 19 April 2017 - 09:06 AM, said:

For example, in match №1 the winners' average K/D was 1.6, the losers' 0.92. W/L: 1.3 and 1.06; MS: 260 and 208 respectively.

In match №2 the winners' average K/D was 1.31, the losers' 0.91. W/L: 1.29 and 1.01; MS: 238 and 198 respectively.


#42 Trev Firestorm

    Member

  • PipPipPipPipPipPipPipPip
  • The Boombox
  • 1,240 posts

Posted 19 April 2017 - 01:01 PM

drunkblackstar, on 19 April 2017 - 12:29 PM, said:

It would be based not on psychological effects, but on science.

False. Seeing stats like that has a demoralizing effect, which often leads not only to even worse than normal gameplay effort but also to intentional suicides to get to the next match.

Edit: Clarification, I am still talking about noobmeter/WoT stats mods, which show the stats/predictions at the start of the match.

Edited by Trev Firestorm, 19 April 2017 - 01:03 PM.


#43 SuomiWarder

    Member

  • PipPipPipPipPipPipPipPip
  • The Raider
  • 1,661 posts
  • LocationSacramento area, California

Posted 19 April 2017 - 01:45 PM

I have forgotten a lot of my statistics from college by now, but I suspect that 12 matches (likely in a row time-wise) is not a significant sample of the many thousands of games played, or even of, say, 3,000 matches if that many happened on the day of the sample collection.

#44 Too Much Love

    Member

  • PipPipPipPipPipPipPip
  • Ace Of Spades
  • 787 posts

Posted 19 April 2017 - 01:51 PM

SuomiWarder, on 19 April 2017 - 01:45 PM, said:

I have forgotten a lot of my statistics from college by now, but I suspect that 12 matches (likely in a row time-wise) is not a significant sample of the many thousands of games played, or even of, say, 3,000 matches if that many happened on the day of the sample collection.

What about reading skills left from college? Seems you forgot them too.

drunkblackstar, on 19 April 2017 - 09:06 AM, said:

I understand that 12 matches is not enough to make a really representative sample and come to firm conclusions. But I believe it shows the trend. I encourage other players who want to spend the time and effort to make their own examination of the matchmaker.

Plus this:

Jman5, on 19 April 2017 - 12:09 PM, said:

Large sample sizes are more necessary when the effect is subtle. Like if a coin flipped heads 51% of the time, you would have to flip it a lot to conclude that with any confidence.

In this case 11 out of 12 games went as predicted, which isn't subtle at all!
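
To put a rough number on "isn't subtle at all", here is a minimal sketch. It assumes each match is an independent 50/50 coin flip when stats carry no information, and that the favorite was picked from the stats before the result was known. The name `binomial_tail` is only for illustration.

```python
from math import comb


def binomial_tail(n, k, p=0.5):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: if stats meant nothing, the "better" team would win half the time.
p_value = binomial_tail(12, 11)
print(f"P(11 or more of 12 as predicted by luck alone) = {p_value:.4f}")
```

Under those assumptions the chance is 13 in 4096, or about 0.3%. That only says the stats predict the winner; it says nothing about why the teams ended up uneven, and 12 matches in a row from one player are probably not fully independent.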


#45 DAYLEET

    Member

  • PipPipPipPipPipPipPipPipPip
  • Bad Company
  • 4,316 posts
  • LocationLinoleum.

Posted 19 April 2017 - 01:55 PM

12 matches is low, like really low. Maybe WR evens out after 100. But with tier mixing, stats are pretty useless for balancing.

The Weather: It's not unfair, it's unpredictable.
I had a **** night last night, one of the worst, but it was to be expected (by me, not by the MM). Besides my no-basics FS9-H, there was no way the MM would know I would end up derping all night. What if there were 4 other guys like me always on the same side last night? Statistically, the MM was doing its job, but from your observation the MM screwed up. It could be that half your team didn't choose that game mode and aren't playing it while the other half is splitting up. Maybe you got all the discos that night. 12 matches is low for any stats.

The Climate Change: It's not just PSR.
The MM doesn't help itself. It does not enforce a standard like 3/3/3/3, so stats can fluctuate wildly. It does not differentiate between your weight classes or chassis, so at best you have an average that disregards predictable high and low peaks. You're only one guy out of twelve randoms. The MM's valve releases tiering, and the valve is per player, not per team. Not all mechs are created equal, but you only have one PSR. I could go on, but why bother. If the MM at least kept the newbies out of the grinder it would be doing a decent job, but I don't feel like it does that.

#46 Pixel Hunter

    Member

  • PipPipPipPipPipPip
  • Knight Errant
  • 388 posts

Posted 19 April 2017 - 02:09 PM

I think from another angle here: just why in blue blazes is the matchmaker putting teams together that look like that?! If it's trying to balance out teams to a 1.0 K/D ratio, then why does one team have a CLEAR advantage in almost every category? Match score is irrelevant IMO, because the actual STATS, the entire POPULATION of data on what they have done for the last 6 months, show that one team on the whole often has a clear-cut advantage when it comes down to the numbers.

#47 Ghogiel

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • CS 2021 Gold Champ
  • 6,852 posts

Posted 19 April 2017 - 02:34 PM

drunkblackstar, on 19 April 2017 - 12:49 PM, said:

It is said to be like you are saying, I heard that too. But in fact it's not like that. It seems like we are told only part of the truth: the matchmaker uses tiers. But what does it do besides that? Is it that simple? Are you sure? The differences between the teams are so big that it can't be a "coincidence" (it's a plot!).

What are you jabbering on about? It uses tiers and matches mech weight classes. As soon as tier matching is within the thresholds and the weight classes are matched, the matchmaker has finished its job; it does nothing else except spit the players into the match vote.

Quote

Not true. I specifically stated that this difference is huge.

Neat! I too can look at your own data and cherry-pick 2 results, just like you did, to show that the difference is statistically insignificant. E.g.:

W/L 1.14, MS 253 vs W/L 1.19, MS 242

or

W/L 1.22, MS 252 vs W/L 1.175, MS 245

Sooo... yeah no. RIP




Also, for example: most matches I drop into. If I could be arsed to look up irrelevant players on the leaderboards (which have nothing to do with how a match is formed), my stats would fubar any semblance of randomly distributed stats between teams. 9 out of 10 teams I would be on would look stacked via stats. In reality, all the MM is usually doing is putting some T1 guy with a 1.5 W/L on the other side, because it has no idea there is a difference between a Proton and a T1 MechWarrior redshirt.

Edited by Ghogiel, 19 April 2017 - 02:35 PM.


#48 Bishop Steiner

    ForumWarrior

  • PipPipPipPipPipPipPipPipPipPipPipPipPipPipPip
  • The Hammer
  • 47,187 posts
  • Locationclimbing Mt Tryhard, one smoldering Meta-Mech corpse at a time

Posted 19 April 2017 - 02:45 PM

You need hundreds to thousands of comparisons, from multiple sources, for this to be conclusive. Sadly, 12 results from a single common source? Not nearly enough.

#49 PhoenixFire55

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Legendary Founder
  • 5,725 posts
  • LocationSt.Petersburg / Outreach

Posted 19 April 2017 - 02:51 PM

Mystere, on 19 April 2017 - 09:28 AM, said:

Sigh! So much time, money, effort, and other very limited resources have already been spent on finding the "perfect" matchmaker, when random selection still seems to be the best and simplest option.


It's not the best option, but it is the simplest, and more effective than the current one. Since T1 is filled with complete potatoes, it is as good as random anyway, but random at least doesn't take ages to find a match ...

Regardless, unlike what people might think, the MM isn't there to provide us with even matches; it is there to create an illusion of them by giving you matches you can't lose and matches you can't win with equal probability. Since all PvP games have a matchmaker, it is a necessity to have one in order to remain a Minimally Viable Product™.

#50 PhoenixFire55

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Legendary Founder
  • 5,725 posts
  • LocationSt.Petersburg / Outreach

Posted 19 April 2017 - 02:59 PM

Savage Wolf, on 19 April 2017 - 10:10 AM, said:

Why is a Win/Loss ratio of 1.00 bad? How else would you measure being matched against people of your own skill level?


The thing is W/L doesn't tell you anything. You can have W/L of ~1.0 while your every match is a nailbiter that ends up 12-11 and 11-12, or you can have W/L of ~1.0 while all your matches end 12-0 and 0-12 with equal probability. That is the entire point of the OP. MM is creating an illusion of balanced matches by placing everyone on "sure-win" and "sure-loss" teams equally often. That has nothing to do with actual skill-based MM tho and as others said creates a horrible gaming experience for pretty much everyone involved.
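
A small sketch of that point, using two made-up match histories (the scores and function names are illustrative, not real data). Both histories have the same W/L, but the average margin of victory is very different.

```python
def win_loss_ratio(results):
    """W/L from a list of (our_kills, their_kills) pairs; assumes at least one loss."""
    wins = sum(1 for ours, theirs in results if ours > theirs)
    losses = sum(1 for ours, theirs in results if ours < theirs)
    return wins / losses


def mean_margin(results):
    """Average absolute kill difference per match."""
    return sum(abs(ours - theirs) for ours, theirs in results) / len(results)


# Hypothetical histories: 20 nailbiters versus 20 stomps.
nailbiters = [(12, 11), (11, 12)] * 10
stomps = [(12, 0), (0, 12)] * 10

for name, history in (("nailbiters", nailbiters), ("stomps", stomps)):
    print(name, win_loss_ratio(history), mean_margin(history))
```

Something like the average margin would separate the two cases where W/L alone cannot.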

#51 MischiefSC

    Member

  • PipPipPipPipPipPipPipPipPipPipPipPip
  • The Benefactor
  • 16,697 posts

Posted 19 April 2017 - 03:11 PM

Bishop Steiner, on 19 April 2017 - 02:45 PM, said:

You need hundreds to thousands of comparisons, from multiple sources, for this to be conclusive. Sadly, 12 results from a single common source? Not nearly enough.


Nah, 40 would work. It wouldn't be something to bet your life on, but it would be a pretty viable sample size. If you played cards with random people for 40 hands you could generally tell what won hands and what didn't, etc.

You don't need thousands of sample matches, not to determine general trends like team balancing. If you wanted accurate data on, say, ballistics + PPC vs laserboat performance you'd want 200 or more, but for what OP is talking about 40 would be a good sample. Even from a single source.
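
One rough way to compare 12 matches against 40: the sketch below asks how often a one-sided binomial test at the 5% level would flag the effect, assuming (hypothetically) that the team with the better stats really wins 75% of the time and that matches are independent. The 75% figure and the function names are made up for illustration.

```python
from math import comb


def tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(n, alpha=0.05):
    """Smallest number of correct predictions that a fair coin reaches with probability <= alpha."""
    for k in range(n + 1):
        if tail(n, k, 0.5) <= alpha:
            return k
    return n + 1  # no count is extreme enough at this sample size


def power(n, true_p, alpha=0.05):
    """Chance that a sample of n matches reaches the critical count when the true rate is true_p."""
    return tail(n, critical_count(n, alpha), true_p)


for n in (12, 40):
    print(n, critical_count(n), round(power(n, 0.75), 2))
```

With these assumptions, 12 matches would catch a 75% effect only about 39% of the time, while 40 matches would catch it far more reliably. A weaker effect needs a larger sample.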

#52 Clydewinder

    Member

  • PipPipPipPipPipPip
  • Ace Of Spades
  • 447 posts

Posted 19 April 2017 - 03:21 PM

the matchmaker's job is to take anyone with 1.01 w/l and drop them on their heads by going max unbaked potato with Bravo and Charlie lance. there used to be a little "upside down stick man" icon in the upper left corner that would let you know ahead of time that it is your turn to be punished but i believe it is gone now.

when i can take a joke build vindicator and score in the top 3 for damage and match score, i know the MM is gone full on spud cannon because of a positive w/l metric.

the tier system is not workable for MM balance because it is an XP bar and only by running Victors and/or Vindicators can you make that bar go back down again

#53 Zookeeper Dan

    Member

  • PipPipPipPipPipPip
  • Bad Company
  • 487 posts
  • LocationBeer City USA

Posted 19 April 2017 - 03:27 PM

I appreciate the work, but this is meaningless.

You're saying the team with the better players wins.

And let's look for a reasonable alternative to "The Matchmaker was created to make unbalanced teams".

What we know is that the matchmaker takes more than skill (tier) into account, like tonnage and time in queue. If it's trying to balance all of these with a limited player base it is impossible to make teams that have even skill. In uneven skilled teams the better team has a higher probability of victory.

To prove that the matchmaker is purposely making uneven teams, take a measure of skill and plot the average of that measure per team. You should see a normal distribution (the classic bell-shaped curve). And there are statistical tests that you can run on the data to check for a normal distribution. If you don't have a normal distribution, it points to something else going on.
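
As one example of such a test, here is a minimal sketch of the Jarque-Bera statistic, chosen only because it needs nothing beyond the standard library. The sample is the eight team match-score averages quoted earlier in this thread, which is far too few values for the asymptotic p-value to be trusted; it only shows the mechanics.

```python
from math import exp


def jarque_bera(values):
    """Return (statistic, approximate p-value) for the Jarque-Bera normality test."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2**1.5
    kurtosis = m4 / m2**2
    statistic = n / 6 * (skewness**2 + (kurtosis - 3) ** 2 / 4)
    # Chi-square survival function with 2 degrees of freedom is exp(-x / 2).
    return statistic, exp(-statistic / 2)


# Team match-score averages quoted in this thread (only 8 values, illustrative).
team_averages = [260, 208, 238, 198, 253, 242, 252, 245]
statistic, p_value = jarque_bera(team_averages)
print(f"JB = {statistic:.3f}, approximate p = {p_value:.3f}")
```

A small p-value would suggest the team averages are not normally distributed. With real data you would want hundreds of team averages before reading anything into the result.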

And yes, I will be marching for science this weekend!

Edited by Zookeeper Dan, 19 April 2017 - 03:28 PM.


#54 Ghogiel

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • CS 2021 Gold Champ
  • 6,852 posts

Posted 19 April 2017 - 03:56 PM

MischiefSC, on 19 April 2017 - 03:11 PM, said:

Even from a single source.

I expect that there would be quite a different outcome depending on the source. Someone with a 5 W/L is going to put about a 20-25% difference on that stat compared to the T1 avg on every game.
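
The arithmetic behind that estimate can be sketched as follows. The baseline W/L values of 1.25 and 1.5 for the other eleven players are assumptions, since the actual T1 average is not given here, and the function names are illustrative.

```python
def team_average_with_outlier(baseline, outlier, team_size=12):
    """Team average when one player has `outlier` and the rest sit at `baseline`."""
    return (baseline * (team_size - 1) + outlier) / team_size


def relative_shift(baseline, outlier, team_size=12):
    """Fractional change in the team average caused by the single outlier."""
    return team_average_with_outlier(baseline, outlier, team_size) / baseline - 1


# Assumed baselines for the other 11 players; the outlier has a 5.0 W/L.
for baseline in (1.25, 1.5):
    print(baseline, f"{relative_shift(baseline, 5.0):.1%}")
```

With eleven teammates at 1.25 the team average rises by 25%, and at 1.5 by about 19%, so the 20-25% range fits a baseline somewhere between those two values.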

#55 Ted Wayz

    Member

  • PipPipPipPipPipPipPipPipPip
  • Legendary Founder
  • 2,913 posts
  • LocationTea with Romano

Posted 19 April 2017 - 04:11 PM

Interesting but flawed.

Too small a sample size first of all.

Second, you showed statistics but not how many matches the players took to achieve them. If Team A has averaged 10,000 matches apiece with a .9 W/L, I would take them over Team B with an average of 5 matches played and a W/L of 1.5.
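
To illustrate why the match count matters, here is a sketch using a Wilson score interval for a single player's win rate. It assumes no draws, so a .9 W/L is treated as roughly 4,737 wins in 10,000 matches and a 1.5 W/L as 3 wins in 5. Those conversions and the function name are assumptions for the example.

```python
from math import sqrt


def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for the true win rate."""
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half


veteran = wilson_interval(4737, 10_000)  # about a .9 W/L over 10,000 matches
newcomer = wilson_interval(3, 5)         # a 1.5 W/L over 5 matches

print("veteran :", tuple(round(x, 3) for x in veteran))
print("newcomer:", tuple(round(x, 3) for x in newcomer))
```

The veteran's interval is about two percentage points wide, while the newcomer's spans well over half the possible range and comfortably includes the veteran's win rate. Five matches say almost nothing about who is better.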

What tier were all the players?

If MM were always picking a team of winners versus losers then why isn't the gap between W/L of the two teams in the examples you showed bigger?

How do the results shown from your sample compare to the overall population?

Could go on and on but the study doesn't do anything for me.

#56 Mister Blastman

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Survivor
  • 8,444 posts
  • LocationIn my Mech (Atlanta, GA)

Posted 19 April 2017 - 04:20 PM

Posted Image

#57 LordNothing

    Member

  • PipPipPipPipPipPipPipPipPipPipPipPip
  • Ace Of Spades
  • 17,135 posts

Posted 19 April 2017 - 04:25 PM

there simply are not enough players for the mm to function properly, so most of the time it gives up and just matches whatever players are in queue.

#58 BigFatGator

    Member

  • PipPipPipPipPipPip
  • Shredder
  • 265 posts

Posted 19 April 2017 - 06:07 PM

Thanks blackstar. Looking through the games and screenshots, it looks like if we had a matchmaker that evened out the KDR between two selected teams and ignored tier we'd get closer matches.

Seriously look at the blowouts and the closer games and the KDR ranges on each team.
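
A purely hypothetical sketch of what "evening out the KDR" could look like: a greedy split that hands each player, strongest first, to whichever team currently has the lower KDR total. This is not how the actual matchmaker works; it ignores tier, weight class, groups and queue time, and the function name and sample values are invented.

```python
def balance_by_kdr(kdrs, team_size=12):
    """Greedily split 2 * team_size KDR values into two teams with similar totals."""
    if len(kdrs) != 2 * team_size:
        raise ValueError("expected exactly two full teams of players")
    team_a, team_b = [], []
    for kdr in sorted(kdrs, reverse=True):
        a_full = len(team_a) >= team_size
        b_full = len(team_b) >= team_size
        # Give the player to the weaker team unless that team is already full.
        if b_full or (not a_full and sum(team_a) <= sum(team_b)):
            team_a.append(kdr)
        else:
            team_b.append(kdr)
    return team_a, team_b


# Invented pool of 24 players with KDRs from 0.5 to 2.8.
pool = [0.5 + 0.1 * i for i in range(24)]
team_a, team_b = balance_by_kdr(pool)
print(round(sum(team_a) / 12, 2), round(sum(team_b) / 12, 2))
```

Even a simple rule like this brings the two team averages close together for the invented pool. Whether closer averages would actually produce closer matches is a separate question the data here cannot answer.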

Yeah, 12 is a small sample size. On the other hand, if I buy a dozen eggs and the first three I crack are rotten, I think I'm tossing the dozen out. If someone wants to cry 'statistics' they can eat the rest of the eggs. (More accurately, general conclusions on huge differences don't need 100 replicates.)

That last bit will make more sense tomorrow

#59 Snazzy Dragon

    Member

  • PipPipPipPipPipPipPipPipPip
  • The Defiant
  • 2,912 posts
  • LocationRUNNING FAST AND TURNING LEFT

Posted 19 April 2017 - 06:13 PM

The ultimate problem is that Terribad Terry and his LRM 90 cyclops have made it to tier 1 along with me and my average and by all means "tier 2 at best" performance, and players like Bows3r and Proton, are all treated the same by matchmaker because of a ****ing stupid upwards biased pilot "skill" rating system.

We are not the same players, PGI.

#60 EgoSlayer

    Member

  • PipPipPipPipPipPipPipPip
  • Wrath
  • 1,909 posts
  • Location[REDACTED]

Posted 19 April 2017 - 06:40 PM

While only examining the 12-0 matches, this study is more thorough:

https://mwomercs.com...is-of-the-12-0/




