
Elo Based On Win/loss (Or Anything Based On Win/loss) Is Silly


167 replies to this topic

#141 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 11 November 2013 - 11:12 PM

View PostVictor Morson, on 11 November 2013 - 10:29 PM, said:


Your math is already off, again, on random functions. Just because you are 8.333% of your team does not mean you influence 8.333% of the outcome. Your base logic is wrong and it's clouding everything you are doing after that. Not even accounting for skill variance, let's look at tonnage - are you saying a Highlander impacts the round as much as a Locust? Are you saying a Frankenmech impacts the round as much as an optimized one?

There are many rounds where you have 0% impact. For example, a disconnected mech can still be on the winning team, as can one that is not pulling its weight. I'd say that since the cap speed got raised, instances of 100% factors are very rare. That means that 8.333% number is, effectively, complete garbage.

Your core numbers are broken before they begin, man. That is why everything you say to prove your point after doesn't work. It's built on a broken foundation.


Math, Victor. Show me the math that backs up what you're saying. What you keep ignoring is that what mech you bring and how you play it is part of your Elo. If you bring an unarmed Locust then you're improving your odds of losing and lowering your Elo - which it should. If you bring a poptart Highlander and you know how to use it well, you improve your odds of winning. If you bring a Highlander and your aim is poor, you'll be lucky to come out neutral in your impact.

All of which is part of the equation.

Once again, show me the math that says your impact on a match cannot be accurately measured in win/loss. Show it to me. I'll steal it, publish it and become incredibly famous as the guy who showed that all the data that Facebook, Google and the NSA are mining is absolutely useless and completely random and unable to statistically model the input of a single person. Same with marketing, same with performance metrics for god knows how many jobs. I would in fact be putting myself out of work, given that I work with statistical modeling that makes tracking win/loss in something like MWO look like child's play, but still. It'd be worth it because I'd be redefining a whole field of mathematical science. I'm not being sarcastic here Victor, I'm totally serious. Show me in statistical data how win/loss cannot be used to adequately measure your impact on the ability of the team to win or lose games and I'll make you famous.

So data, Victor. Hard facts and math. Show me how if I track your performance over 100 games with random players I can not statistically measure your impact on their probability of winning or losing.

At this point I've gone about as far as I can on a forum. It'd be disingenuous to say that you'd need 4 years of schooling to get a degree in applied statistical modeling to have the tools to do it. At the risk of belittling a lot of math professors I'd say you could pick up a working command of it in a couple of weeks in any statistics driven job running reports or maintaining databases; software does all the heavy lifting and google remembers all the important formulas for you these days.

I'll just end with - you're wrong. I totally get why you feel the way you do, I understand the psychology that drives it and even the psychology that makes you want to drill down deeper on it when challenged. I deal with it every single day at my job - reps and agents who say that it's just 'bad luck' that their stats are what they are and they get all the bad calls and their metrics don't accurately reflect because of REASONS. I can show them in broad or granular detail exactly what the problem is and where and why, and point to other people who consistently perform better in the same situation month after month after month. It doesn't matter; to them the issue isn't their perception, it's that the literally thousands of data points that accurately do their job for everyone else don't apply to them because, well, REASONS.

There's no amount of facts or data that can change your mind at this point and that is what it is, but I think for people reading I've hopefully broadened a few horizons on it. It's unfortunate you feel the system is somehow punitive and I can only imagine how frustrating that can be. Best advice I can give you is that in a while, after you've moved away from this debate some, you should keep a tracker for 100 matches. Just mark every time you're in the top 3 and lose a game vs every time you're in the bottom 3 and win a game. You're not being saddled with nubs constantly keeping you down, you're moving the needle a tiny bit every time you play. You're altering the heading on the ship a tiny bit each time and after long enough and enough adjustments your impact is clear as daylight. The data isn't 'muddy'. Data can never be 'muddy'. It's just data and can be filtered and refined. A 1/0 result like win/loss is the perfect way to do so.
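For anyone who wants to try the tracking idea numerically, here is a rough sketch. It simulates solo-queue matches where one tracked player drops with 11 random teammates against 12 random opponents. Everything in it is an assumption made for illustration: the normal skill distribution, the Elo-style logistic curve applied to team averages, and the function names. None of it is MWO's actual matchmaker.

```python
import random

def win_probability(team_avg, enemy_avg):
    """Elo-style logistic expectation applied to team averages (an assumption)."""
    return 1.0 / (1.0 + 10 ** ((enemy_avg - team_avg) / 400.0))

def simulate_win_rate(player_skill, matches, seed=0, team_size=12,
                      mean=1500.0, spread=200.0):
    """Fraction of matches won by one tracked player with random teammates."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(matches):
        # Teammates and opponents are drawn fresh every match (pure pugging).
        mates = [rng.gauss(mean, spread) for _ in range(team_size - 1)]
        enemies = [rng.gauss(mean, spread) for _ in range(team_size)]
        team_avg = (player_skill + sum(mates)) / team_size
        enemy_avg = sum(enemies) / team_size
        if rng.random() < win_probability(team_avg, enemy_avg):
            wins += 1
    return wins / matches

if __name__ == "__main__":
    for skill in (1100, 1500, 1900):
        short_run = simulate_win_rate(skill, matches=100)
        long_run = simulate_win_rate(skill, matches=20000)
        print(skill, short_run, long_run)
```

In this toy model one player 400 points above the average shifts the team's win rate by only a few percentage points, which is about the same size as the sampling noise in a 100-match tally (roughly plus or minus 5 points). The individual signal is real, but it shows up most clearly over longer records.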

#142 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 11 November 2013 - 11:26 PM

View PostGhogiel, on 11 November 2013 - 10:37 PM, said:

Problem is, no one has any idea what my Elo is or how many players are online at off peak times with similar Elo ratings, and no one even knows the maximum threshold from which the game can pick players.


I'm.... I won't say concerned but I will say curious about the player pool size. With weight matching to account for and a max 120 second window you don't have to get that far out of baseline Elo to find yourself with a relatively narrow band of options. What % of players hit 'ready' within the last 120 seconds? Of those, what % are within your approximate Elo band? Of those, what % are in a mech with a weight that conforms roughly to the needed weight balancing for the match?

If matches last 8 minutes on average, then even if players are almost instantly hitting 'ready' again you've just cut your available options by 75%. If your Elo is in, say, the top 30%, you're trying to weight-match with roughly 92% of potential options cut out. Of course most matches are going to involve casting the Elo and weight net wider - you'd need around 300 players in at the same time to even place a match with 0 weight matching and that would just be finding 1 match.

More to the point, each group of 24 players would need a pool of ~300 to pull from, so even with no weight matching requirement you'd need ~7,200 players logged in and playing at the same time to accurately pack every match with players in the middle 70% of skill, give or take. That's a terribly rough estimate that doesn't account for the bell curve of player skill but instead would hope for an even population curve that we don't have.

That's also without weight matching.

Obviously widening the bandwidth a bit to match people within a % range of each other eases that but weight matching tightens it again.

With no telemetry on player populations, complications from adding premades into the matching process, the bell curve on populations within Elo bands, etc., it's slightly on the educated side of a wild guess at numbers but it's a ballpark to start from. Without millions of players, no matter what you do you're never going to get an 'everyone on the same level' sort of match. The point of the MM is to at least get it as close as reasonably possible within 120 seconds and account as closely as reasonably possible within the same time for weight.
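The back-of-envelope pool estimate can be written out as a small calculation. This is only a sketch of the arithmetic: the 120 second window, 8 minute match and 30% band are the figures from the post, while the function names and the assumption of an evenly spread population (which the post itself flags as unrealistic) are mine.

```python
def eligible_fraction(queue_window_s, match_length_s, elo_band_fraction):
    """Share of online players who are in the queue window AND in the Elo band."""
    in_window = queue_window_s / match_length_s
    return in_window * elo_band_fraction

def players_needed(match_size, fraction):
    """Online population needed so the eligible subset can fill one match."""
    return match_size / fraction

if __name__ == "__main__":
    frac = eligible_fraction(120, 8 * 60, 0.30)
    print(f"eligible: {frac:.1%}, cut out: {1 - frac:.1%}")
    print(f"online players for one 24-player match: {players_needed(24, frac):.0f}")
```

With those inputs about 7.5% of the online population is eligible, so one 24-player match needs on the order of 320 people online, which is in the same ballpark as the ~300 in the post. Adding a weight-class filter would multiply the eligible fraction by yet another number below 1.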

#143 FenrisUlf

    Member

  • Fire
  • 53 posts

Posted 11 November 2013 - 11:34 PM

Tonnage scaling up depending on ELO is stupid.

Can't wait for the UI 2.0 tonnage limit so it can be a far more fair game.

In almost every game that has matchmaking, the devs try to do something smart with it but it ends up being the worst thing for the game.

Edited by FenrisUlf, 12 November 2013 - 12:17 AM.


#144 Victor Morson

    Member

  • Elite Founder
  • 6,370 posts
  • LocationAnder's Moon

Posted 12 November 2013 - 02:01 AM

View PostMischiefSC, on 11 November 2013 - 11:26 PM, said:

Math Victor. Show me the math that backs up what you're saying.


You can't stick math on top of a flawed premise. No matter how much you try to calculate a bad idea, it remains a bad idea.

If you want an example, look at Paul's "Ghost Heat Maths." It might be the dumbest thing ever committed to paper in game design history. A mathematically calculated bad idea.

Others HAVE given you hard data, and simply put, the sample data you are using is faulty. It's as simple as that. If you want a more practical example just go look at the fact ELO isn't working at all in the real game.

Long and short: ELO works in a controlled environment. There are many games where you could accurately utilize ELO. It's got its flaws even then (see the previous post on ELO Drift) but in this environment the data is worthless for the end goal of sorting more skilled players against each other, and less experienced players with each other. The goal here is not some study over the course of 2,000+ games. It should be sorting players into roughly the right games after under 50, which you've admitted repeatedly would not be enough to be accurate.

The system is broken for what you want it to do, period!

Edited by Victor Morson, 12 November 2013 - 02:05 AM.


#145 Black Ivan

    Member

  • Survivor
  • 1,698 posts

Posted 12 November 2013 - 10:07 AM

From all my experience with MWO I have to say ELO is worthless.
Games should be balanced on tonnage and possibly Battle Value to make them equal.

#146 Hauser

    Member

  • Legendary Founder
  • 976 posts

Posted 12 November 2013 - 01:36 PM

View PostVictor Morson, on 12 November 2013 - 02:01 AM, said:

You can't stick math on top of a flawed premise. No matter how much you try to calculate a bad idea, it remains a bad idea.


The premise is that even with all the randomness of the available population, the mechs, the experience of other pilots on either side, and the chaotic nature of the game, even with all those factors beyond an individual's control, an individual player's input is still significant enough to influence the outcome of a game.

Now unless you're planning on winning a Nobel prize that is not a flawed premise.

View PostVictor Morson, on 12 November 2013 - 02:01 AM, said:

Others HAVE given you hard data, and simply put, the sample data you are using is faulty. It's as simple as that. If you want a more practical example just go look at the fact ELO isn't working at all in the real game.

(...) but in this environment the data is worthless for the end goal of sorting more skilled players against each other, and less experienced players with each other.


Now you are again confusing Elo, the rating system, with the match maker, the system that puts players together. If you want to talk about the match maker that is okay. Let's make a new thread for that.

#147 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 12 November 2013 - 05:39 PM

View PostVictor Morson, on 12 November 2013 - 02:01 AM, said:


You can't stick math on top of a flawed premise. No matter how much you try to calculate a bad idea, it remains a bad idea.

If you want an example, look at Paul's "Ghost Heat Maths." It might be the dumbest thing ever committed to paper in game design history. A mathematically calculated bad idea.

Others HAVE given you hard data, and simply put, the sample data you are using is faulty. It's as simple as that. If you want a more practical example just go look at the fact ELO isn't working at all in the real game.

Long and short: ELO works in a controlled environment. There are many games where you could accurately utilize ELO. It's got its flaws even then (see the previous post on ELO Drift) but in this environment the data is worthless for the end goal of sorting more skilled players against each other, and less experienced players with each other. The goal here is not some study over the course of 2,000+ games. It should be sorting players into roughly the right games after under 50, which you've admitted repeatedly would not be enough to be accurate.

The system is broken for what you want it to do, period!


Where is the data? Show me the data that anyone has put up showing Elo doesn't work in 12 v 12 pugging. As I said elsewhere it would be worthy of publishing since it would disprove a whole field of mathematical study and the associated fields of statistical sociology. It would be a big deal. Also please show me how to quantify a 'bad idea' statistically. I'd like to know.

The problem with Ghost Heat is that mathematically it DOES work. That's a big complaint - it mathematically does what it's supposed to do, stop boating of specific weapons. The math is absolutely sound. How fun it is and how complicated it is may be another issue. It works though, exactly like it's supposed to. It makes boating potentially viable while mathematically unwise save in specific situations. 3xLLs aren't that bad - you pay a penalty a bit higher than firing an extra small laser but it's enough to bleed off the better damage to heat ratio that LLs give over MLs.

Which brings us back to what I told you from the beginning. The math behind Elo on win/loss is rock solid. Fundamental even. Accurate and perfectly viable. How it's used in the matchmaker, how matchmaking works with the player populations and Elo band populations we have now, and how it feels to play in a system that consistently nudges you towards a 50/50 win/loss rate is another thing. How it balances via high and low Elo on the same team to reach a target number, that's another. Statistically it's absolutely correct and viable and plays out accurately over the long haul, but in the short term is it fun?

At the moment I'm just going to ask you to take this on faith -

These two teams are statistically equal (one player per row, first team on the left, second team on the right) -
1200 - 1600
1800 - 1600
1800 - 1600
1400 - 1600
2000 - 1600
1500 - 1600
1500 - 1600
1600 - 1600

If they play 400 matches, both teams will end up with ~50/50 win/loss rate.
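As a quick check on the list above, here is a minimal sketch that averages both rosters and feeds the averages into the standard Elo expectation formula. Treating a team's rating as the plain mean of its members is an assumption on my part; the thread doesn't say exactly how the matchmaker combines ratings.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Ratings copied from the two columns in the post.
mixed_team = [1200, 1800, 1800, 1400, 2000, 1500, 1500, 1600]
flat_team = [1600] * 8

# Assumption: a team's strength is the simple mean of its players' ratings.
mixed_avg = sum(mixed_team) / len(mixed_team)
flat_avg = sum(flat_team) / len(flat_team)

if __name__ == "__main__":
    print(mixed_avg, flat_avg)                  # 1600.0 1600.0
    print(expected_score(mixed_avg, flat_avg))  # 0.5
```

Both sides average 1600, so the formula predicts an even match, which is the sense in which the two teams are "statistically equal".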

HOWEVER

In the short-term those matches will swing pretty wildly. The short term statistical variation will be a lot bigger than if both teams were right up and down 1600. THAT is the problem right now.

In the long term aggregate it's accurate and statistically appropriate. In the short term however you're going to have losses that will highlight the low performers way under-performing while the higher performers did well, just not good enough to carry the dead weight. In the matches you win the high performers will do a bit better while the low performers just have to be about average, thus making your wins feel less exceptional than your losses feel catastrophic.

Does that make sense? That's why I'm using the 1/0 to define win/loss. An amazing win is no more important than a cheap cap-win for Elo. A 0-12 loss is no different than an 11-12 loss to Elo, they just feel different to the player. That's why win/loss is so critical for having a balanced system; it's actually got a huge amount of flexibility to help 'wash out' extremes and come to a better average. Also remember that you're as likely to be in the straight 1600 team as the wide balance team. Don't go back to thinking that invalidates the system; statistically they are absolutely even and the match is perfectly 'fair' in context of odds of winning/losing. It's the experience, what those wins and losses feel like, that changes.
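To make the 1/0 point concrete, here is a sketch of the standard Elo update. The only thing it reads from the match result is whether you won. The K value of 32 is a placeholder picked for illustration, not a known MWO setting.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expectation for a player (or team) against an opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))

def update_rating(rating, opponent_rating, won, k=32):
    """Standard Elo update; `won` is the only input taken from the match result."""
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent_rating))

if __name__ == "__main__":
    # A 12-0 stomp and a 12-11 squeaker both arrive here as won=True.
    print(update_rating(1600, 1600, won=True))   # 1616.0
    print(update_rating(1600, 1600, won=False))  # 1584.0
```

Kill counts and margins never enter the function, so an extreme match moves the rating exactly as much as a narrow one.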

I've argued repeatedly that a system that pushes harder towards matching 1600s to 1600s is a way better experience than matching 2000s to 1200s to fill the same team. How to get to that starts to get more complicated - do you narrow the k-value, the points you gain/lose in each match? It'll prevent wider swings but take more matches to 'seat' you in the right range. More players and getting those players more skilled to fatten up the higher Elo bands is great but that's outside the scope of the MM to fix. Every 1800, 2000 Elo player who doesn't have enough 1800-2200 Elo players to fill a match with is having to push down into a 1600 Elo match and be balanced by a 1400 Elo player on his team. Again, it's statistically valid but it's a match that can potentially be less fun for everyone. Bluntly, he is capable of carrying more than 8.333% of the team's value, but what if he's having an off day? It's valid for everyone's Elo still but it makes for wider swings of the pendulum as it moves towards a balance point.
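The K-value tradeoff can be sketched with expected-value updates, with no randomness involved. The assumptions are loud ones: the player has a fixed "true" rating, the matchmaker always finds a lobby at the player's current rating (so the system predicts 50/50), and each match moves the rating by its average expected amount. The names and the 50-point tolerance are hypothetical.

```python
def true_win_probability(true_rating, lobby_rating):
    """Real win chance of a player whose actual strength is `true_rating`."""
    return 1.0 / (1.0 + 10 ** ((lobby_rating - true_rating) / 400.0))

def matches_to_seat(true_rating, start_rating, k, tolerance=50.0, limit=100000):
    """Count expected-value Elo updates until the rating is within `tolerance`."""
    rating = start_rating
    matches = 0
    while abs(true_rating - rating) > tolerance and matches < limit:
        # Assumption: the lobby always sits at the player's current rating, so
        # the system predicts 0.5 while the real win chance is higher or lower.
        actual = true_win_probability(true_rating, rating)
        rating += k * (actual - 0.5)
        matches += 1
    return matches

if __name__ == "__main__":
    for k in (8, 16, 32):
        # k / 2 is the rating change from a single result in an even match.
        print(k, k / 2, matches_to_seat(1800, 1500, k))
```

Halving K halves the swing from any single match but roughly doubles how many matches it takes to get seated, which is the tradeoff described above.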

You want it to sort people accurately after 50 matches? Get more people so there's enough people in each Elo band to fill matches with 24 like people with balanced tonnage every 120 seconds.

Wait until CW starts - what if CW matches don't use Elo and pull populations via other criteria? What happens to the lone wolf population dropping in random matches still using Elo?

Elo works. The question is how do you use that tool and the others available to make better match experiences.

#148 Roadbeer

    Member

  • Elite Founder
  • 8,160 posts
  • LocationWazan, Zion Cluster

Posted 12 November 2013 - 05:43 PM

OMG, ladies, you're both pretty.

#149 Victor Morson

    Member

  • Elite Founder
  • 6,370 posts
  • LocationAnder's Moon

Posted 13 November 2013 - 02:29 AM

View PostMischiefSC, on 12 November 2013 - 05:39 PM, said:

Elo works. The question is how do you use that tool and the others available to make better match experiences.


By not using ELO for a game where you have no real control over team victory, for starters!

#150 Tooooonpie

    Member

  • 96 posts

Posted 13 November 2013 - 03:31 AM

Does this game have the required player base to implement tonnage based matches? I mean, I love the idea myself to bits, and think it will create a lot more balanced games for the most part - But two issues rise from this:

1. If someone picks an Atlas, that's an extremely large chunk of your tonnage gone, and if that player dc's or is dreadful, that's going to cause quite a large problem from the get go, moreso than bad players and dc's currently

2. Will this mean a much greater increase in times finding games? I would tend to think so...

#151 Ghogiel

    Member

  • CS 2021 Gold Champ
  • 6,852 posts

Posted 13 November 2013 - 04:37 AM

View PostVictor Morson, on 13 November 2013 - 02:29 AM, said:


By not using ELO for a game where you have no real control over team victory, for starters!

Speak for yourself.

#152 Heffay

    Rum Runner

  • The Referee
  • 6,458 posts
  • LocationPHX

Posted 13 November 2013 - 04:51 AM

View PostTooooonpie, on 13 November 2013 - 03:31 AM, said:

Does this game have the required player base to implement tonnage based matches? I mean, I love the idea myself to bits, and think it will create a lot more balanced games for the most part - But two issues rise from this:

1. If someone picks an Atlas, that's an extremely large chunk of your tonnage gone, and if that player dc's or is dreadful, that's going to cause quite a large problem from the get go, moreso than bad players and dc's currently

2. Will this mean a much greater increase in times finding games? I would tend to think so...


You can partially solve that by eliminating the 2 minute limit to sitting in the queue, providing a timer that shows the current wait time for your current weight, and a button to drop out of the queue if you want. That way you can drop in that Atlas if you choose to wait 5 minutes, or get an instadrop in a Hunchback.

#153 Tooooonpie

    Member

  • 96 posts

Posted 13 November 2013 - 06:00 AM

View PostHeffay, on 13 November 2013 - 04:51 AM, said:


You can partially solve that by eliminating the 2 minute limit to sitting in the queue, providing a timer that shows the current wait time for your current weight, and a button to drop out of the queue if you want. That way you can drop in that Atlas if you choose to wait 5 minutes, or get an instadrop in a Hunchback.

Hmmm, I see - Only issue I can see in general though is when things occur like new hero mechs being released, there are usually a lot more of that weight now wanting to join a game, which may cause longer times

#154 Heffay

    Rum Runner

  • The Referee
  • 6,458 posts
  • LocationPHX

Posted 13 November 2013 - 06:18 AM

View PostTooooonpie, on 13 November 2013 - 06:00 AM, said:

Hmmm, I see - Only issue I can see in general though is when things occur like new hero mechs being released, there are usually a lot more of that weight now wanting to join a game, which may cause longer times


That will sort itself out rapidly too. If they release a hero hunchback, people won't drop in other mediums due to the queue length.

#155 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 13 November 2013 - 10:43 AM

View PostVictor Morson, on 13 November 2013 - 02:29 AM, said:


By not using ELO for a game where you have no real control over team victory, for starters!


It's got to be a little depressing to think that way but at this point... okay. Best of luck to you I guess.

View PostHeffay, on 13 November 2013 - 04:51 AM, said:


You can partially solve that by eliminating the 2 minute limit to sitting in the queue, providing a timer that shows the current wait time for your current weight, and a button to drop out of the queue if you want. That way you can drop in that Atlas if you choose to wait 5 minutes, or get an instadrop in a Hunchback.


You know that's not a bad idea but I'm not sure what impact it would have on queues or balance. It's also going to affect the quality of matches if you're going to have the impatient people dropping in inferior mechs, they're going to be inherently less valuable than a patient person in a mech they prefer and have skills in.

I dunno. I'd give it a shot and see what the numbers do.

#156 Heffay

    Rum Runner

  • The Referee
  • 6,458 posts
  • LocationPHX

Posted 13 November 2013 - 10:54 AM

View PostMischiefSC, on 13 November 2013 - 10:43 AM, said:

You know that's not a bad idea but I'm not sure what impact it would have on queues or balance. It's also going to affect the quality of matches if you're going to have the impatient people dropping in inferior mechs, they're going to be inherently less valuable than a patient person in a mech they prefer and have skills in. I dunno. I'd give it a shot and see what the numbers do.


Since Elo is by weight class, if they drop into a mech weight class that they aren't the best at, they'll still be at the proper Elo for that class. Err... I think I'm saying that properly. They'll still be just as valuable in that mech for the match they are in. That's one of the benefits of Elo. After enough games, you are *always* at the right place. ;)
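A minimal sketch of what "Elo by weight class" could look like as data: one independent rating per class, where only the class you actually dropped in gets updated. The four class names are the usual BattleTech ones; the starting rating, the K value and the function names are placeholders, not anything confirmed about MWO's implementation.

```python
WEIGHT_CLASSES = ("light", "medium", "heavy", "assault")

def new_player(start_rating=1500.0):
    """One independent rating per weight class (starting value is hypothetical)."""
    return {weight_class: start_rating for weight_class in WEIGHT_CLASSES}

def record_result(ratings, weight_class, lobby_rating, won, k=32):
    """Update only the rating for the weight class that was actually played."""
    expected = 1.0 / (1.0 + 10 ** ((lobby_rating - ratings[weight_class]) / 400.0))
    ratings[weight_class] += k * ((1.0 if won else 0.0) - expected)
    return ratings

if __name__ == "__main__":
    pilot = new_player()
    # An assault main takes a loss in a medium; the assault rating is untouched.
    record_result(pilot, "medium", lobby_rating=1500.0, won=False)
    print(pilot)
```

Under this layout, dropping in an unfamiliar class costs rating only in that class, so over enough games each class settles at its own level.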

#157 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 13 November 2013 - 06:20 PM

View PostHeffay, on 13 November 2013 - 10:54 AM, said:


Since Elo is by weight class, if they drop into a mech weight class that they aren't the best at, they'll still be at the proper Elo for that class. Err... I think I'm saying that properly. They'll still be just as valuable in that mech for the match they are in. That's one of the benefits of Elo. After enough games, you are *always* at the right place. :D


No, you're exactly correct. Like I said though you're potentially skewing matchmaking a bit. You're also asking it to calculate two options instead of one. I get the concept but it's really going to depend on the framework they've already got for the matchmaker. If it was fast and easy to implement then I'd be all for it. If not then I think CW is going to accomplish something similar in having CW matches not using Elo.

#158 Hayashi

    Snowflake

  • Bridesmaid
  • 3,395 posts
  • Location輝針城

Posted 14 November 2013 - 05:08 AM

As far as my experience goes ELO does have an effect. On initial implementation the insane streak of wins (and 4+ kills/match) at the start slowed down until a constant 3 wins: 1 loss ratio a few months ago... given I pugged exclusively on some of the chassis I'm the only constant factor in all of the games, so it would be odd that I can be lucky enough to land up on the winning team in such a consistent ratio, if I had nothing to do with the chance of victory. 3:1 could be because the amount of ELO loss from a single loss is greater than the amount of ELO gain from a single victory, since my ELO is high.

Then recently after the hit registration with lasers for high ping players went to hell (and with it, my accuracy and evasion abilities), I've been experiencing a constant stream of defeats with 0-2 kills to my name, which should not have been possible if I had no effect: also, if it was just that I didn't pull my weight as an 'average player' this scenario cannot be explained, as I basically took down tonnage greater than me almost every time, but every team with me still lost. From the point of view of ELO however, if my ELO value is high enough such that the game expects me to kill 3-4 mechs and I only killed 0-2, then I would indeed be pulling less weight than I should be, which would explain a consistent loss trend, as my being there means the average ability of the other team is greater than the average ability of my team minus me - if I don't perform as expected, the result will be that I drag my team down. From both the upward and downward trends, ELO definitely does have a noticeable effect.

It doesn't, however, seem to be a very reactive system. For the initial stabilisation to occur it must have taken close to 200 matches. And my 20-30 loss streak isn't quite enough yet to dock my ELO to a point where I start to matter less in matchmaking.

In cases of the best players like Wispsy, chances are that at that ELO level, for the ELO to remain constant a victory:defeat ratio of 5:1 or 7:1 may be necessary, given that they'll lose that much more ELO value on a loss, relative in proportion to the amount they gain on a win. So while winning nearly nonstop, their ELO value is fairly constant - and you can bet that if he powers down in a cave, his team will be steamrolled essentially every single time.
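The win:loss ratios mentioned here can be tied to a rating gap. Under the standard Elo formula, a rating stops moving when the actual win rate equals the expected score, because a win pays K x (1 - E) and a loss costs K x E. The sketch below inverts that formula. It assumes MWO uses the usual 400-point logistic curve, which is not confirmed anywhere in this thread.

```python
import math

def rating_gap_for_ratio(wins, losses):
    """Rating lead over the lobby at which `wins`:`losses` is the expected result."""
    return 400.0 * math.log10(wins / losses)

def expected_score(gap):
    """Standard Elo expectation for a side that is `gap` points ahead."""
    return 1.0 / (1.0 + 10 ** (-gap / 400.0))

if __name__ == "__main__":
    for wins, losses in ((3, 1), (5, 1), (7, 1)):
        gap = rating_gap_for_ratio(wins, losses)
        print(f"{wins}:{losses} -> about {gap:.0f} points ahead, "
              f"expected score {expected_score(gap):.3f}")
```

By this reading a steady 3:1 record corresponds to sitting roughly 190 points above the average of the lobbies you land in, and 7:1 to roughly 340 points. At that gap a loss costs several times what a win pays, which matches the asymmetry described above.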

The further you deviate from the average ELO value, whatever it is, the more and more noticeable this effect will become. To an average ELO player, you don't see much of an impact at all - if you 1v1 a random person from the other team all the time, or in some other way remove both yourself and another random player from affecting the rest of the game, you'll notice your match win-loss rate to be largely 50%. If you are a high ELO player, and you do the same, you'll notice you lose more matches than you win. If you are a low ELO player, and you do the same, you'll notice you win more matches than you lose.

If you don't remove yourself and random players from the equation and play in the usual team environment, high ELO players will notice a win loss rate that is related to their ELO - the better they become, the more often they win. Low ELO players will notice the opposite - they may have win loss rates significantly lower than 1, because losing a single game docks less ELO than a single victory would give them.

Without the effect of ELO, if players were truly matched randomly, then while better players would still win more often, this rate is no longer a predictable one, but would swing widely from 3:1 to 9:1 depending on the time of day and day of week... and if they were to somehow remove themselves from the game with another random player, they will notice a win loss rate of approximately 50%.

A problem with ELO calculations using WLR is the assumption that the player is a lone factor. Should a team of extremely competent people play together, say 3 pros and 1 narbcake, the narbcake will have a far greater ELO than he should, and the 3 pros will have slightly lower ELO than they should. This may result in weird cases of, say, 1:10 KDR and 4:1 WLR.

Edited by Hayashi, 14 November 2013 - 05:29 AM.


#159 van Uber

    Member

  • Legendary Founder
  • 284 posts
  • LocationStockholm, Sweden

Posted 14 November 2013 - 05:44 AM

View PostVictor Morson, on 09 November 2013 - 06:18 PM, said:

I think everyone here can agree ELO doesn't work at all right now, and again, this is the reason I continue to directly attribute. Win/loss in a team heavy game is simply not a stat worth tracking, outside of team-only modes.


I'm sorry you have that experience, but it is not one that I share so please ease down on the "everyone" speech.

ELO gives me and my son a competitive experience based on our different skill levels in this game. I have a positive W/L ratio in tons of games, yet I end up in matches where I have to pull my weight to succeed. It is a challenge for me to win.

My 8 year old son has by now also a lot of played games. He started out with tons of losses and ended up with a really low ELO-rating. Now his W/L is stabilizing and he is beginning to break even. But, he too has to work hard to win.

Watching him and me taking turns to play is like watching two entirely different games. One has (in comparison) a lot of tactics and trickery, the other is amateurish chaos to be polite (hey he's eight!). But we both have to work for our wins. That's fun. That tells me ELO works pretty decent. The matchmaker can be tuned, sure, but it is on a nice path already and a lot better than other games out there.

#160 Doctor Proctor

    Member

  • Bad Company
  • 343 posts
  • LocationSouth Suburbs of Chicago, IL, USA

Posted 14 November 2013 - 08:31 AM

All of this talk of ELO and matchmaking is relatively pointless when they're two systems that feed each other. You can say that "No, ELO is correct, it's matchmaking that's broken!" all you want, but the fact of the matter is that the matchmaker uses ELO (among other factors) to build the matches, then uses ELO to predict the matches, and then adjusts ELO accordingly based on the results of the matches. They're intimately tied together, which means that the data going into the ELO system is garbage.

Take the match I just played on Forest Colony. The other team had a premade group of lights with capture accelerators. It was two ECM Spiders, an ECM Raven and a Jenner. They went via the caves and had our cap point down by 25% within a minute or two of the match starting. We sounded the alarm that a group of lights were on cap, and more were coming through the caves. We targeted enemies, told people to come back, etc...but that didn't stop half our team wandering over to the enemy's side of the map, which meant the half defending the base got massacred, followed by the cleanup of the other half of our team as they attempted to organize a cap (which would have failed if the 4 lights with cap accelerators started capping us again).

So, in this scenario we have a premade on one team with 8 other random players versus 12 other random players on our side. Now supposedly people in a premade should have above average ELO scores (which, considering they were laughing about pug stomping, it seems that they did) and this should be evened out by placing higher ELO players on the opposing team. Right? Except that the 4 higher ELO players on the opposing team don't have voice coms, aren't all in the same lance and aren't using a coordinated strategy (in this case, it was bring lights, preferably with ECM and cap accelerators). Therefore, according to ELO rankings, we should be equivalent...except we're not. The side with the 4 man will almost always win in that case because together they're better than the average ELO players as well as the high ELO players that aren't coordinating.

And sure, supposedly this will average out...but will it? Assuming both teams were matched up evenly such that say, there was an average 1600 ELO on each side. That would mean that there should be even odds to win and therefore a relatively small change in ELO. However, the teams aren't really balanced since the premade has an additional factor of coordination on their side, which means that they're effectively higher ELO than they should be. Therefore, they'll be consistently matched against worse opponents, and even if those opponents win then they won't get a large boost because the MM and ELO system thought that they were relatively balanced to begin with. Since the MM supposedly places premades first, then this means that unless there is another premade of the same ELO ranking available, we will always have this issue of a coordinated premade group against individual players of similar ELO that won't be able to coordinate effectively...in other words: pugstomping.
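One way to picture the premade argument is to treat coordination as a hidden rating bonus that the system cannot see. In the sketch below the 100-point bonus is an arbitrary illustrative number, not a measurement, and the model (standard Elo expectation on team averages) is an assumption rather than the game's actual code.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def prediction_gap(team_avg, enemy_avg, coordination_bonus):
    """Return (what the rating system predicts, what this toy model says happens)."""
    predicted = expected_score(team_avg, enemy_avg)
    # Hypothetical: voice comms and a shared plan act like extra rating points.
    actual = expected_score(team_avg + coordination_bonus, enemy_avg)
    return predicted, actual

if __name__ == "__main__":
    predicted, actual = prediction_gap(1600, 1600, coordination_bonus=100)
    print(f"system predicts {predicted:.2f}, toy model gives {actual:.2f}")
```

With equal averages the system predicts an even match while the coordinated side actually wins noticeably more often. In a standard Elo update the premade members' ratings would then drift upward over many matches. Whether that drift is fast enough, and whether it carries over sensibly when those players drop solo, is exactly the open question raised here.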




