Why Elo Doesn't Work Here


633 replies to this topic

#121 Roadkill

    Member

  • 3,610 posts

Posted 23 January 2014 - 09:04 AM

RichAC, on 23 January 2014 - 09:00 AM, said:

The matchmaker can't create people out of thin air like God for you to play with. Not every game is going to be perfect. Some games result in a roll regardless of how close both teams are in Elo, simply because of better team chemistry. PGI commented on this in their update for the latest patch.

They had to widen the Elo gap as it is in the latest patch, because people were having long search times. IMO it's because of all the sync droppers and people quitting the game. What is PGI supposed to do about that? They can't change the mindset of society; it's just a sign of the times. It's the same story in every PC gaming community.

LoL is the one exception, because I guess Koreans are not as big on cheating or being whiny sore losers, so they attract millions.

Not sure what your point is. Nothing you've said changes the fact that if there's a problem with the way games are being set up, it's the matchmaker causing the problem and not Elo.

I'm not commenting on whether or not there's a problem. I'm just trying to get people to stop blaming a proven system like Elo for something that it isn't responsible for.

#122 Doctor Proctor

    Member

  • Bad Company
  • 343 posts
  • Location: South Suburbs of Chicago, IL, USA

Posted 23 January 2014 - 09:06 AM

These Elo threads never seem to go anywhere since they just devolve into "Elo doesn't work for balancing a team game" vs. "Elo works just fine for team games and, given enough matches, will incorporate your personal skill into your rating".

Instead of rehashing that, let me just ask a question: If Elo is basically a perfect system for statistically determining your skill, then why are there competing systems? Why do TrueSkill or Glicko even exist, when they could just use Elo, which is supposedly perfect? The very fact that these systems exist in the first place means that there are weaknesses in the Elo system that other systems handle better.

So the questions that we should be asking are: What are the inherent weaknesses of the Elo system that other systems such as TrueSkill or Glicko seek to correct? Do those weaknesses apply to MWO? Would another system that addresses those weaknesses, if they exist in MWO, be better at determining player skill?

Edited by Doctor Proctor, 23 January 2014 - 09:09 AM.


#123 RichAC

    Member

  • 661 posts

Posted 23 January 2014 - 09:07 AM

Roadkill, on 23 January 2014 - 09:01 AM, said:

Or you could try reading what you're responding to so that your response is actually relevant.

My post was in response to someone complaining about Elo setting up an unbalanced game. Elo has nothing to do with that. Elo just gives the players a skill rating. It's a proven system - if PGI implemented it properly, it works. So if you're in an unbalanced game, blame the matchmaker, not Elo.

What you say is also true, but it isn't related to what I said.

But to my point, if you suck and the matchmaker is working properly, you should be paired up against other people who suck. Even bad players can have fun games if the matchmaker sets them up correctly.


I noticed you chose to respond to Sandpit, instead of my response to you in the post right above.

1. Elo is a proven system for 1v1. But I'm not sure they don't take match score into account for "skill ratings". It would be silly if they didn't.

2. People are more concerned with their damage than with winning in this community. So they are going to complain about nonsense regardless.

3. The matchmaker can't create people out of thin air like God for you to play with. Not every game is going to be perfect. Some games result in a roll regardless of how close both teams are in Elo, simply because of better team chemistry. PGI commented on this in their update for the latest patch, which I guess you didn't even read. They already investigated your accusations, they have been emailed many screenshots, and apparently you're just delusional.

http://mwomercs.com/...81#entry3089681

Post your W/L stats before you start complaining. Or do you get mad when you win because you didn't do enough damage?

They had to widen the Elo gap as it is in the latest patch, because people were having long search times. IMO it's because of all the sync droppers and people quitting the game. What is PGI supposed to do about that? They can't change the mindset of society; it's just a sign of the times. It's the same story in every PC gaming community.

LoL is the one exception, because I guess Koreans are not as big on cheating or being whiny sore losers, so they attract millions.

Edited by RichAC, 23 January 2014 - 09:09 AM.


#124 Roadkill

    Member

  • 3,610 posts

Posted 23 January 2014 - 09:10 AM

Sandpit, on 22 January 2014 - 05:00 PM, said:

Is there a problem with the MM? Sure, there are two that I see.
1.) Tonnage discrepancies can be a royal pain sometimes, but I don't see them often to be honest, and the fix for that is coming in tonnage limits.

2.) New players getting dropped in with vets. New players aren't going to do as well as vets in customized mechs regardless of Elo. Elo evens the field as far as player skill, not equipment. If they would simply put all new players into their own queue while they're earning their cadet bonus, and allow vetted and approved veterans to drop with them as "drill instructors" to help them out and offer advice, tips, information, etc., it would curb the noob stomp and help them learn the game faster.

If you're not a noob, you have customized mechs, and have been playing for a while? Chances are the stomp happened because you and your team simply did not play very well, and I have no sympathy for you and don't see why PGI, MM, etc. should be blamed or changed just because you're not quite as good as you thought you were.

This.

The matchmaker does screw up for these reasons. But other than the tonnage problem, which is allegedly being fixed Soon™, it normally does a pretty good job.

Then the players take over. And it really doesn't take much to turn what should have been a pretty even game into a ROFLstomp.

My favorite mistake (favorite because I seem to do it often myself) is seeing 1-2 Mechs running off on their own. So what do I do? Dummy me goes off to try to help them. Stupid. Stupid stupid stupid. Let the {Surat} die and stick with the main force... it's the only way you're going to have any chance at all.

#125 Amsro

    Member

  • Overlord
  • 3,436 posts
  • Location: Charging my Gauss Rifle

Posted 23 January 2014 - 09:13 AM

Roadkill, on 23 January 2014 - 08:50 AM, said:

Again, your problem is NOT with Elo. It's with the matchmaker. Your own statement demonstrates this.

The matchmaker is choosing to drop people of differing Elo ratings into the same match. Elo isn't the problem - it just provides the ratings, and it's a proven system. PGI's matchmaker is the problem.


And in your post is the most obvious flaw with the current Elo: everyone has an Elo that has been derived from a matchmaker that can't figure out what to do, resulting in Elo scores that don't make sense. Elo was not being calculated properly in the beginning, and the "fix" for that mistake simply assumed you would have gotten the same result regardless of the terrible matchup.

All the current data is skewed by poor matchmaking, resulting in obsolete or irrelevant Elo ratings.

#126 Anton Shiningstar

    Member

  • 76 posts
  • Location: New Avalon

Posted 23 January 2014 - 09:14 AM

Three more Atlas stats, this time from my pure PUG alt:
Atlas-D: W/L 1.33 (8/6), K/D 1.56 (14/9)
Atlas-D-DC: W/L 1.27 (14/11), K/D 0.88 (15/17)
Atlas-RS: W/L 1.86 (13/7), K/D 1.63 (13/8)

This D-DC got carried less than my other one; same build, BTW.

Side by side with my primary Atlases:
(F)Atlas: W/L 1.18 (100/85), K/D 0.68 (86/127)
D-DC: W/L 1.10 (128/116), K/D 1.11 (165/148)

Edited by Anton Shiningstar, 23 January 2014 - 09:18 AM.


#127 Roadkill

    Member

  • 3,610 posts

Posted 23 January 2014 - 09:20 AM

Sandpit, on 22 January 2014 - 07:02 PM, said:

What I'm getting at is that it's getting a bit tiring to see everyone running around blaming MM, premades, balance, teammates, etc. because they lost.

That's a very valid point, but it's not what's being discussed in this thread. That's why you're having a hard time. :D

The reason you (generically, not you Sandpit) lost is probably because you played badly compared to how the other team played. But in those cases where it's actually because the teams were unbalanced, the fault for that lies not with Elo but with the matchmaker.

#128 Joseph Mallan

    ForumWarrior

  • FP Veteran - Beta 1
  • 35,216 posts
  • Location: Mallanhold, Furillo

Posted 23 January 2014 - 09:25 AM

Roadkill, on 23 January 2014 - 09:20 AM, said:

That's a very valid point, but it's not what's being discussed in this thread. That's why you're having a hard time. :D

The reason you (generically, not you Sandpit) lost is probably because you played badly compared to how the other team played. But in those cases where it's actually because the teams were unbalanced, the fault for that lies not with Elo but with the matchmaker.

But the matchmaker is supposed to be unbiased and uncaring: 12 Mechs vs. 12 Mechs. And without knowing the Elo scores for the teams, how do we know whether the scores were even but the build matchup sucked? Team A had LRMs, Team B had ECM: Team A gets rolled! Team A is 12 PUGs, Team B has a 4-man: Team B should win thanks to better communication, in a perfect world. Elo just cannot adjust for those eventualities.

#129 o0Marduk0o

    Member

  • Bad Company
  • 4,231 posts
  • Location: Berlin, Germany

Posted 23 January 2014 - 09:27 AM

Artgathan, on 23 January 2014 - 05:53 AM, said:


This would actually make each individual's Elo impossible to determine - but it would allow you to accurately determine the Elo for the entire team.

I'm going to take a moment to explain how a W/L ranking system works.

Imagine you have a pool of 20 players, which contains a normal distribution of "skill". They will randomly be formed into teams of 10 and played against each other. Every time a team wins, every player on the team gets +1 to their personal score, every time a team loses, every player on the team gets -1 to their personal score. After every match the teams will be re-formed. They will play 100 matches against each other.

At the end of these 100 matches, we look at everyone's individual scores. What we'll see is that the "good" players will have high positive scores (since their teams tend to win when they are present), the average players will have scores centered around 0 (since they can't reliably make their team win or lose), and the "bad" players will have high negative scores (since their teams tend to lose when they are present).

Most of the players will be "average", with a smaller proportion of the group comprised of "good" and "bad" players.

This is why a W/L ratio can be used to determine "skill".


And this only works within a very small player pool where everyone has played the same number of games. The more players you add to the pool, the more games you have to play to see any effect. With players changing all the time, it's hardly reliable.
Additionally, the more players you have on each team, the less influence an individual player has on the match result.
With bad luck you join the "bad team" most of the time, and one player can't beat all the others alone when half or more of his team are dead within the first few minutes.
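
For anyone who wants to poke at this instead of arguing about it, the thought experiment quoted above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not anything PGI has described: the function name `simulate_win_loss_scores`, the normally distributed hidden `skill`, the "team strength is the sum of skills" rule, and the logistic win probability are all made up for illustration.

```python
import math
import random


def simulate_win_loss_scores(num_players=20, team_size=10, matches=100, seed=0):
    """Run the thought experiment: random teams, +1 per win, -1 per loss."""
    rng = random.Random(seed)
    # Hidden "true skill" per player, drawn from a normal distribution.
    skill = [rng.gauss(0.0, 1.0) for _ in range(num_players)]
    score = [0] * num_players

    for _ in range(matches):
        # Re-form the two teams at random before every match.
        order = list(range(num_players))
        rng.shuffle(order)
        team_a = order[:team_size]
        team_b = order[team_size:2 * team_size]

        # Assumption: team strength is the sum of member skills, and the
        # stronger team wins with a logistic probability, not with certainty.
        diff = sum(skill[p] for p in team_a) - sum(skill[p] for p in team_b)
        p_a_wins = 1.0 / (1.0 + math.exp(-diff))
        a_wins = rng.random() < p_a_wins

        for p in team_a:
            score[p] += 1 if a_wins else -1
        for p in team_b:
            score[p] += -1 if a_wins else 1

    return skill, score


if __name__ == "__main__":
    skill, score = simulate_win_loss_scores()
    # Print players from most to least skilled with their final W/L score.
    for p in sorted(range(len(skill)), key=lambda i: -skill[i]):
        print(f"player {p:2d}  skill {skill[p]:+.2f}  score {score[p]:+d}")
```

With the defaults, all 20 players are in every match. To explore the objection about larger pools, raise `num_players` while keeping `team_size` fixed (the extra players then sit out each match) or lower `matches`, and see how well the score ordering still tracks the hidden skill ordering.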

#130 Khobai

    Member

  • Elite Founder
  • 23,969 posts

Posted 23 January 2014 - 09:30 AM

Quote

Team A had LRMs, Team B had ECM: Team A gets rolled! Team A is 12 PUGs, Team B has a 4-man: Team B should win thanks to better communication, in a perfect world. Elo just cannot adjust for those eventualities.


Which is why ECM shouldn't hard counter LRMs to begin with. That was never balanced, and how it's still in the game is beyond me. But it definitely screws with Elo big time when one team has a lance of ECM and the other team has a lance of LRM boats.

#131 Joseph Mallan

    ForumWarrior

  • FP Veteran - Beta 1
  • 35,216 posts
  • Location: Mallanhold, Furillo

Posted 23 January 2014 - 09:39 AM

Khobai, on 23 January 2014 - 09:30 AM, said:

Which is why ECM shouldn't hard counter LRMs to begin with. That was never balanced, and how it's still in the game is beyond me. But it definitely screws with Elo big time when one team has a lance of ECM and the other team has a lance of LRM boats.

Will not even try to argue this truth, but it is how this game runs in spite of us agreeing!

#132 MustrumRidcully

    Member

  • Legendary Founder
  • 10,644 posts

Posted 23 January 2014 - 09:43 AM

Doctor Proctor, on 23 January 2014 - 09:06 AM, said:

These Elo threads never seem to go anywhere since they just devolve into "Elo doesn't work for balancing a team game" vs. "Elo works just fine for team games and, given enough matches, will incorporate your personal skill into your rating".

Instead of rehashing that, let me just ask a question: If Elo is basically a perfect system for statistically determining your skill, then why are there competing systems? Why do TrueSkill or Glicko even exist, when they could just use Elo, which is supposedly perfect? The very fact that these systems exist in the first place means that there are weaknesses in the Elo system that other systems handle better.

So the questions that we should be asking are: What are the inherent weaknesses of the Elo system that other systems such as TrueSkill or Glicko seek to correct? Do those weaknesses apply to MWO? Would another system that addresses those weaknesses, if they exist in MWO, be better at determining player skill?


But then, who on these boards knows enough about this type of statistics to really answer that?
I certainly don't, and given the number of people who have written exhaustive math balance analysis threads like mine, there can't be many more in the community. :D

#133 Joseph Mallan

    ForumWarrior

  • FP Veteran - Beta 1
  • 35,216 posts
  • Location: Mallanhold, Furillo

Posted 23 January 2014 - 09:47 AM

I ran SPC for years, I can see trends without looking at the math. It would back me up and tune what I see a bit...

#134 Roadkill

    Member

  • 3,610 posts

Posted 23 January 2014 - 09:57 AM

RichAC, on 23 January 2014 - 09:07 AM, said:

I noticed you chose to respond to Sandpit, instead of my response to you in the post right above.

1. Elo is a proven system for 1v1. But I'm not sure they don't take match score into account for "skill ratings". It would be silly if they didn't.

2. People are more concerned with their damage than with winning in this community. So they are going to complain about nonsense regardless.

3. The matchmaker can't create people out of thin air like God for you to play with. Not every game is going to be perfect. Some games result in a roll regardless of how close both teams are in Elo, simply because of better team chemistry. PGI commented on this in their update for the latest patch, which I guess you didn't even read. They already investigated your accusations, they have been emailed many screenshots, and apparently you're just delusional.

http://mwomercs.com/...81#entry3089681

Post your W/L stats before you start complaining. Or do you get mad when you win because you didn't do enough damage?

They had to widen the Elo gap as it is in the latest patch, because people were having long search times. IMO it's because of all the sync droppers and people quitting the game. What is PGI supposed to do about that? They can't change the mindset of society; it's just a sign of the times. It's the same story in every PC gaming community.

LoL is the one exception, because I guess Koreans are not as big on cheating or being whiny sore losers, so they attract millions.

I still have no idea why you are responding to me. Nothing you're saying has anything to do with what I've been saying. I think you're confusing me with someone else earlier in the thread.

1. Elo is a proven system for head-to-head, whether that is 1v1 or 12v12. It doesn't use match score at all, it only uses win/loss. The matchmaker might somehow incorporate match score, but if it does PGI hasn't admitted it.

2. Um... sure? How is that relevant? Not sure why you think I care.

3. Again... sure? Not sure why you think I care?

My win/loss stats? How are my win/loss stats relevant in any way?

Seriously, you have me completely confused. I have no idea why you are responding to me because nothing you're saying has anything to do with the point I've been making. Which is thus:

If you find yourself in an unbalanced match, the fault for that lies with the matchmaker and not with Elo. (Assuming PGI implemented Elo correctly.)

Notice how I'm not saying that matches are necessarily unbalanced?

#135 Doctor Proctor

    Member

  • Bad Company
  • 343 posts
  • Location: South Suburbs of Chicago, IL, USA

Posted 23 January 2014 - 09:58 AM

MustrumRidcully, on 23 January 2014 - 09:43 AM, said:

But then, who on these boards knows enough about this type of statistics to really answer that?
I certainly don't, and given the number of people who have written exhaustive math balance analysis threads like mine, there can't be many more in the community. :D


Well, you could start by looking at the work that other people have done before. For example, the TrueSkill system was created because it was felt that a weakness of Elo was that it didn't do a good job of rating individual performance in a game with randomized teams, which is what you predominantly see on Xbox Live. As I mentioned before, if Elo were a perfect system that could handle those sorts of problems, a competing system wouldn't have been necessary.

Now, does MWO have a similar issue? Is our matchmaking system, which selects random players from a pool of potential players of varying skill to assemble two balanced teams, similar to, say, Halo's? You could easily make an argument that it is. So, by extension, since we have a similar matchmaking system in many ways, perhaps something like TrueSkill would be a better fit for us than Elo.

Think of it like the meta. Very few people have actually done the math on the effectiveness of various weapons systems: their accuracy, damage per shot, damage per hit, damage per heat, etc. Yet you merely have to look at what the top-tier players (those who often did do the math and the testing, even if informally) are using to determine some of the more effective weapon combinations in the game. Similarly, we can look around to find games like MWO and then examine how they rank players to see what might be a better fit, without needing an MIT statistician to do the math for us.

#136 Roadkill

    Member

  • 3,610 posts

Posted 23 January 2014 - 10:02 AM

Amsro, on 23 January 2014 - 09:13 AM, said:

And in your post is the most obvious flaw with the current Elo: everyone has an Elo that has been derived from a matchmaker that can't figure out what to do, resulting in Elo scores that don't make sense. Elo was not being calculated properly in the beginning, and the "fix" for that mistake simply assumed you would have gotten the same result regardless of the terrible matchup.

All the current data is skewed by poor matchmaking, resulting in obsolete or irrelevant Elo ratings.

Incorrect.

Elo is self correcting. Even if you start out with poor assumptions and then have poor matchmaking, your Elo rating will converge correctly.

Elo doesn't care if matches are fair. It is designed to take unfair matches into account. So poor matchmaking due to mismatched Elo rankings does not affect the validity of your Elo ranking. Even poor matchmaking due to tonnage imbalances or using VOIP does not affect the ultimate validity of your Elo ranking; it just takes longer for your Elo ranking to reach your true skill level.
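
The self-correction claim is easy to try out on a toy case. The sketch below is a simplified 1v1 model and makes several assumptions that are not from this thread: the player has a hidden "true" rating of 1800, starts badly mis-rated at 1200, always faces a fixed 1500-rated opponent, and real outcomes follow the same logistic curve that the standard Elo formula uses. The names `expected_score` and `simulate_convergence` and the K-factor of 16 are illustrative choices, not PGI's.

```python
import random


def expected_score(rating, opponent_rating):
    """Standard Elo expected score (win probability) on the 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def simulate_convergence(true_rating=1800.0, start_rating=1200.0,
                         opponent_rating=1500.0, k=16.0, games=2000, seed=1):
    """Return the rating history of a player whose initial rating is wrong."""
    rng = random.Random(seed)
    # Assumption: real outcomes follow the same logistic curve Elo uses,
    # evaluated at the player's hidden true rating.
    p_win = expected_score(true_rating, opponent_rating)
    rating = start_rating
    history = [rating]
    for _ in range(games):
        won = 1.0 if rng.random() < p_win else 0.0
        # Only the win/loss result enters the update.
        rating += k * (won - expected_score(rating, opponent_rating))
        history.append(rating)
    return history


if __name__ == "__main__":
    history = simulate_convergence()
    tail = history[-500:]
    print(f"start {history[0]:.0f}, "
          f"average of last 500 games {sum(tail) / len(tail):.0f}")
```

In this toy setup the rating climbs away from the bad starting value and then wobbles around the true rating rather than settling exactly on it. Whether a 12v12 game with random teammates converges fast enough to be useful is a separate question that this sketch does not answer.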

#137 Roadkill

    Member

  • 3,610 posts

Posted 23 January 2014 - 10:07 AM

Joseph Mallan, on 23 January 2014 - 09:25 AM, said:

But the matchmaker is supposed to be unbiased and uncaring: 12 Mechs vs. 12 Mechs. And without knowing the Elo scores for the teams, how do we know whether the scores were even but the build matchup sucked? Team A had LRMs, Team B had ECM: Team A gets rolled! Team A is 12 PUGs, Team B has a 4-man: Team B should win thanks to better communication, in a perfect world. Elo just cannot adjust for those eventualities.

Elo doesn't need to adjust for those eventualities. The matchmaker is responsible for that.

Elo only cares if you win or lose. It doesn't care if the match was fair. It doesn't care if you did 1200 damage or 12 damage. It doesn't care if you won 12-0 or 12-11. A win is a win and a loss is a loss.

In a pretty random environment like the one that MWO creates, Elo rankings will take longer to converge on your actual skill. But they will eventually reach a stable value (within the constraints used to set up the system) and it will be accurate within the tolerances used to set up the system.

Elo rankings fluctuate by design. Every time you win or lose your ranking changes. All of the randomness that you're talking about simply increases that fluctuation, but it doesn't invalidate the system.
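
To make "a win is a win" concrete, here is a minimal sketch of a win/loss-only update for a team match. It is a guess at the general shape, not PGI's implementation, which has not been published in this thread: averaging the team's ratings, giving every teammate the same adjustment, the K-factor of 32, and the name `update_team_ratings` are all assumptions. Only `expected_score` is the standard Elo formula.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for side A against side B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_team_ratings(team_a, team_b, a_won, k=32.0):
    """Apply one win/loss result to every player on both teams.

    team_a and team_b map player names to ratings. Damage, kills and the
    final score line are deliberately not parameters: only a_won matters.
    """
    # Assumption: a team's rating is the plain average of its members.
    avg_a = sum(team_a.values()) / len(team_a)
    avg_b = sum(team_b.values()) / len(team_b)
    actual = 1.0 if a_won else 0.0
    # Assumption: every teammate receives the same adjustment.
    delta = k * (actual - expected_score(avg_a, avg_b))
    new_a = {name: rating + delta for name, rating in team_a.items()}
    new_b = {name: rating - delta for name, rating in team_b.items()}
    return new_a, new_b


if __name__ == "__main__":
    even_a = {"p1": 1500.0, "p2": 1500.0}
    even_b = {"p3": 1500.0, "p4": 1500.0}
    # Evenly rated teams: winners gain k/2 = 16, losers drop 16.
    print(update_team_ratings(even_a, even_b, a_won=True))
    # A lower-rated team that wins an uneven match gains more than 16.
    print(update_team_ratings({"x": 1400.0}, {"y": 1600.0}, a_won=True))
```

The second call is the "unfair match" case: the expected score already prices in the rating gap, so an upset moves ratings more than an even win, and an expected result moves them less.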

#138 Mechteric

    Member

  • Overlord
  • 7,308 posts
  • Location: RTP, NC

Posted 23 January 2014 - 10:12 AM

I have two frames of reference for my matchmaker gaming experience with MWO:

(a) before Elo

{b} after Elo


Nearly every game in time reference (a) had me up against much less skilled pilots. Most games in time reference {b} do not have this problem. {b} > (a). Math lesson complete.

Edited by CapperDeluxe, 23 January 2014 - 10:13 AM.


#139 Amsro

    Member

  • Overlord
  • 3,436 posts
  • Location: Charging my Gauss Rifle

Posted 23 January 2014 - 10:14 AM

Roadkill, on 23 January 2014 - 10:02 AM, said:

Incorrect.

Elo is self correcting. Even if you start out with poor assumptions and then have poor matchmaking, your Elo rating will converge correctly.

Elo doesn't care if matches are fair. It is designed to take unfair matches into account. So poor matchmaking due to mismatched Elo rankings does not affect the validity of your Elo ranking. Even poor matchmaking due to tonnage imbalances or using VOIP does not affect the ultimate validity of your Elo ranking; it just takes longer for your Elo ranking to reach your true skill level.


You're missing my point. PGI started by increasing the Elo of those who should have been losing Elo. In essence, hordes of people had the wrong Elo, and yet for months the game played like this, completely mucking up Elo. The fix they implemented was bad and the matchmaker has never recovered.

I'm not referring to the Elo system in general, but to the Elo+matchmaker combination in THIS game. There is a reason we have hundreds of "Elo/Matchmaker is broken" topics.

#140 Joseph Mallan

    ForumWarrior

  • FP Veteran - Beta 1
  • 35,216 posts
  • Location: Mallanhold, Furillo

Posted 23 January 2014 - 10:20 AM

Roadkill, on 23 January 2014 - 10:07 AM, said:

Elo doesn't need to adjust for those eventualities. The matchmaker is responsible for that.

Elo only cares if you win or lose. It doesn't care if the match was fair. It doesn't care if you did 1200 damage or 12 damage. It doesn't care if you won 12-0 or 12-11. A win is a win and a loss is a loss.

In a pretty random environment like the one that MWO creates, Elo rankings will take longer to converge on your actual skill. But they will eventually reach a stable value (within the constraints used to set up the system) and it will be accurate within the tolerances used to set up the system.

Elo rankings fluctuate by design. Every time you win or lose your ranking changes. All of the randomness that you're talking about simply increases that fluctuation, but it doesn't invalidate the system.

And that is the mistake. If I did 22 damage in a match and WE win, I get bumped up in rank for nothing! If we lose and I killed 8 enemies with 0 assists, I still fall in the ranking. How is that a reflection of my performance?




