Elo System Problems.


20 replies to this topic

#1 Foo2rama

    Member

  • Legendary Founder
  • 28 posts
  • Location: Hacking the Gibson

Posted 07 February 2013 - 11:00 AM

I see an issue with combining ELO scores in a group drop and then adjusting each player's ELO from the combined score...

I've designed around ELO systems in the past and this is tricky business... AFAIK no one has ever done an ELO system like this, where teams are averaged. Let me explain why.

For simplification, assume 4 players on each team, all with an equal number of games played. Differing "K" values might make a difference, but not enough to offset the overall trend.

Team A
800
1300
2200
2000

and

Team B
500
1000
2000
1000

So you have totals of 6300 vs 4500 or, after averaging, 1575 vs 1125.

So in effect each player now functions with an ELO rating equal to his team's average. Anyone else starting to see the problem?

If team B wins, all of its players get the bump as if each were a 1125 player. This causes 2 issues: it inflates the high level player more than he should be inflated, and the low level players get less of a bump than they should.

Conversely, if team B loses, the higher level player will drop less than he should, and the lower level players will drop more than they should.

Or in short, in all matches the highest ranked player will gain more for a win and lose less for a loss than he should. Conversely, a low level player will gain less and lose more per match than he should. Even with close ELO scores on a team this effect will be in place, and every game will contribute to this race to the top and bottom.
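To put rough numbers on the example above, here is a minimal Python sketch. It assumes the standard Elo expected-score formula and a K value of 32; neither figure comes from PGI. The function names are made up for illustration, and the "individual" baseline (a pilot's own rating measured against the opposing team's average) is only one possible reading of what a player "should" gain or lose.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score on the usual 400-point logistic scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def average(ratings):
    return sum(ratings) / len(ratings)


def team_average_delta(team, opponents, won, k=32.0):
    """Change applied to EVERY team member when the two team averages are compared.

    k=32 is an assumed K value, not one published by PGI.
    """
    score = 1.0 if won else 0.0
    return k * (score - expected_score(average(team), average(opponents)))


def individual_delta(rating, opponents, won, k=32.0):
    """Hypothetical baseline: one pilot's own rating vs. the opposing team average."""
    score = 1.0 if won else 0.0
    return k * (score - expected_score(rating, average(opponents)))


team_a = [800, 1300, 2200, 2000]
team_b = [500, 1000, 2000, 1000]

for won in (True, False):
    shared = team_average_delta(team_b, team_a, won)
    print(f"Team B {'wins' if won else 'loses'}: every member moves {shared:+.1f}")
    for rating in team_b:
        own = individual_delta(rating, team_a, won)
        print(f"  {rating:>4} pilot judged on own rating: {own:+.1f}")
```

Under these assumptions a Team B win moves all four pilots by roughly +30, while the 2000 pilot judged on his own rating would gain under 3 points. On a loss the shared drop is small, far less than the 2000 pilot would lose on his own rating and more than the 500 pilot would.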


Summary

Basically, if I read correctly how PGI is planning to implement this, they are going to have an issue: large numbers of players piling up on the high ELO side and large numbers of players on the low side. It may not happen in the first month, but after 2-3 months you will start to see the problem.

I assume that PGI is running ELO calcs against current data and not seeing the issue, and they are not going to either, as the timeframe is too short, especially with the match maker not functioning. Since right now all players are dropped at random, in theory ELO will not have as wild a race to the top and bottom as it would once there is match making based on skill. I project it would take at least 3-4 months or more to see the race to top and bottom this way.


Notes
I'm basing the timeframes on roughly how many drops a day the average pilot in my group does, and running ELO calcs against that.

Edited by Foo2rama, 07 February 2013 - 11:00 AM.


#2 Foo2rama

    Member

  • Legendary Founder
  • 28 posts
  • Location: Hacking the Gibson

Posted 07 February 2013 - 01:32 PM

From the discussion on my forums.

Quote

Some ELO Systems have a fix to that by having two ELO systems.
WoW for instance has a "personal" score and a "team" score.
Team Score is used for match making while personal is used for ELO gains/losses.

Example:
Team A is all brand new players:
1-1500
2-1500
3-1500
4-1500
While Team B is 3 above-average or better players with a bad friend:
5-1900
6-1900
7-1900
8-300
Both teams have an equal 1500 "team score", so the winning team will gain 80 points spread amongst teammates, and the losing team will lose 80, regardless of who wins, due to the equal "team score".

Team A Wins:
Players 1-4 each get 20 points and all are 1520 now.
Players 5-7 lose 26 points each and are all now 1874, while player 8 only loses 2 and is now at 298.

Team B Wins:
Players 1-4 each lose 20 points and all are 1480 now.
Players 5-7 gain 2 points each and are all now 1902, while player 8 gains 74 and is now at 374.



The problem is that the system as they described it on the forums matches my OP. What the quote above shows is something different: each individual rated against the other team's average.

That would tend to cause a race to the middle though, right? It would tend to reduce the impact of wins against mixed opponents for a player with a high ELO, and increase a lower player's ELO vs mixed opponents. Or it could tend to create 2 populations based on initial rankings.
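For comparison, here is a rough Python sketch of the "individual vs other team average" idea from the quote. It uses the standard Elo expected-score formula with an assumed K of 32 rather than the fixed 80-point pool in the quote, so the exact numbers will not match the quoted ones; only the direction of the effect carries over. The function names are hypothetical.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score on the usual 400-point logistic scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def update_team(team, opposing_team_score, won, k=32.0):
    """Return new ratings, each player judged against the opposing team score.

    k=32 is an assumption for illustration only.
    """
    score = 1.0 if won else 0.0
    return [r + k * (score - expected_score(r, opposing_team_score)) for r in team]


team_a = [1500, 1500, 1500, 1500]
team_b = [1900, 1900, 1900, 300]
score_a = sum(team_a) / len(team_a)  # "team score" used for match making
score_b = sum(team_b) / len(team_b)

for b_wins in (False, True):
    print("Team B wins" if b_wins else "Team A wins")
    print("  A:", [round(r, 1) for r in update_team(team_a, score_b, not b_wins)])
    print("  B:", [round(r, 1) for r in update_team(team_b, score_a, b_wins)])
```

As in the quoted example, the 1900 players gain only a few points for a win and lose a lot for a loss, while the 300 player gains a lot for a win and loses almost nothing for a loss.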

#3 semalferuzA

    Member

  • 125 posts

Posted 07 February 2013 - 01:37 PM

MOBA type games use mixed ratings for the purposes of match making and in my experience it works pretty well there.

Also, the "race to the middle" you described: isn't that how ELO systems that start from an established baseline, i.e. 1500, are supposed to work?

Edited by semalferuzA, 07 February 2013 - 01:55 PM.


#4 Lord Jay

    Member

  • Legendary Founder
  • 97 posts
  • Location: Nashville, TN

Posted 07 February 2013 - 02:26 PM

Paul's clarification post in this thread (http://mwomercs.com/...79-matchmaking/) seems to indicate that the scoring will be handled like a two player chess match using both teams' average scores.

At the end of the match the ELO math will be applied to each team's average score. The resulting value change will be applied to each team member equally.

So all players on a team will lose or gain the same number of points. An individual pilot's score will not be compared to the average ELO of the opposing team.

#5 Mackman

    Member

  • 746 posts
  • Location: California

Posted 07 February 2013 - 02:33 PM

Am I missing something? Because League of Legends has an Elo system that seems practically identical, and they don't suffer any problem like the one you're predicting...

#6 Perillious

    Member

  • 36 posts

Posted 07 February 2013 - 02:37 PM

You still need to factor the mech into the calc as well. A good pilot in a good custom config will beat a good pilot in a stock mech 99% of the time...

Edited by Perillious, 07 February 2013 - 02:38 PM.


#7 Foo2rama

    Member

  • Legendary Founder
  • 28 posts
  • Location: Hacking the Gibson

Posted 07 February 2013 - 02:55 PM

Lord Jay, on 07 February 2013 - 02:26 PM, said:

Paul's clarification post in this thread (http://mwomercs.com/...79-matchmaking/) seems to indicate that the scoring will be handled like a two player chess match using both teams' average scores.

At the end of the match the ELO math will be applied to each team's average score. The resulting value change will be applied to each team member equally.

So all players on a team will lose or gain the same number of points. An individual pilot's score will not be compared to the average ELO of the opposing team.


Which is exactly what the OP said.

#8 Kyrie

    Member

  • Bad Company
  • 1,271 posts

Posted 07 February 2013 - 03:00 PM

My best guess: the match-maker might control this problem by not allowing too large a spread of ELO ranges in the match. In this way you limit the "inflation" of ratings.

#9 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 07 February 2013 - 03:11 PM

High Elo people won't improve much from beating lower Elo people, hence others will rise faster than them. Low Elo people will rise more quickly, raising the average of the team's Elo score. This will work a handful of times and then the team's score will even out.

Elo will fix this issue inherently. Also remember that premade teams are something like 20% of drops, which includes people dropping with a friend or two. Organized groups trying to game Elo scores to fight weaker opponents will be rare as hen's teeth and will truly only succeed in screwing themselves: each victory will raise their score to some degree, and eventually they'll end up dropping against much better teams, grouped or pugging, and get pounded back down.

So the people on the team with high scores would see little improvement on a win but a measurable decline if they lost. The guy with a low score would see a big boost if they win but little if any cost if they lost.

The high Elo people have more to lose and little to gain by letting the low Elo guy in with them. Even if they win because they're sandbagging with a low Elo guy the low guy will rise very quickly and their average will climb accordingly.

Make sense? It's a self balancing system that works at its best when protecting lower Elo players from high Elo players.

The only people who will complain are the highest ranked folks. There is no escape from your Elo. You get into a highly competitive team and kick a ton of a$$ and you'll find that grouped or pugging you're going to be playing with the best, most competitive folks. For the rest of us it means that you'll end up with people who probably play about like you do and the way you do.

#10 Kyrie

    Member

  • Bad Company
  • 1,271 posts

Posted 07 February 2013 - 03:17 PM

To elaborate further, it really would make no sense to have someone who is 2000+ in the same match with someone who is under 1000. If I understand the purpose of the proposed MM, it is to match people up based on similar skill levels. This can only be accomplished by reducing the potential ELO spread (once the rating system has stabilized ratings).

Taking chess as an example, tournaments are often divided into "classes" based on the overall number of people signed up. At the local club where I play, it's usually "Class A and higher" and everyone else. A quick breakdown of typical chess ELO brackets:

2200+: Master and beyond
2000-2199: Expert
1800-1999: Class A
1600-1799: Class B

and so on.

The key in this system is the 200 points separating one class from the next up until the Master level (2200+). If the match maker is to serve its purpose, the widest spread it should allow is 400 points.
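As a rough illustration of that kind of cap, here is a small Python sketch. The bracket table only contains the classes listed above, and the function names and the 400-point default are just this post's suggestion written out, not anything PGI has described.

```python
# Brackets exactly as listed in the post; lower classes are left unnamed here.
BRACKETS = [
    (2200, "Master and beyond"),
    (2000, "Expert"),
    (1800, "Class A"),
    (1600, "Class B"),
]


def chess_class(rating):
    """Return the bracket name for a rating, checking the highest floor first."""
    for floor, name in BRACKETS:
        if rating >= floor:
            return name
    return "below Class B"


def spread_ok(ratings, max_spread=400):
    """True when the gap between the best and worst rating is within the cap."""
    return max(ratings) - min(ratings) <= max_spread


print(chess_class(2050))                     # Expert
print(spread_ok([1650, 1800, 1990, 2040]))   # True: spread is 390
print(spread_ok([800, 1300, 2200, 2000]))    # False: spread is 1400
```

A match maker using a check like `spread_ok` would reject the second lineup, which is Team A from the original post.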

Edited by Kyrie, 07 February 2013 - 03:17 PM.


#11 Lukoi Banacek

    Member

  • WC 2018 Top 12 Qualifier
  • 4,353 posts

Posted 07 February 2013 - 03:26 PM

Foo2rama, on 07 February 2013 - 11:00 AM, said:

I see an issue with combining ELO scores in a group drop and then adjusting each player's ELO from the combined score...

I've designed around ELO systems in the past and this is tricky business... AFAIK no one has ever done an ELO system like this, where teams are averaged. Let me explain why.

For simplification, assume 4 players on each team, all with an equal number of games played. Differing "K" values might make a difference, but not enough to offset the overall trend.

Team A
800
1300
2200
2000

and

Team B
500
1000
2000
1000

So you have totals of 6300 vs 4500 or, after averaging, 1575 vs 1125.

So in effect each player now functions with an ELO rating equal to his team's average. Anyone else starting to see the problem?

If team B wins, all of its players get the bump as if each were a 1125 player. This causes 2 issues: it inflates the high level player more than he should be inflated, and the low level players get less of a bump than they should.

Conversely, if team B loses, the higher level player will drop less than he should, and the lower level players will drop more than they should.

Or in short, in all matches the highest ranked player will gain more for a win and lose less for a loss than he should. Conversely, a low level player will gain less and lose more per match than he should. Even with close ELO scores on a team this effect will be in place, and every game will contribute to this race to the top and bottom.


Summary

Basically, if I read correctly how PGI is planning to implement this, they are going to have an issue: large numbers of players piling up on the high ELO side and large numbers of players on the low side. It may not happen in the first month, but after 2-3 months you will start to see the problem.

I assume that PGI is running ELO calcs against current data and not seeing the issue, and they are not going to either, as the timeframe is too short, especially with the match maker not functioning. Since right now all players are dropped at random, in theory ELO will not have as wild a race to the top and bottom as it would once there is match making based on skill. I project it would take at least 3-4 months or more to see the race to top and bottom this way.


Notes
I'm basing the timeframes on roughly how many drops a day the average pilot in my group does, and running ELO calcs against that.


They're in the metrics building stage...can we at least let it go live and see how it goes for a bit before we start stressing it?

#12 Zongoose

    Member

  • Wrath
  • 89 posts
  • Location: Southampton

Posted 07 February 2013 - 03:28 PM

Kyrie, on 07 February 2013 - 03:17 PM, said:

The key in this system is the 200 points separating one class from the next up until the Master level (2200+). If the match maker is to serve its purpose, the widest spread it should allow is 400 points.


The problem is that the match maker cannot restrict the spread entirely by itself. Pre-made groups will be able to have low and high ELO players in the same match, making the spread wider than it normally would be. This should be self-correcting though, as very soon either the higher ELO players will be brought down by playing with a low ELO player and losing, or the low player will be brought up if they win. While the odd 1 or 2 games might have some imbalanced player skill, I think the system will help make 90% or more of all matches more even.

I'm curious if the ELO system will work alongside the "class" mech distribution (equal numbers of each class per side) as it is or if this will also be tweaked.

#13 Bhael Fire

    Banned - Cheating

  • Ace Of Spades
  • 4,002 posts
  • Location: The Outback wastes of planet Outreach.

Posted 07 February 2013 - 03:35 PM

This is why I think matches should be premade vs premade only...and PUG vs PUG only.

Elo system should be reserved for the PUGs, since they are the ones that suffer the most from poor match making.

#14 Hekalite

    Member

  • 424 posts

Posted 07 February 2013 - 04:03 PM

Re-read Paul's MM thread. In the OP's example, if team B loses, which is what the prediction model expects, their ELO will not be affected. You only drop in rating if you are beaten by a lower ranked team.

#15 Davers

    Member

  • Legendary Founder
  • 9,886 posts
  • Location: Canada

Posted 07 February 2013 - 04:09 PM

What I would like to see is Elo modified by chassis and by whether you are in a group. Then they could take out matching weight classes and 4 man/8 man groups.

#16 Foo2rama

    Member

  • Legendary Founder
  • 28 posts
  • Location: Hacking the Gibson

Posted 07 February 2013 - 04:11 PM

Davers, on 07 February 2013 - 04:09 PM, said:

What I would like to see is Elo modified by chassis and by whether you are in a group. Then they could take out matching weight classes and 4 man/8 man groups.

I get way more kills in light and medium mechs than in heavy and assault...

#17 Davers

    Member

  • Legendary Founder
  • 9,886 posts
  • Location: Canada

Posted 07 February 2013 - 04:13 PM

Foo2rama, on 07 February 2013 - 04:11 PM, said:

I get way more kills in light and medium mechs than in heavy and assault...

If that is universal, then light and medium mechs would raise your Elo more than assault and heavy mechs would.

#18 Kyrie

    Member

  • Bad Company
  • 1,271 posts

Posted 07 February 2013 - 04:19 PM

Zongoose, on 07 February 2013 - 03:28 PM, said:

The problem is that the match maker cannot restrict the spread entirely by itself. Pre-made groups will be able to have low and high ELO players in the same match, making the spread wider than it normally would be. This should be self-correcting though, as very soon either the higher ELO players will be brought down by playing with a low ELO player and losing, or the low player will be brought up if they win. While the odd 1 or 2 games might have some imbalanced player skill, I think the system will help make 90% or more of all matches more even.

I'm curious if the ELO system will work alongside the "class" mech distribution (equal numbers of each class per side) as it is or if this will also be tweaked.


I tend to agree that there should be separate ratings for PUGs, 4mans and 8mans. They are significantly different in terms of actual play as well.

#19 p4r4g0n

    Member

  • Knight Errant
  • 1,511 posts
  • Location: Malaysia

Posted 07 February 2013 - 05:27 PM

Considering that a PUG team is made up of whichever players or groups click on launch at approximately the same time, I'm not sure I really see the problem with using the team average to determine whether the individual's Elo rating moves up or down. Sure, it will skew it somewhat in one match, but in the long run it should even out.

I'm making a distinction here between how teams are matched and how the team average Elo affects their gain / loss at the end of the match, as PGI hasn't really said, afaik, how Elo is used to match the teams. E.g. total Elo? Average Elo? Is there a modifier to team Elo if there's a group in it? Is the group modifier, if any, subject to group size?

Quoted from this thread ->

Karl Berg, on 06 February 2013 - 04:41 PM, said:

In fact we do keep multiple ELOs for each player. For the moment at least it's based off the properties of the mech you're dropping with, rather than your team's structure.


Which seems to imply a modifier to an individual's Elo rating when calculating team total Elo and average.

As far as Elo spread is concerned, I think it should be subject to further study. Implementing limits to the spread could lead to more evenly balanced Elo ratings within a team but result in Elo ratings being stuck within a certain band for a looonnnnggg time, which is probably not a good thing game-wise.

Edited by p4r4g0n, 07 February 2013 - 05:33 PM.


#20 Carnivoris

    Member

  • 463 posts

Posted 07 February 2013 - 05:30 PM

I wish I was good enough at math to argue one way or another. Have fun, gents.




