Why Elo Doesn't Work Here


633 replies to this topic

#101 Unleashed3k

    Member

  • Death Star
  • 525 posts
  • Location: Germany

Posted 23 January 2014 - 08:16 AM

Artgathan, on 23 January 2014 - 08:07 AM, said:


Yes. I know it seems counter-intuitive, but it apparently is true (based on the stats you presented). Consider an infantry section attack:

Your section is taking enemy fire, but you can't locate the enemy. Your section commander orders you to take a bound so that they can determine the enemy's location. During your bound you get shot, but the rest of your section is able to locate and destroy the enemy. Did your section win? Yes. Does it suck to be you in this scenario? Yes.

Helping your team doesn't always mean "being a bada**". Sometimes the way you help is more subtle.


He probably means that damage/assists/kills/UAV/TAG etc. should be considered as well to determine players' skill and match them together in a better way?

#102 Roland

    Member

  • 8,260 posts

Posted 23 January 2014 - 08:16 AM

MustrumRidcully, on 23 January 2014 - 02:53 AM, said:

But a virtual player for 12 man groups is ... impractical. Too many virtual players, not enough data per virtual player.
Maybe one could try it for 4 man groups and have the match-maker first try to recreate "known" teams (okay, stjobe, DocBac, Sandpit and Varent in the queue? They've played before, I'll try to put them together), and then match these known teams' Elo scores.

Note that in my message I point this out.
Using virtual players for groups would be a way to handle preformed groups, not the random assortment of players in a PUG. Basically, each premade group would be assigned an Elo value, and viewed as a single player.
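To make the "virtual player" idea concrete, here is a minimal sketch. It assumes each fixed roster gets its own rating, keyed by the set of member names, and is updated with the textbook Elo formula. The starting rating of 1500, the K-factor of 32, and all function names are illustrative assumptions, not anything PGI has confirmed.

```python
from typing import Dict, FrozenSet, Iterable

DEFAULT_RATING = 1500.0  # assumed starting value, not a known PGI setting

# One rating per fixed roster; a frozenset makes the key order-independent.
group_ratings: Dict[FrozenSet[str], float] = {}


def group_key(members: Iterable[str]) -> FrozenSet[str]:
    return frozenset(members)


def get_group_rating(members: Iterable[str]) -> float:
    # A roster seen for the first time starts at the default rating.
    return group_ratings.setdefault(group_key(members), DEFAULT_RATING)


def expected_score(r_a: float, r_b: float) -> float:
    # Standard Elo expectation for A against B.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def record_group_match(winners: Iterable[str], losers: Iterable[str], k: float = 32.0) -> None:
    # Treat each premade as a single "virtual player" and apply one Elo update.
    winners, losers = list(winners), list(losers)
    r_w = get_group_rating(winners)
    r_l = get_group_rating(losers)
    delta = k * (1.0 - expected_score(r_w, r_l))
    group_ratings[group_key(winners)] = r_w + delta
    group_ratings[group_key(losers)] = r_l - delta
```

The catch raised earlier still applies: every distinct roster is a separate virtual player, so a group that swaps one member starts over from the default rating.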



MustrumRidcully, on 23 January 2014 - 02:53 AM, said:

Ah, damn, TrueSkill is patented and not open. :D It's based on the Glicko rating system, but apparently that's also not designed for teams. TrueSkill seems to add that aspect in particular, and that's the part you can find out the least about. :unsure:
http://research.micr...ll/details.aspx

Yes, and Microsoft doesn't really fully document Trueskill, since they don't really have any desire for anyone else to use it. They wrote some papers talking about certain aspects of it, but it's not really a fully published standard.

However, it (and Glicko) both help point out some of the major issues with trying to apply Elo to a multiplayer game where you are attempting to derive individual ratings from team results. It's not so much the case that Elo could never arrive at a correct rating, but rather that the complexities may push the individual influence so low that it would take a huge number of games for that signal to rise above the noise.
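A rough way to see how small the individual signal gets: assume (and this is only an assumption) that a team's strength is the plain average of its members' ratings and that the standard Elo expectation applies. Then compare how much one player who is 200 points better than everyone else shifts the expected result in a 1v1 versus a 12v12. The function names and the 1500 baseline are made up for the illustration.

```python
def expected_score(r_a: float, r_b: float) -> float:
    # Standard Elo expectation for A against B.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def team_rating(ratings):
    # Assumption: team strength is the simple mean of member ratings.
    return sum(ratings) / len(ratings)


def win_prob_shift(team_size: int, skill_gap: float, base: float = 1500.0) -> float:
    """Gain in expected win chance over 50% when one member sits
    skill_gap points above otherwise identical players."""
    team = [base] * (team_size - 1) + [base + skill_gap]
    opponents = [base] * team_size
    return expected_score(team_rating(team), team_rating(opponents)) - 0.5


solo = win_prob_shift(1, 200.0)     # 1v1: about 0.26 above a coin flip
twelve = win_prob_shift(12, 200.0)  # 12v12: about 0.024 above a coin flip
print(f"1v1: +{solo:.3f}  12v12: +{twelve:.3f}")
```

Under those assumptions the same skill edge moves the expected result roughly ten times less in a 12-player team, before any of the other obfuscating factors are counted.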

Especially in a system like we have here, where you end up with negative feedback built into the system. If you lose, your Elo can drop. In a one-on-one game, where your skill level is constant, your Elo dropping only affects who you are matched AGAINST. Here, it will also potentially affect the people you are teamed up WITH. Thus, your Elo dropping will not only decrease your opposition's skill, but will also decrease your own team's skill, which mutes Elo's ability to seat you correctly.

This is why Elo, paired with the matchmaker, doesn't simply require 12 times as many games to seat you correctly compared to if it were rating a 1v1 game. There are actually numerous obfuscating features which will screw up the results.

#103 MustrumRidcully

    Member

  • Legendary Founder
  • 10,644 posts

Posted 23 January 2014 - 08:19 AM

Roland, on 23 January 2014 - 08:16 AM, said:

Note that in my message I point this out.
Using virtual players for groups would be a way to handle preformed groups, not the random assortment of players in a PUG. Basically, each premade group would be assigned an Elo value, and viewed as a single player.




Yes, and Microsoft doesn't really fully document Trueskill, since they don't really have any desire for anyone else to use it. They wrote some papers talking about certain aspects of it, but it's not really a fully published standard.

I wonder what PGI's chances are of getting access to it, given that they are using a Microsoft license, and whether those chances would change if the game becomes available on Xbox One (if that ever happens, which is mere speculation and wishful or unwishful thinking).

#104 Unleashed3k

    Member

  • Death Star
  • 525 posts
  • Location: Germany

Posted 23 January 2014 - 08:21 AM

Roland, on 23 January 2014 - 08:16 AM, said:

This is why Elo, paired with the matchmaker, doesn't simply require 12 times as many games to seat you correctly compared to if it were rating a 1v1 game. There are actually numerous obfuscating features which will screw up the results.


So if 1vs1 needs at least 100 games to seat a single player, here you would be seated correctly after 1200 games?

#105 100mile

    Member

  • Elite Founder
  • 1,235 posts
  • Location: Alegro: Ramora Province fighting Pirates. and the occasional Drac

Posted 23 January 2014 - 08:26 AM

Here's the problem with this whole thread.....Your ELO is not based on win loss only.....It's based on how many kills you get, how many assists and who you kill...If you kill a player with a higher ELO it boosts your ELO more than if you kill one with a lower ELO...etc. etc etc....

So basic premise for this thread is wrong...

Edited by 100mile, 23 January 2014 - 08:27 AM.


#106 MustrumRidcully

    Member

  • Legendary Founder
  • 10,644 posts

Posted 23 January 2014 - 08:28 AM

100mile, on 23 January 2014 - 08:26 AM, said:

Here's the problem with this whole thread.....Your ELO is not based on win loss only.....It's based on how many kills you get, how many assists and who you kill...If you kill a player with a higher ELO it boosts your ELO more than if you kill one with a lower ELO...etc. etc etc....

So basic premise for this thread is wrong...

??? Where did you get that information?

#107 Roland

    Member

  • 8,260 posts

Posted 23 January 2014 - 08:28 AM

I'm not sure that Microsoft licenses Trueskill for anyone. I think they may limit its use to XBox Live games.

Ultimately, the issue isn't simply that Elo doesn't work.. because I actually think that it does have some non-trivial impact on sorting players. Generally, what I've seen is that certain players who are perhaps around my skill level... I see them quite often. So the Elo rating system seems to have some idea of what is happening.

The problem really stems more from the matchmaker, or perhaps with Elo not seating everyone correctly. Because while I'll repeatedly see certain players, there will also tend to be folks thrown in who I've never seen before, and sometimes they're obviously very new players driving trial mechs. And often, it's those players who will decide a match. When a team gets stuck with their heaviest mechs doing less than their tonnage in damage, it can be a huge burden to the team. This is one reason why some players just drive assaults all the time.. because when you drive an assault, you at least know that your assault isn't gonna be some paste eater in a trial mech. Well, unless you yourself are a paste eater in a trial mech.

#108 Roland

    Member

  • 8,260 posts

Posted 23 January 2014 - 08:36 AM

Battlestar3k, on 23 January 2014 - 08:21 AM, said:


So if 1vs1 needs at least 100 games to seat a single player, here you would be seated correctly after 1200 games?

No, because of the various factors which will obfuscate your rating, it will actually take far MORE than 1200 games. The multiplying factor isn't simply the number of players. In terms of what the actual factor would be, I actually have no idea. Some of it could be figured out through statistical analysis, but some of it starts to play into the feedback resulting from mech configurations since this isn't a game like chess where all players are "playing with the same pieces". Then you throw things like map selection and synergy between maps and mechs into it, and it becomes further obfuscated.

Mischief was operating with the belief that you could simply multiply it by 12 and it'd be the same, but that's not accurate. The number of games required would be the number required by a game like chess, multiplied by some (large) number that's much more than 12.

It's true that it could, potentially, arrive at a correct seating, but I'm not sure at what point that would happen.

100mile, on 23 January 2014 - 08:26 AM, said:

Here's the problem with this whole thread.....Your ELO is not based on win loss only.....It's based on how many kills you get, how many assists and who you kill...If you kill a player with a higher ELO it boosts your ELO more than if you kill one with a lower ELO...etc. etc etc....

So basic premise for this thread is wrong...

No, you're incorrect.
Your Elo rating is based purely on whether you win or lose, and the amount your rating moves is based on who you played. That is all that goes into it. It does not account for ANY of your actual performance metrics in the game.
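For reference, the textbook Elo update fits that description in a few lines. This is a sketch of the standard formula only; the K-factor of 32 and the 400-point scale are the usual chess conventions and are assumptions here, not confirmed values from PGI's implementation.

```python
def expected_score(rating: float, opponent: float) -> float:
    # Chance of winning implied by the rating difference.
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update(rating: float, opponent: float, won: bool, k: float = 32.0) -> float:
    # Only the result (win or loss) and the opponent's rating enter the update.
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent))


# Beating a stronger opponent moves the rating more than beating a weaker one.
gain_vs_stronger = update(1500.0, 1700.0, won=True) - 1500.0
gain_vs_weaker = update(1500.0, 1300.0, won=True) - 1500.0
```

Note that nothing in the signature takes kills, assists, or damage, which is the whole point: the formula has no input for them.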

This is actually one of the problems... because it really should account for your match score.

Match score, as it stands now, is actually a good metric for player skill. The best players generally always have the highest match scores in any given game. Or at least, relative to their own team members.

#109 Tombstoner

    Member

  • Bridesmaid
  • 2,193 posts

Posted 23 January 2014 - 08:36 AM

I agree with the OP. The elitists who look to ELO as a measure of personal skill are a joke.

If MWO were to improve on the ELO system, I think players should be bundled into 4 groups: green, regular, veteran, elite. Break down ELO into those categories and pull the players for each match from two neighbouring categories. That creates 3 types of games:

1 - green - regular
2 - regular - veteran
3 - veteran - elite

The pool of players is dispersed into categories that should provide for good games within their respective skill levels. That's all the ELO system is for after all.
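A quick sketch of how that bucketing could look. The rating cut-offs below are invented purely for illustration; only the four tier names and the three bracket pairings come from the proposal above.

```python
import bisect

TIER_NAMES = ["green", "regular", "veteran", "elite"]
# Hypothetical cut-offs between tiers; real values would need tuning.
TIER_BOUNDS = [1200.0, 1500.0, 1800.0]

# The three game types: each bracket pairs two neighbouring tiers.
BRACKETS = [("green", "regular"), ("regular", "veteran"), ("veteran", "elite")]


def tier_name(rating: float) -> str:
    # bisect_right counts how many cut-offs the rating has reached.
    return TIER_NAMES[bisect.bisect_right(TIER_BOUNDS, rating)]


def eligible_brackets(rating: float):
    # A player can be pulled into any bracket that contains their tier.
    name = tier_name(rating)
    return [bracket for bracket in BRACKETS if name in bracket]
```

With this layout, green and elite players each have one bracket available, while regular and veteran players can be pulled into two, which is what keeps the queues from fragmenting into four isolated pools.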

#110 Artgathan

    Member

  • Knight Errant
  • 1,764 posts

Posted 23 January 2014 - 08:40 AM

Roland, on 23 January 2014 - 08:36 AM, said:

Match score, as it stands now, is actually a good metric for player skill. The best players generally always have the highest match scores in any given game. Or at least, relative to their own team members.


I'm not sure that's entirely true, as 90% of your match score is simply how much damage you did. If anyone happens to know exactly how match score is determined (IE: a kill is worth X points, an assist is worth Y...) that would be useful information to have.

#111 RichAC

    Member

  • 661 posts

Posted 23 January 2014 - 08:45 AM

There are a couple of problems:

1. ELO is designed for 1v1, not team games, especially with 12 players. An ELO rating for a team would be more accurate for a premade, like many posters have already stated: a team rating in a tournament or league, not an individual's. This is true by the definition of ELO and really can't be debated.

But I'm not positive how they rank players in MWO. They could take match score into account too, for all we know. Is it really the same system used in chess?

Quakelive has the greatest skill rating system I ever saw in a multiplayer. It gave players a rating of 1 -100 and they divided it up into 5 tiers. They bought it from a 3rd party game company that made it, that was run by accountants, hedge fund stock brokers, and guys that ran casinos. True story.

And if you think computers can't rank players properly, You should watch that movie Moneyball. Based on a true story about how computers and a former stockbroker took over the jobs of professional scouts and changed the game of baseball as we know it today.

But the problem with Quakelive was that players would just constantly make new accounts, since it's also a f2p game, which undermines the skill rating system. Also, in the beginning, ID mistakenly let players tier down if they felt like it, lol, which was ludicrous. So the playerbase really dropped off, and they had to cut it to only 4 tiers. Now the game is pretty much dead in the water, and their amazing, beautiful skill rating formula is worthless.

2. People in this community do not play to win. They play for damage, and that's all they care about. There would probably be fewer complaints if they based the skill ratings on damage done, haha. But obviously that just wouldn't feel right, and it's understandable why PGI would not want to encourage that.

3. The playerbase is small. The more people that quit, the worse it's going to get, until eventually pugs feel like 12 mans: the same 3 teams a night stomping the few that are left.

People quitting the game has nothing to do with PGI. It's a phenomenon of the times. People want dumbed down computer games and instant gratification. The perception of cheating and being disadvantaged has a lot to do with people thinking everything is their team's fault, or a cheater's, or their PC's, etc.

The PC gaming industry has been dying for these reasons for a decade now. Anyone who thinks WoW or LoL or SC2 proves otherwise is delusional. EA Sports stopped making sports games for PC in 2005 and totally in 2008, simply because there is no one left on the PC that has any sportsmanship. FPS games are now more popular on consoles also, and consoles are not to blame. Neither are the game developers.

Lamer communities are. And now it's not just gaming communities. The internet in general is pretty lame nowadays and it keeps getting worse.

Edited by RichAC, 23 January 2014 - 08:54 AM.


#112 Roland

    Member

  • 8,260 posts

Posted 23 January 2014 - 08:46 AM

Artgathan, on 23 January 2014 - 08:40 AM, said:


I'm not sure that's entirely true, as 90% of your match score is simply how much damage you did. If anyone happens to know exactly how match score is determined (IE: a kill is worth X points, an assist is worth Y...) that would be useful information to have.

Generally, the damage you do is a key indicator of how good you are.

I know some folks don't like hearing that, but it's true.

#113 Roadkill

    Member

  • 3,610 posts

Posted 23 January 2014 - 08:50 AM

lockwoodx, on 22 January 2014 - 04:49 PM, said:

One of these days I'd love for you to make a second account just to experience how bad the average joe has it when it comes to how ELO currently works. PGI decided to cater to the elitists again by speeding up their queue times and dropping them in with low ELO players. It's a blunder that will cost them fresh blood if not corrected soon.

Again, your problem is NOT with Elo. It's with the matchmaker. Your own statement demonstrates this.

The matchmaker is choosing to drop people of differing Elo ratings into the same match. Elo isn't the problem - it just provides the ratings, and it's a proven system. PGI's matchmaker is the problem.

#114 RichAC

    Member

  • 661 posts

Posted 23 January 2014 - 08:51 AM

Roland, on 23 January 2014 - 08:46 AM, said:

Generally, the damage you do is a key indicator of how good you are.

I know some folks don't like hearing that, but it's true.


But damage does not necessarily help your team win, which many of the previous posters correctly pointed out. It's a selfish stat.

But that being said, W/L is the most important stat of all. But all stats should be taken into account, the more the merrier. Ask any professional sports scout.

Edited by RichAC, 23 January 2014 - 08:52 AM.


#115 Artgathan

    Member

  • Knight Errant
  • 1,764 posts

Posted 23 January 2014 - 08:51 AM

Roland, on 23 January 2014 - 08:46 AM, said:

Generally, the damage you do is a key indicator of how good you are.

I know some folks don't like hearing that, but it's true.


I'm one of the people who contests it :D

It's just... hard to work with. Sure, someone who does < 100 damage every game is most likely not a great player, but someone who routinely does 1000+ damage could just be a bad shot (since you actually only need 186 damage to kill an Atlas, assuming you have to burn through all of the CT armor).

It can also lead to the (very strange) situation where instead of killing a mech players would intentionally strip it in order to pad their damage (and thus their "skill").

I'm not saying damage doesn't contribute to your skill level. I'm just wary of using it as the sole metric.

#116 Roadkill

    Member

  • 3,610 posts

Posted 23 January 2014 - 08:55 AM

Sandpit, on 22 January 2014 - 04:44 PM, said:

I'm not asking how the theory works, I'm asking how it's implemented here in MWO. ELO could easily be applied to an individual skill and effort. It's based on a 1v1 theory so that means it could be applied to the individual effort as opposed to the team effort and results.

Yes, but not in the way you implied in your post. The Elo system cannot distinguish between a squeaker of a win and a ROFLstomp. You either win or you lose. That's all it uses.

The fact that you got 6 kills and did 1400 damage cannot be used by an Elo system to give you a "better" win or loss than someone who got a match score of 1.

You can do individual calculations for all of the 1v1 pairings in a game, though. I implemented a system like that once. As it turns out, though, it's a lot of work for no real improvement over just using the team's average Elo score.
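For anyone curious what the two approaches look like side by side, here is a rough sketch. It is not the system mentioned above, just one plausible way to do it: the pairwise version runs a standard Elo update for every winner/loser pairing and averages the result per player, while the other version does a single update on the team averages. K = 32 and all names are assumptions.

```python
from itertools import product
from typing import Dict


def expected_score(r_a: float, r_b: float) -> float:
    # Standard Elo expectation for A against B.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def pairwise_deltas(winners: Dict[str, float], losers: Dict[str, float],
                    k: float = 32.0) -> Dict[str, float]:
    # One 1v1 Elo calculation per winner/loser pairing, averaged per player.
    deltas = {name: 0.0 for name in (*winners, *losers)}
    for (w, r_w), (l, r_l) in product(winners.items(), losers.items()):
        change = k * (1.0 - expected_score(r_w, r_l))
        deltas[w] += change / len(losers)
        deltas[l] -= change / len(winners)
    return deltas


def team_average_deltas(winners: Dict[str, float], losers: Dict[str, float],
                        k: float = 32.0) -> Dict[str, float]:
    # A single Elo calculation on the two team averages, applied to everyone.
    avg_w = sum(winners.values()) / len(winners)
    avg_l = sum(losers.values()) / len(losers)
    change = k * (1.0 - expected_score(avg_w, avg_l))
    return {**{n: change for n in winners}, **{n: -change for n in losers}}
```

When everyone in the match has the same rating the two methods give identical numbers. They only diverge when ratings within a team are spread out, in which case the pairwise version gives the lower-rated winners a bigger bump than their higher-rated teammates.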

#117 Unleashed3k

    Member

  • Death Star
  • 525 posts
  • Location: Germany

Posted 23 January 2014 - 08:57 AM

Roland, on 23 January 2014 - 08:46 AM, said:

Generally, the damage you do is a key indicator of how good you are.

I know some folks don't like hearing that, but it's true.


Sorry, but that's just not right. It may be an indicator, but I can kill 5 people with >400dmg with my gausscat, and on the other hand I often do 700-900dmg or more with fewer kills and, if unlucky, only a few assists as well.

So is killing a Spider counted as less skill in your rating as well, then...?

#118 RichAC

    Member

  • 661 posts

Posted 23 January 2014 - 09:00 AM

Roadkill, on 23 January 2014 - 08:50 AM, said:

Again, your problem is NOT with Elo. It's with the matchmaker. Your own statement demonstrates this.

The matchmaker is choosing to drop people of differing Elo ratings into the same match. Elo isn't the problem - it just provides the ratings, and it's a proven system. PGI's matchmaker is the problem.


The matchmaker can't create people out of thin air like God for you to play with. Not every game is going to be perfect. Some games result in a roll regardless of how close both teams are in ELO, simply because of better team chemistry. PGI commented on this in their latest update for the latest patch.

Post your W/L stats before you start complaining. Or do you get mad when you win because you didn't do enough damage?

They had to widen the ELO gap as it is in the latest patch, because people were having long search times. IMO it's because of all the sync droppers and people quitting the game. What is PGI supposed to do about that? They can't change the mindset of society; it's just a sign of the times. It's the same story in every PC gaming community.
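To show the trade-off between search time and match quality, here is a sketch of one common matchmaking pattern: the allowed rating gap starts narrow and grows the longer a player waits. This is not a description of PGI's matchmaker; every number and name below is a hypothetical placeholder.

```python
def allowed_gap(wait_seconds: float, base_gap: float = 100.0,
                growth_per_second: float = 2.0, max_gap: float = 600.0) -> float:
    # The acceptable rating difference widens with time spent in the queue,
    # up to a hard cap. All three parameters are made-up example values.
    return min(max_gap, base_gap + growth_per_second * wait_seconds)


def can_match(rating_a: float, wait_a: float,
              rating_b: float, wait_b: float) -> bool:
    # Use the tighter of the two windows so both players accept the gap.
    gap = abs(rating_a - rating_b)
    return gap <= min(allowed_gap(wait_a), allowed_gap(wait_b))
```

With a small population the queue waits get long, the windows widen, and you end up with exactly the mixed-rating matches people are complaining about. Tightening the window fixes the match quality only at the cost of longer searches.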

LoL is the one exception, because I guess Koreans are not as big on cheating or being a whiny sore loser so they attract millions.

Edited by RichAC, 23 January 2014 - 09:03 AM.


#119 Roadkill

    Member

  • 3,610 posts

Posted 23 January 2014 - 09:01 AM

Sandpit, on 22 January 2014 - 04:49 PM, said:

OR

More players can start taking up responsibility for their own skill level instead of looking to blame any and everything for losing.

Or you could try reading what you're responding to so that your response is actually relevant.

My post was in response to someone complaining about Elo setting up an unbalanced game. Elo has nothing to do with that. Elo just gives the players a skill rating. It's a proven system - if PGI implemented it properly, it works. So if you're in an unbalanced game, blame the matchmaker, not Elo.

What you say is also true, but it isn't related to what I said.

But to my point, if you suck and the matchmaker is working properly, you should be paired up against other people who suck. Even bad players can have fun games if the matchmaker sets them up correctly.

#120 Joseph Mallan

    ForumWarrior

  • FP Veteran - Beta 1
  • 35,216 posts
  • Location: Mallanhold, Furillo

Posted 23 January 2014 - 09:03 AM

Artgathan, on 23 January 2014 - 08:07 AM, said:


Yes. I know it seems counter-intuitive, but it apparently is true (based on the stats you presented). Consider an infantry section attack:

Your section is taking enemy fire, but you can't locate the enemy. Your section commander orders you to take a bound so that they can determine the enemy's location. During your bound you get shot, but the rest of your section is able to locate and destroy the enemy. Did your section win? Yes. Does it suck to be you in this scenario? Yes.

Helping your team doesn't always mean "being a bada**". Sometimes the way you help is more subtle.

You put out a nice argument for walking dead players everywhere. I don't agree with you, but I do like your presentation. :D




