
Elo Worthless


298 replies to this topic

#181 Odin

    Member

  • Legendary Founder
  • 498 posts

Posted 16 November 2013 - 12:32 PM

Abivard, on 11 November 2013 - 08:40 PM, said:

ELO assigned to individual players based on a random team's w/l ratio cannot work in a team game where the team can and does change.

No escaping that fact.

Indeed, ELO is worthless in MWO.

Either a more complex system taking many factors besides w/l is needed, or scrap ELO and go for something else.



Simple logic behind this statement.
True.

Random teams cause random results based on many factors, of which your skill is of course one.
But you alone can't make noobs or bad players win, meaning do more damage, get more kills, or behave smartly on the battlefield.
So there are drops in MWO which are simply not winnable for you if you PUG.

Weapon nerfs all around mean more team play is needed; that's what PGI is doing. So it's not your skill, and it can't be, because one guy never has the firepower to be the deciding factor.


ELO as is, is worthless.

Edited by Odin, 17 November 2013 - 12:16 AM.


#182 Duncan Aravain

    Member

  • Legendary Founder
  • 416 posts
  • Location: Behind you with a sharp tool...er,mech

Posted 16 November 2013 - 12:42 PM

Roadbeer, how's that spreadsheet tracking program going? ... You may now resume the ELO seminar.

#183 Roadbeer

    Member

  • Elite Founder
  • 8,160 posts
  • Location: Wazan, Zion Cluster

Posted 16 November 2013 - 12:44 PM

Real work has been getting in the way this month, hate when that happens.

#184 Onlystolen

    Member

  • Warrior - Point 3
  • 253 posts
  • Location: Fantastic Planet

Posted 16 November 2013 - 01:47 PM

Elo doesn't mean a thing when you drop 315 tons shy of the other team in conquest mode on Forest Colony.

#185 Abivard

    Member

  • Shredder
  • 1,935 posts
  • Location: Free Rasalhague Republic

Posted 16 November 2013 - 04:46 PM

How PGI measures Elo is the problem more than the Elo system having been jury-rigged by PGI in an attempt to work with dynamically created teams.

Of course 1 fail + 1 fail = major fail. No surprise there.

Just because an Airplane can use an internal combustion engine to fly in atmosphere does not mean an internal combustion engine will work in a spaceship in outer space.

Why is it so few people can see this is about the same as what PGI is doing with Elo?

#186 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 16 November 2013 - 04:50 PM

Roadbeer, on 16 November 2013 - 11:45 AM, said:

All I have left to add to this discussion.

suc·cinct: adj

1. marked by brevity and clarity; concise
2. compressed into a small area


Sorry for the walls of text, just feeling like I'm repeating the same thing again and again and again, hoping that if I try something different it will make more sense to some people.

succinct version -

Elo works. It's math. Feelings are not math. Opinions and memory are untrustworthy.

Only win/loss can be used or else it's easy to game the system.

With Elo you are dropping with/against more people closer to your skill than without it. No Elo is worse.

Sometimes you can play well and still lose. It is sad. Sometimes you play poorly and still win. It is good. Play well and win more.

There you go.
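
For readers following the "it's math" point, here is a minimal sketch of the standard Elo expected-score and update formulas. How PGI actually applies Elo to 12-player teams is not stated in this thread, so averaging the team ratings and the K-factor of 32 are assumptions made only for illustration, and `team_rating` is a hypothetical helper.

```python
from typing import Sequence


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score (win probability) for side A versus side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """Standard Elo update. actual is 1.0 for a win, 0.0 for a loss.

    k = 32 is an assumed K-factor, not a value taken from the game.
    """
    return rating + k * (actual - expected)


def team_rating(ratings: Sequence[float]) -> float:
    """Hypothetical team rating: the plain average of the members' ratings."""
    return sum(ratings) / len(ratings)


# Two evenly rated teams: each side is expected to win half the time.
team_a = [1500.0] * 12
team_b = [1500.0] * 12
p_a = expected_score(team_rating(team_a), team_rating(team_b))  # 0.5

# Every player on the winning side gains the same amount in this sketch.
new_rating = update_rating(1500.0, p_a, actual=1.0)  # 1516.0
```

Note that with evenly rated teams the expected score is exactly 0.5, so in this sketch a win or loss moves every player on a team by the same amount regardless of individual performance.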

#187 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 16 November 2013 - 04:53 PM

Abivard, on 16 November 2013 - 04:46 PM, said:

How PGI measures Elo is the problem more than the Elo system having been jury-rigged by PGI in an attempt to work with dynamically created teams.

Of course 1 fail + 1 fail = major fail. No surprise there.

Just because an Airplane can use an internal combustion engine to fly in atmosphere does not mean an internal combustion engine will work in a spaceship in outer space.

Why is it so few people can see this is about the same as what PGI is doing with Elo?

False equivalence.

Okay, where is your math to back it up? Show 50 or 100 games demonstrating in some statistical manner that it's not actually working. At this point I'd really, really love to see that from anyone at all. We've got Roadbeer posting results and processes, and I've given the actual formula for Elo as well as the math behind probability theory and statistical modeling.

Examples please.

#188 Roland

    Member

  • 8,260 posts

Posted 16 November 2013 - 05:05 PM

MischiefSC, on 16 November 2013 - 04:53 PM, said:

False equivalence.

Okay, where is your math to back it up? Show 50 or 100 games demonstrating in some statistical manner that it's not actually working. At this point I'd really, really love to see that from anyone at all. We've got Roadbeer posting results and processes, and I've given the actual formula for Elo as well as the math behind probability theory and statistical modeling.

Examples please.

Edit... Lol, sorry, wrong thread. I thought this was the other thread about how terrible the matchmaker is. I'll leave the post though, since whatever. Terrible matchmaker.

I suspect there simply aren't enough folks left who care enough to bother compiling any data.

Honestly, there simply should never be games like the one seen above. Mismatching tonnage to such a degree, in some misguided attempt to match something as nebulous as skill, leads to bad games.

I've said this before and I'll say it again. Tonnage matching should always take precedence over all else. If I lose to someone better than me, that is fine. That is fair. Indeed, that is the definition of fair.

But if I lose because the matchmaker goofed the tonnage, then it is inherently unfair. Even if I am better than the opposition, and the tonnage was skewed intentionally to try and handicap the match, such a game will feel unfair. There are few things less fun than a game which feels unfair.

Edited by Roland, 16 November 2013 - 05:09 PM.


#189 Sarsaparilla Kid

    Member

  • The Benefactor
  • 664 posts
  • Location: Gold Country

Posted 16 November 2013 - 06:03 PM

I, for one, have started tracking my own data, on Assault mode only, and will also look at tonnage differences between teams. Will take some time, I'm sure, but could be fun, too.
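
A rough sketch of what such a personal log could look like, assuming you note the result and both teams' total tonnage after each match. The record layout, the function names, and the 100-ton cutoff are all hypothetical, and the sample rows are made up purely to show the shape of the data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MatchRecord:
    won: bool
    own_tonnage: int    # total tonnage of your team
    enemy_tonnage: int  # total tonnage of the other team


def win_rate(matches: List[MatchRecord]) -> Optional[float]:
    """Fraction of matches won, or None if there is nothing to count."""
    if not matches:
        return None
    return sum(m.won for m in matches) / len(matches)


def split_by_tonnage_gap(
    matches: List[MatchRecord], gap: int = 100
) -> Tuple[Optional[float], Optional[float]]:
    """Win rate when outweighed by at least `gap` tons, and win rate otherwise."""
    outweighed = [m for m in matches if m.enemy_tonnage - m.own_tonnage >= gap]
    rest = [m for m in matches if m.enemy_tonnage - m.own_tonnage < gap]
    return win_rate(outweighed), win_rate(rest)


# Made-up rows, not real match data.
sample_log = [
    MatchRecord(won=False, own_tonnage=600, enemy_tonnage=915),
    MatchRecord(won=True, own_tonnage=780, enemy_tonnage=760),
    MatchRecord(won=True, own_tonnage=700, enemy_tonnage=720),
    MatchRecord(won=False, own_tonnage=810, enemy_tonnage=800),
]

overall = win_rate(sample_log)
outweighed_rate, other_rate = split_by_tonnage_gap(sample_log)
```

Splitting the log this way keeps the tonnage question separate from the Elo question, which is the distinction several posts above are arguing about.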

#190 MrEdweird

    Member

  • The 1 Percent
  • 273 posts

Posted 16 November 2013 - 06:13 PM

Well, "the only thing that matters is your skill" doesn't make much sense if Elo were to match you with people who are exactly at your skill level. In that scenario, you need a random factor to give you a chance to win; otherwise, how is your "better skill" going to win against other people's... well... equally good skills?

#191 Abivard

    Member

  • Shredder
  • 1,935 posts
  • Location: Free Rasalhague Republic

Posted 16 November 2013 - 09:07 PM

MischiefSC, on 16 November 2013 - 04:53 PM, said:

False equivalence.

Okay, where is your math to back it up? Show 50 or 100 games demonstrating in some statistical manner that it's not actually working. At this point I'd really, really love to see that from anyone at all. We've got Roadbeer posting results and processes, and I've given the actual formula for Elo as well as the math behind probability theory and statistical modeling.

Examples please.


It is not working because it is not possible for it to work. No surprise there.

Elo = A system to rank chess players by individual ability. Can be used for many things where one on one play is involved.

MWO= Multiplayer team based game, with dynamically changing teams.

What part of 'Elo is not applicable to MWO' do you not understand?

What you are implying is the same as the statement about monkeys randomly typing out the complete works of Shakespeare. It may be possible in theory, but it is not likely to ever happen.

It most certainly is not something that can be done consistently!

You ask for empirical evidence to support my position, but in return I must accept your interpretation of these theories as they may apply to MWO with no evidence whatsoever?

Yet the mathematical theorems you present refute your position. You seem to be just trying to baffle us with walls of text as well as hoping to prey upon the ignorance and pride of others.

#192 Nightfire

    Member

  • Legendary Founder
  • 226 posts
  • Location: Australia

Posted 16 November 2013 - 11:11 PM

MischiefSC, on 16 November 2013 - 11:34 AM, said:

The gist of probability theory is that all the random variables injected by your 11 teammates will, over time, balance out. The more variables, the more time it takes. More time means, more precisely, more matches. More matches = more data.


Well yes and no. It works this way so long as the variable you are measuring is at least significant in determining the outcome you are looking to observe. Assuming that more influencing variables simply means it takes more samples for the formula to work things out is erroneous. Making changes on one variable based on the outcome of a variable not directly linked to the event you are attempting to measure means you are just playing with numbers. There is no real logic behind it.
This does make sense when you work with discrete entities (single players, static teams) because all those unmeasured variables do work in concert to achieve the measured, observable outcome. The entity in question does have a significant influence on the desired metric. Since your argument jumps around a bit, I'll come back to this point at (1).

Quote

While you don't know the relative Elo of the players in a match, the matchmaker itself does. Thus it's able to use that data to award or remove points from a player based on their performance as an aggregate over numerous games.


Ok, 2 points in here. I'll address the second and come back to the first. The matchmaker may well know your Elo rank and the Elo rank of those in the team but, as I have previously stated, it is not simply an imprecise measure, it is an irrelevant measure of what you are attempting to balance. Namely, player skill.

Quote

The point that you come back to and that others have come back to is that 'a player's influence on a match is insignificant'. This is absolutely and patently false. You're ~8.333% of your team's performance. In a single match the pendulum swing of probability can wash a good game out right along with a bad game. The difference is that 1 game in 12 where your specific performance was the deciding factor. Or, more to the point, over those 12 games your overall influence on the probability of winning/losing the match. The criteria by which the matchmaker pulls other people into your matches is the exact same criteria by which it does so for everyone else. Waves rise, waves fall, ocean stays the same. You're swimming in the same ocean as everyone else. In the short term it may feel like you're all swimming and going nowhere but over time those who swim harder and faster absolutely will end up further ahead while those who don't will lag further behind.


Ok, I think I can see where you are failing now.
Firstly, a single player's impact on a match in a positive way (the way that actually requires skill) is not significant. I mean this in the sense of statistically significant. A single player may well be 1 in 12, but that does not mean that 1 in 12 matches will come down to that player's performance. It is a non sequitur; there is no basis for assuming that connection.
Additionally, while it may apply the same metrics to everyone, that doesn't make those metrics appropriate. The matchmaker is ranking this player and all other players in the match by the performance of past teams. This only makes sense if the player in question has a significant influence on the outcome of the match. Now here's the thing that lends Elo credibility. The statistical outliers (good and bad) are going to influence the outcome of a match. There aren't many of them, but there are enough to point to and say it works. Additionally, there are those who, by pure random chance, are going to reach a point where they gel with the people they are playing with. You can point to these players also and state "see, the system works!"
So you have a pool of people who claim it works and works perfectly. You have another pool of people who claim it does not. Now while you can claim "just play long enough and you will get to a point where the matchmaker will make you great matches", this is true only by random chance, since you require the right mix of teams, ranked by insignificant metrics, over an extended duration to get to this point.
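
Both sides of this disagreement can be put into a toy simulation. The sketch below is an assumption-laden model, not a description of MWO: every other player's skill is drawn from a standard normal distribution, a team's strength is simply the sum of its members' skills, and the stronger team always wins. The function names and the `edge` parameter (how many standard deviations above average "you" are) are hypothetical.

```python
import math
import random


def simulate_win_rate(edge: float, matches: int = 20000,
                      team_size: int = 12, seed: int = 0) -> float:
    """Simulated win rate of one player with a fixed skill edge on random teams."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(matches):
        # Your team: you plus (team_size - 1) random teammates.
        own = edge + sum(rng.gauss(0.0, 1.0) for _ in range(team_size - 1))
        # Enemy team: team_size random players.
        enemy = sum(rng.gauss(0.0, 1.0) for _ in range(team_size))
        if own > enemy:
            wins += 1
    return wins / matches


def theoretical_win_rate(edge: float, team_size: int = 12) -> float:
    """Closed form for the same toy model: Phi(edge / sqrt(2 * team_size - 1))."""
    spread = math.sqrt(2 * team_size - 1)  # std dev of the 23 random skills combined
    return 0.5 * (1.0 + math.erf(edge / spread / math.sqrt(2.0)))


average_player = simulate_win_rate(edge=0.0)
strong_player = simulate_win_rate(edge=2.0)
```

Under these assumptions, a player two standard deviations above average wins about 66% of matches rather than 50%, which is a real but noisy signal. Whether actual matches behave like this model is exactly what the thread is disputing.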

Quote

You are exactly correct that a completely average player really isn't going to affect the win/loss rate of his team. That is in fact exactly the point - their Elo, their win/loss, will remain neutral. Put more specifically you will either win more games than you lose or conversely lose more games than you win until you're routinely matched with players with and against whom you'll have a 50/50 win/loss rate. For the broad majority this is exactly what does happen. For a small percentage at the top they'll settle in at a point where they win more than they lose and stay static, at the bottom there will be a group who loses more than they win and stay static. There's a narrow band right under the 'win more' band that will actually lose more and stay static and a tiny band right above the 'lose more' that wins more and stays static.


Ok, do you really enjoy taking things out of context? Average as in statistically average, that is, not exceptional, not an outlier. You touch on another point again, which will be expanded on at (1).

Quote

The point of splitting pug and group Elo is for the exact reasons you've stated - you play differently and perform differently when playing in an organized group than when you pug. Thus your corresponding performance is different, that's why the Elo scores for them need to be separated. To make Elo more accurate.


Ok, so you agree that the current implementation is flawed and is not acceptable? That the fact that workarounds exist to circumvent matchmaker behaviour is evidence of a flawed system?
I'll settle for an acceptance on that point as a step towards common ground.

Quote

As to allowing larger player groups - grouping provides a significant advantage. I'm all for it but ideally groups should play against groups. I'm all for bringing back 8mans though. Easier to find than 12. We had the matchmaking you're talking about before and most people hated it. Search 'premade' on the forums from before August. People hated it more than they hated 3PV. For a reason and a good one. That won't be making a return, nor should it.


Ok, First point: I've been playing since early closed beta, founder, etc. Please don't lecture me as if I'm not aware of the past. I know exactly what transpired and how. As previously stated, I was all for some mechanism of regulating matchmaking. I was just against Elo as the specific mechanism for many reasons that are now evident.
Second point: At no point was I advocating a return to pure randomness. By stating that it won't be making a return you imply that I am advocating such. I am not; play classy if you're going to play.
Third point: If you want to search back in the forums, you will find just as much raging against Elo as you will against the era of "Pug Stomping". The raging is indicative of a problem in the system, both in the random system and in the current Elo system. I'll play hypothetical and correct me if I'm wrong. One might assume you play mostly public and as such are better served by being protected by the Elo system from co-ordinated groups. As such, your experience of Elo may well be a positive one and any challenge to it may be seen as a desire to return you to the days of being stomped. If so, I'm not advocating that but I am advocating change because while the current system is working for you, it isn't working for others.
Telling these people to "Just play more, it'll get better. You have to suffer a (long) while before you get to your sweet spot" is the equivalent of being told in the random system, "Get in TS, get in a group and learn to play better. Then your experience will be better!" Simply because others are suffering instead of you now doesn't mean you should deny their experience for the same reason the Pubs in the random system shouldn't have been. They are all players and if they have a consistently bad experience, they will just leave and from what I can see, the organised players have been doing just that since Elo came in. This isn't a good thing; don't kid yourself that it is.
Fourth point: Groups shouldn't be isolated from the public queue. I know you have a significant bias against grouped players and think they should just go play away from where you play. This isn't a jab but rather an observation of your past posts; we've clashed on this point before. Any matchmaker system that is in use should encourage organised play. In-game voice is a start to creating some form of cohesion in purely public groups. Public players should be able to form ad-hoc groups without suddenly becoming the devil "group". The random public queue is just that, and while people should be matched in roughly different pools, the current Elo implementation isn't the system to do it with.

Continued next post

Edited by Nightfire, 16 November 2013 - 11:23 PM.


#193 Nightfire

    Member

  • Legendary Founder
  • 226 posts
  • Location: Australia

Posted 16 November 2013 - 11:21 PM

Quote

So let me ask you this - do you win more matches than you lose? Yes or no. Do you win or lose more matches now than you did before Elo came in? I know that I, Roadbeer and several other people have been tracking that statistically and for us at least we can absolutely see the impact that Elo has had in the aggregate. If you are a statistical aberration then show it. Track some games and show how you're consistently losing more than you're winning while still getting tons of kills and damage every match. Not cherry picked but 20, 50, 100 consecutive games.


Actually, that's a rather loaded question in my case. Hit registration is terrible for me. I routinely get situations where I do 800+ points to 2 mechs and neither loses anything, let alone is destroyed. When the fundamental mechanics of the game are broken, looking at my win/loss is not helpful. I did track it for a while until I realised it was meaningless.
So that is: Lose more than win now (~0.6 win rate), won more than lost before Elo and yes, I know how statistics work. No I'm not sour about Elo, I'm actually more irritated at hit registration than Elo.

Quote

As to basing score on something other than win/loss -

Can't happen. Every single other metric is easily skewed. Make it match score? Fine, I'll boat LRMs or LBX and SRMs and while I'll win fewer games I'll do a higher average amount of damage and component destruction. KDR? No problem, PPC/AC sniping and kill stealing will push me up the charts. Win/loss is less precise a metric to measure personal performance but it's the only trustworthy one. Over enough games your impact on the performance of your team can be measured. More matches, more data, more precision.


*sigh* Yes, yes it can happen. I'll grant it is somewhat more difficult with the metrics we have been given to work with, but win/loss is not even statistically representative. Pointing out how useless the metrics we have access to are as individual indicators is a straw man. Come on, surely you can think of better, albeit more complicated, metrics? For scouting roles: number of spots, number of tag assists, number of ECM counters, etc. There are many different ways you could combine stats to create a ranking, and many different ways it could be measured. Falling back on "Win/loss is less precise a metric to measure personal performance but it's the only trustworthy one" is lazy thinking. I challenge you to put your intellect to the problem. This would of course require acknowledging that Elo has problems.

Quote

I'm all for making Elo public as well. The bracket system you're talking about would require tracking peoples win/loss - you're literally just talking about a clumsier version of Elo. So make Elo public, hell have the end of round screen show the Elo impact of the match and give relative Elo scores next to each player in the game.


Making Elo public only serves to quell the competitive crowd for a short time. It's a flawed system from the outset for dynamic groups, and while having something visible lets people boast, it will be quickly ignored once competitive play comes in. Then you'll see the requests from Merc. Corps for stats that really do matter.

Quote

Elo is tied to player performance and metrics - win/loss. Which you can and do impact. I absolutely get the desire to have something specific that you can play towards. Damage, score, kills. Something you can plan and build towards. The problem is that it only rewards a specific behavior as I mentioned before and is absolutely going to get abused and at the end of the day...


I think we'll have to agree to disagree. You don't impact it, not directly. The TEAM impacts this directly and as previously stated, a single (statistically average) player does not impact this in a statistically significant way. As for rewarding specific behaviours, you talk as if this is a bane!? Designing a ranking system that encourages particular styles of play (ie: Team Play) is what a good system should do!

Quote

It's winning that matters. How much do you do, every single game, to secure the win. That's what Elo measures. It does exactly what probability theory and statistical modeling says it should do and it does it as accurately as is possible given the criteria and population density.


You're right, Elo reflects winning and winning is ALL that matters. Winning is determined by the team, over which you have no significant influence. Winning, as a metric, only reflects whether you and this team played better or worse than the players on the other team. It does not do what you are claiming because you are not capturing all the variables that have a significant impact on the outcome. I've seen many statistical models (particularly weather simulations) arrive at completely erroneous outcomes because of this flaw.
I'll say it again. By having your control variable (Elo) determined/influenced by a measured variable (win/loss) that is not directly tied to your desired measured outcome (player skill), you are hoping for a long-term correlation that may or may not eventuate.
In essence, if I may use your ocean analogy, you are hoping a weak current will over time push a boat in one direction, while assuming that stronger variables such as the wind will over time become an insignificant factor. Only the wind doesn't always let up or, if it ever does, not for long enough. You need to also account for the wind.
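
The "weak current versus wind" picture can be given rough numbers. The sketch below uses the usual standard-error approximation for a win rate near 50% to estimate how many matches it takes before a small personal edge stands out from noise. The function name is hypothetical, and treating matches as independent coin flips is an assumption that real matchmaking may well violate.

```python
import math


def matches_needed(edge: float, z: float = 1.96) -> int:
    """Approximate matches needed to tell a win rate of 0.5 + edge from 0.5.

    Uses the standard error sqrt(0.25 / n) of a win rate near 50% and asks
    for the edge to be at least z standard errors (z = 1.96 is roughly 95%).
    """
    if edge <= 0:
        raise ValueError("edge must be positive")
    return math.ceil((z * 0.5 / edge) ** 2)


ten_point_edge = matches_needed(0.10)   # 60% win rate versus 50%
five_point_edge = matches_needed(0.05)  # 55% win rate versus 50%
```

Under this approximation a 60% win rate needs on the order of a hundred matches to show up clearly, and a 55% win rate needs several hundred, so a sample of 20 or 50 games is unlikely to settle the argument either way.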

MischiefSC, on 16 November 2013 - 04:50 PM, said:

Sorry for the walls of text, just feeling like I'm repeating the same thing again and again and again, hoping that if I try something different it will make more sense to some people.


Perhaps there is something in that other than other people being dense? Something to think about?

Quote

succinct version -

Elo works. It's math. Feelings are not math. Opinions and memory are untrustworthy.


Not in this situation.
It is maths, it just doesn't account for the variables it should.
Opinions and memory are indeed untrustworthy as a tool for factual recall, totally agreed. That doesn't make them unimportant though. If a system in a game that is designed to enhance enjoyment leaves the impression for a good portion of the population that it doesn't, then something is wrong. Facts and statistics don't make people stay. Opinions and memory do make people leave though. Expectation management is something that should be integral to the system.

Quote

Only win/loss can be used or else it's easy to game the system.


Disagreed. What you call gaming the system, I call a feature that can be used to encourage desired behaviour.

Quote

With Elo you are dropping with/against more people closer to your skill than without it. No Elo is worse.


Two points here.
First: Not always. As I have stated, Elo "working" is not a reliable metric.
Second: You seem to equate all opponents of Elo as advocating a return to pure random matchmaking. Not just here but in other posts also. That is black and white thinking, cluster B thinking, a "You're either with us or against us" mentality. If this is what you really think, tell me now so I can abandon this conversation because such people don't change their minds until it benefits them.


Quote

Sometimes you can play well and still lose. It is sad. Sometimes you play poorly and still win. It is good. Play well and win more.


No, have a team that plays well and win more. The rest I agree with because in the end, it comes down to the team you are put with.
(1) Now for my additional point:
You touch on this point but never actually discuss it, namely the 50/50 win/loss. You assume that this is a good thing; it isn't. Competitive players play for progress; remove the indicators of progress and they will quickly stop playing. Competitive players want to win. While it is unfortunate that for some people to win more often, others need to lose more often, competitive players accept this as long as there are clear paths to improvement. The desire behind Elo to reduce everyone to an apparent mediocrity is self-defeating. Competition is something that drives competitive players, and any system that removes those indicators from a competitive game is doomed to a slow death.

Edited by Nightfire, 16 November 2013 - 11:24 PM.


#194 Mystere

    Member

  • Bad Company
  • 22,783 posts
  • Location: Classified

Posted 16 November 2013 - 11:44 PM

Roland, on 16 November 2013 - 05:05 PM, said:

Tonnage matching should always take precedence over all else...


Isn't that what we essentially had before people started whining loudly and endlessly about not having a "proper" matchmaking system?

People should really be careful what they wish for.

#195 Sug

    Member

  • The People's Hero
  • 4,629 posts
  • Location: Chicago

Posted 16 November 2013 - 11:53 PM

Damn. My quote got edited out of Nightfire's response.

#196 Nightfire

    Member

  • Legendary Founder
  • 226 posts
  • Location: Australia

Posted 17 November 2013 - 01:10 AM

Sug, on 16 November 2013 - 11:53 PM, said:

Damn. My quote got edited out of Nightfire's response.

Sorry man, it was playing silly buggers, telling me there weren't a matching number of quote and /quote tags.
I'll summarise what I did say though:
He didn't say that in-game metrics matter at all, save that good in-game performance is likely to help your team win more. But then Elo will likely "balance" you out with some other players that will promptly get themselves killed, dragging you and your good performance down with them. Cynical huh? ;)

#197 Ghogiel

    Member

  • CS 2021 Gold Champ
  • 6,852 posts

Posted 17 November 2013 - 01:24 AM

Roland, on 16 November 2013 - 05:05 PM, said:

Tonnage matching should always take precedence over all else.

I'll face scrubs in their assaults all day while I'm in my assault. They have the same weight as me, so it's fair, right!

#198 Roland

    Member

  • 8,260 posts

Posted 17 November 2013 - 02:08 AM

Ghogiel, on 17 November 2013 - 01:24 AM, said:

I'll face scrubs in their assaults all day while I'm in my assault. They have the same weight as me, so it's fair, right!

Honestly? Yeah, it is fair.

Fair means that the better player wins.

But if you have two groups with equal elo, and one has way more tonnage? Then no, that's not fair.

#199 KinLuu

    Member

  • Ace Of Spades
  • 1,917 posts

Posted 17 November 2013 - 02:28 AM

If you claim that you have no statistically significant influence on your team's chance of winning, you must have reached your true Elo.
But I consider that to be very unlikely, especially in a game with such a small player pool.

Are you sure that you do not have a statistically significant NEGATIVE impact on your team's chance of winning?
Australians are heavily disadvantaged in this game.

#200 Ghogiel

    Member

  • CS 2021 Gold Champ
  • 6,852 posts

Posted 17 November 2013 - 03:41 AM

Roland, on 17 November 2013 - 02:08 AM, said:

Honestly? Yeah, it is fair.

Fair means that the better player wins.

But if you have two groups with equal elo, and one has way more tonnage? Then no, that's not fair.

There is actually an Elo rating for each weight class.




