
Elo Vs Rpi Player Ratings

elo is bad mkay

46 replies to this topic

#21 Surn

    Member

  • The God of Death
  • 1,073 posts
  • Twitter: Link
  • Location: San Diego

Posted 16 May 2018 - 10:05 AM

MischiefSC, on 16 May 2018 - 08:59 AM, said:

Except a matchmaker is matching people based on their ability to win matches - not do damage. LRMs do damage, lots of it. HGauss do less total but kill better. MGs are great for kill stealing but don't indicate how likely you are to.... You know.

Win the match.

There is no value, purpose or reason to build a matchmaker whose purpose is to balance odds of winning around anything but actual winning.

That's why Elo is a thing.


Except that is wildly in error when evaluating how well a player performs.

If a decent player beats a great player because he is using a far superior mech, ELO is junk.

If a good player beats a decent player because he has the correct config for the environment, ELO is junk.

If a bad player barely loses to a great player because he or his config or his ability to choose where to fight is improving, ELO is junk.

If a bad player loses to a great player or vice versa, ELO is fine because it is just basic.

#22 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 16 May 2018 - 10:58 AM

So if a good player plays bad mechs regularly, his Elo will drop to reflect that. Elo is not wildly inaccurate - it reflects how likely someone is to beat someone else. Be that because of skill or mech choice, winning is winning.

Or just give each mech an Elo score and average it based on relative performance value with the player's Elo.

And FFS it's Elo, not ELO. Elo is the last name of Arpad Elo who created the equation upon which the system is based. ELO stands for Electric Light Orchestra.


Edited by MischiefSC, 16 May 2018 - 11:00 AM.
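
For readers following the argument, here is a minimal Python sketch of the standard Elo calculation. The 400-point scale and K = 32 are the conventional chess values and are assumptions here, not anything confirmed about this game's matchmaker; the function names are made up for illustration. The expected score is what makes a win over a stronger opponent worth more than a win over a weaker one, and K scales how far a single result moves the rating.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Expected score (between 0 and 1) for `rating` against `opponent`.

    Uses the usual logistic curve with a 400-point scale (assumed).
    """
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update(rating: float, opponent: float, score: float, k: float = 32.0) -> float:
    """Return the new rating after one game.

    `score` is 1 for a win and 0 for a loss; `k` (assumed 32) caps how
    many points a single game can move the rating.
    """
    return rating + k * (score - expected_score(rating, opponent))


if __name__ == "__main__":
    # Same 1500-rated player, same result (a win), different opponents.
    for opponent in (1100, 1500, 1900):
        gain = update(1500, opponent, 1) - 1500
        print(f"win vs {opponent}: +{gain:.2f}")
```

With these assumed constants, a 1500-rated player gains about 29 points for beating a 1900 and about 3 for beating an 1100, so the same win counts very differently depending on who it was against.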


#23 Surn

    Member

  • The God of Death
  • 1,073 posts
  • Twitter: Link
  • Location: San Diego

Posted 16 May 2018 - 02:31 PM

MischiefSC, on 16 May 2018 - 10:58 AM, said:

So if a good player plays bad mechs regularly, his Elo will drop to reflect that. Elo is not wildly inaccurate - it reflects how likely someone is to beat someone else. Be that because of skill or mech choice, winning is winning.

Or just give each mech an Elo score and average it based on relative performance value with the player's Elo.

And FFS it's Elo, not ELO. Elo is the last name of Arpad Elo who created the equation upon which the system is based. ELO stands for Electric Light Orchestra.


And as I stated earlier, Elo is basic and based on assumptions of a normal distribution. We have far more data available than wins and losses, things that can account for close games, like scores... of which there are none in chess.

There is a reason college basketball uses RPI.

For God's sake, Elo has no accounting for who or what you played. Don't even get me started if there are teams involved...

Edited by Surn, 16 May 2018 - 02:37 PM.


#24 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 16 May 2018 - 03:06 PM

Except amount of data available is irrelevant. We do have a normal distribution because in every match 1 side wins, 1 side loses. Total wins = total losses, save for a draw which is loss/loss.

It does account for who you played. That's the K factor. That's why you don't gain or lose the same amount no matter who you've played.

Including any data *except* win/loss in a matchmaker will create an inherently broken matchmaker.

NCAA uses RPI because everyone isn't in the same pool. It is used to correlate teams that play regionally with national rankings. It is also 75% based on what is functionally an Elo system without a K factor - win/loss. It is also acknowledged as a flawed predictor specifically because of this - however it is intentionally flawed as a predictor because that makes it harder to try and manipulate from a betting/bookmaking perspective.

RPI is literally acknowledged by the NAACP as a statistically inferior system but they use it for other reasons.
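
For reference, a minimal Python sketch of the basic college basketball RPI formula as it is commonly described: 25% the team's own winning percentage, 50% its opponents' winning percentage, and 25% its opponents' opponents' winning percentage. The function names are hypothetical, and the sketch ignores refinements such as home/away weighting, so treat it as an illustration rather than the NCAA's exact procedure.

```python
def win_pct(wins: int, losses: int) -> float:
    """Winning percentage as a fraction; 0.0 if no games were played."""
    games = wins + losses
    return wins / games if games else 0.0


def rpi(wp: float, owp: float, oowp: float) -> float:
    """Basic RPI from three winning percentages.

    wp   - the team's own winning percentage
    owp  - average winning percentage of its opponents
    oowp - average winning percentage of its opponents' opponents
    """
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


if __name__ == "__main__":
    # A 20-5 team whose opponents, and their opponents, are all .500 teams.
    print(round(rpi(win_pct(20, 5), 0.5, 0.5), 3))
```

Every input is a winning percentage, so the basic formula is built entirely from wins and losses; three quarters of the weight sits on the schedule (who you played) and one quarter on your own record.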

#25 Y E O N N E

    Member

  • The Nimble
  • 16,810 posts

Posted 16 May 2018 - 04:40 PM

Surn, on 16 May 2018 - 10:05 AM, said:


Except that is wildly in error when evaluating how well a player performs.

If a decent player beats a great player because he is using a far superior mech, ELO is junk.

If a good player beats a decent player because he has the correct config for the environment, ELO is junk.

If a bad player barely loses to a great player because he or his config or his ability to choose where to fight is improving, ELO is junk.

If a bad player loses to a great player or vice versa, ELO is fine because it is just basic.


As every single one of those things is under the player's control, I fail to see how any of that renders Elo to be junk.

#26 Surn

    Member

  • The God of Death
  • 1,073 posts
  • Twitter: Link
  • Location: San Diego

Posted 16 May 2018 - 08:33 PM

MischiefSC, on 16 May 2018 - 03:06 PM, said:

Except amount of data available is irrelevant. We do have a normal distribution because in every match 1 side wins, 1 side loses. Total wins = total losses, save for a draw which is loss/loss.

It does account for who you played. That's the K factor. That's why you don't gain or lose the same amount no matter who you've played.

Including any data *except* win/loss in a matchmaker will create an inherently broken matchmaker.

NCAA uses RPI because everyone isn't in the same pool. It is used to correlate teams that play regionally with national rankings. It is also 75% based on what is functionally an Elo system without a K factor - win/loss. It is also acknowledged as a flawed predictor specifically because of this - however it is intentionally flawed as a predictor because that makes it harder to try and manipulate from a betting/bookmaking perspective.

RPI is literally acknowledged by the NAACP as a statistically inferior system but they use it for other reasons.


The NAACP, lol, you mean the NCAA. Nowhere is RPI declared inferior to Elo, that is just not true. Nothing is perfect, but it is such a superior system, especially when applied to players on teams, that it is confounding that you don't understand it.

Further, because we have far more factors at our fingertips than just W/L and strength of schedule, our representation of W/L can be much more accurate than 75%, which is targeted at a 25+ game sample. Thus, the SOS is relative between teams, so small differences still allow a ranking of SOS in college basketball.

As to different pools, the flexibility to account for different environments is fundamental to a battle simulation.

#27 Naqser

    Member

  • The Deadly
  • 60 posts

Posted 17 May 2018 - 10:58 AM

Surn, on 15 May 2018 - 02:12 PM, said:


Again, already have 10 years of experience with this in a MechWarrior based league. Making a combat RPI is just a matter of integrating damage done and changing from tonnage to tonnage with a chassis and model modifier.


And as I said, the more factors you implement into an algorithm, the more likely it is that the algorithm becomes exploitable.

Just implementing damage alone slightly shifts the focus from winning to doing damage as well, which is partly a selfish thing.
Don't know if you've played Halo: Reach at launch, but it had a skill-based playlist which was team-based, yet the ranking was done per individual. Why did they try that? Because people complained about crap players being carried, and crap players bringing their team down. What happened in the playlist? The troll's dream. It was a team-based FFA playlist. There were a few players you weren't allowed to kill, but you competed against them either way. A few months in, they changed the ranking to a W/L system.

If you shift focus from W/L, to other things, players tend to start min-maxing those things as well.
"I need to do a lot of damage because that helps my rank"
"If I play this way ( which may not help the team at all ), I can get a lot of that specific thing and my skill rank will be good for it"

#28 BTGbullseye

    Member

  • The Solitary
  • 1,540 posts
  • Location: I'm still pissed about ATMs having a minimum range.

Posted 17 May 2018 - 08:38 PM

And a pure win/loss rating for anything other than determining if you won or lost, is useless. It says very little about skill when taken out of context, like with what mech you're using. (a meta Anni takes a lot less skill to win with than a BlAsp)

#29 Naqser

    Member

  • The Deadly
  • 60 posts

Posted 17 May 2018 - 09:57 PM

Surn, on 16 May 2018 - 02:31 PM, said:

And as I stated earlier, Elo is basic and based on assumptions of a normal distribution. We have far more data available than wins and losses, things that can account for close games, like scores... of which there are none in chess.


-Pieces lost
-Pieces removed
-Pieces left compared to opponent
-Which pieces were removed
-Which pieces were lost
-Which pieces were lost to what pieces
-Which pieces were removed with what piece
-How many rounds were played

BTGbullseye, on 17 May 2018 - 08:38 PM, said:

And a pure win/loss rating for anything other than determining if you won or lost, is useless. It says very little about skill when taken out of context, like with what mech you're using. (a meta Anni takes a lot less skill to win with than a BlAsp)


It's not a rare occurrence to see long range assault mechs do a lot of damage but still lose the team the game, because these assault mechs would've been of better use sharing armor at the front than standing in the back accumulating high damage. Something they were capable of doing as they're usually some of the last remaining players and have more time to dish out damage.

And here's the question regarding the game:
Are you there to win?
Why not take the Meta Anni then?

Your ability to win comes down to making fewer mistakes than your opponents.

Edit: Even in the OP, it states that someone can gain rank because they "did good amount of damage".
"Hey, you don't need to win, you just need to pad your stats"
As with Halo: Reach's initial ranking system, that promotes selfish playstyles which do not benefit the team.
And the more variables you put into an algorithm, the more complicated it'll become to balance, and the easier to exploit.

Edited by Naqser, 19 May 2018 - 02:23 AM.


#30 Surn

    Member

  • The God of Death
  • 1,073 posts
  • Twitter: Link
  • Location: San Diego

Posted 19 May 2018 - 02:34 PM

I think we all understand your concern. That something other than winning or losing will matter.

That is exactly correct, we want to represent how well you played.

#31 Surn

    Member

  • The God of Death
  • 1,073 posts
  • Twitter: Link
  • Location: San Diego

Posted 19 May 2018 - 02:58 PM

I should add, that an RPI system could reduce the queues to 1.

The mech you use could determine which division your stats accumulate to, and if you take on a Div 1 in a Div 7 it would adjust your RPI calculation appropriately.

#32 Naqser

    Member

  • The Deadly
  • 60 posts

Posted 20 May 2018 - 01:00 AM

Surn, on 19 May 2018 - 02:34 PM, said:

I think we all understand your concern. That something other than winning or losing will matter.

That is exactly correct, we want to represent how well you played.


Winning and gaining a high skill rank shows that you're a good player.
A high rank under an algorithm using other stats shows you're good at padding those stats.

#33 BTGbullseye

    Member

  • The Solitary
  • 1,540 posts
  • Location: I'm still pissed about ATMs having a minimum range.

Posted 20 May 2018 - 03:09 AM

Not when the individual stats have rapidly diminishing returns. You need all the stats to have a significant improvement of your RPI.
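
One way to picture that claim is the hypothetical Python sketch below. It is not RPI and not anything taken from the league described earlier in the thread; the stat names, the saturating curve and the geometric mean are all assumptions chosen only to illustrate "diminishing returns per stat, and you need all of them".

```python
import math
from typing import Mapping


def composite(stats: Mapping[str, float], scales: Mapping[str, float]) -> float:
    """Hypothetical composite score between 0 and 1.

    Each stat is squashed with 1 - exp(-value / scale), so piling more
    onto one stat adds less and less. The geometric mean then drags the
    total down if any single stat is weak.
    """
    scores = [1.0 - math.exp(-stats[name] / scales[name]) for name in scales]
    return math.prod(scores) ** (1.0 / len(scores))


if __name__ == "__main__":
    # Assumed "typical good game" values for each stat.
    scales = {"damage": 400.0, "kills": 2.0, "assists": 4.0}
    balanced = {"damage": 400.0, "kills": 2.0, "assists": 4.0}
    padded = {"damage": 4000.0, "kills": 1.0, "assists": 1.0}
    print(composite(balanced, scales), composite(padded, scales))
```

Under these assumptions, the player with ten times the damage but weak kills and assists scores lower than the balanced player. Whether such a curve can be tuned so that it is not gameable in practice is exactly what the thread is arguing about.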

#34 Naqser

    Member

  • The Deadly
  • 60 posts

Posted 21 May 2018 - 05:45 AM

BTGbullseye, on 20 May 2018 - 03:09 AM, said:

Not when the individual stats have rapidly diminishing returns. You need all the stats to have a significant improvement of your RPI.


So stats should matter, and not matter.
W/L encompasses everything that you did in order to come out on top, or didn't.

#35 Popcat

    Member

  • The Shogun
  • 74 posts

Posted 21 May 2018 - 08:24 PM

Nietzsche is dead - God

That hit me just right and made me burst out laughing at work... thanks :P

#36 Throe

    Member

  • The Marauder
  • 1,027 posts

Posted 21 May 2018 - 11:28 PM

[deleted by user]

Edited by Throe, 09 November 2018 - 12:35 PM.


#37 RickySpanish

    Member

  • Veteran Founder
  • 3,510 posts
  • Location: Wubbing your comrades

Posted 26 May 2018 - 01:21 PM

There are so many variables to what dictates a "close" match that trying to account for them in the outcome of a battle would lead to all sorts of fringe cases and manipulation. Elo works perfectly fine in a 1v1 scenario where the only thing that matters is whether or not you win. The division system, while flawed, was designed to mitigate inherent imbalance between 'Mechs. It's actually a good idea, in that the developers can deter "stomps" without affecting the matchmaker / scoring equation, and it eliminates the need for the sort of arbitrary weighting for battles / 'Mechs that the RPI system implies. The biggest problems with S7 are the lack of players, poor queue-to-game time ratio, and janky divisions. What the developers need isn't a worse, more complex and exploitable ranking system; they need to run S7 events and seriously improve rewards to draw players into the game.

#38 BTGbullseye

    Member

  • The Solitary
  • 1,540 posts
  • Location: I'm still pissed about ATMs having a minimum range.

Posted 27 May 2018 - 03:46 AM

Except that RPI is inherently LESS exploitable, since it doesn't allow outliers to overrun the total score.

#39 Brauer

    Member

  • 1,066 posts

Posted 27 May 2018 - 10:43 AM

BTGbullseye, on 27 May 2018 - 03:46 AM, said:

Except that RPI is inherently LESS exploitable, since it doesn't allow outliers to overrun the total score.


How is RPI less exploitable for ranking people in 1v1s? From my perspective all that matters is if you win or lose. Doing anything beyond adjusting the rise/fall of Elo based on relative ranking muddies the water. The only exploitation I can think of for Elo would be staying in queues against far worse players, or doing something against TOS. Both would be fairly strongly limited by the low Elo rise a high-ranked player would get from beating low-ranked players.

Again, I am viewing choosing and building the right mech for the division as part of a player's overall skill, as well as getting the most out of that mech. In a perfect world giving players an Elo for each mech they own might make sense, though it could be invalidated by people changing builds. But even that would run into issues when balance updates are made and/or metas shift.
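
The point about farming weaker players can be put in numbers with a short Python sketch. It assumes the conventional chess constants (400-point scale, K = 32), which may not match any real matchmaker, and the helper names are hypothetical.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score with an assumed 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def win_gain_and_loss_cost(rating: float, opponent: float, k: float = 32.0):
    """Points gained for a win and points lost for a loss, as a pair."""
    e = expected_score(rating, opponent)
    return k * (1.0 - e), k * e


if __name__ == "__main__":
    # A favourite sitting `gap` points above a 1500-rated opponent.
    for gap in (0, 200, 400, 800):
        gain, cost = win_gain_and_loss_cost(1500 + gap, 1500)
        print(f"gap {gap}: win +{gain:.2f}, loss -{cost:.2f}")
```

With these assumed values, a player 400 points above the opponent gains about 3 points per win and drops about 29 on a loss, so they would need to win roughly ten games out of eleven just to hold their rating.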

#40 Naqser

    Member

  • The Deadly
  • 60 posts

Posted 27 May 2018 - 08:12 PM

BTGbullseye, on 27 May 2018 - 03:46 AM, said:

Except that RPI is inherently LESS exploitable, since it doesn't allow outliers to overrun the total score.


Only "exploitable" if you hold firm that choosing and equipping mechs isn't part of your skill.
Doing more damage than your opponent yet still losing easily means you spread your damage and couldn't torso twist well enough.




