
Concerning PGI's Elo Matchmaking Approach


31 replies to this topic

#1 Lyteros

    Member

  • PipPipPipPipPipPip
  • 456 posts
  • LocationGermany

Posted 31 December 2012 - 07:24 AM

This post is based on this Dev statement:
http://mwomercs.com/...ost__p__1626065


The Problem I see:
From what I read here, the new matchmaking will entirely be based on win / loss statistics.

And those statistics will be created by the WHOLE team, not taking the personal achievements of each pilot into account. So your personal influence on what happens with your ELO is 1/16 (since you're one single pilot of the team and the whole other team wants your head).

Since your teams (when PUG) are randomly drawn, the team based ELO rating is just as much of a gamble as the current system.

Spoiler


I think the system will not solve the mess we have with the matchmaking. The proposed ELO misses a lot of available and easy-to-use information, which could vastly increase the effectiveness of the ELO calculation and the matchmaking.

Some simple examples:
Spoiler




Proposed Changes / additions:
Spoiler





Please keep in mind: This is about improvement and better game experience for everyone, not whining / rabble or personal advantages. Discuss.

Edited by Lyteros, 25 February 2013 - 09:24 AM.


#2 deadeye mcduck

    Member

  • PipPipPipPipPipPipPip
  • 735 posts
  • LocationOutside the periphery

Posted 31 December 2012 - 07:26 AM

Let's wait till phase 3 arrives and we have had a few weeks of it before picking it apart.

#3 QuantumButler

    Member

  • PipPipPipPipPipPipPipPipPip
  • Ace Of Spades
  • 4,534 posts
  • LocationTaiwan, One True China

Posted 31 December 2012 - 07:35 AM

View Postdeadeye mcduck, on 31 December 2012 - 07:26 AM, said:

Let's wait till phase 3 arrives and we have had a few weeks of it before picking it apart.


Yes, let's not point out obvious flaws until a system is in game, this sort of thing worked so well for Diablo 3 and TOR.

OH WAIT.

Edited by QuantumButler, 31 December 2012 - 07:35 AM.


#4 Lyteros

    Member

  • PipPipPipPipPipPip
  • 456 posts
  • LocationGermany

Posted 31 December 2012 - 08:08 AM

View Postdeadeye mcduck, on 31 December 2012 - 07:26 AM, said:

Let's wait till phase 3 arrives and we have had a few weeks of it before picking it apart.


You mean it is okay to try to implement an ELO calculation for single pilots by putting up a system which calculates based on the TEAM - the team that changes each drop (even for most premades!) and furthermore ignores 90% of the available information?

I don't need to see the truck hit the wall to suggest turning or braking.

#5 Clay Pigeon

    Member

  • PipPipPipPipPipPipPipPip
  • Mercenary Rank 3
  • 1,121 posts

Posted 31 December 2012 - 08:11 AM

Haha! Time to smurf!

#6 Red squirrel

    Member

  • PipPipPipPipPipPipPipPip
  • Legendary Founder
  • 1,626 posts

Posted 31 December 2012 - 08:17 AM

Let me start with the positive arguments:

1) This system will work great for fixed teams of 8
2) It should help get newbs and noobs to drop with their kind and thus will probably improve the experience for new players.
3) Also I hope to see fewer of those PUGs:

View PostDeadlyNerd, on 30 December 2012 - 03:01 PM, said:

Posted Image

Screenshot speaks for itself.




Besides this I fully agree with the OP.
It is quite difficult to evaluate mech pilots.
Did this guy just get killed randomly, or did he prevent the enemy cap and secure the win?
This kind of Elo will be especially problematic for players that both PUG and play premade, since the two game modes come with completely different W/D probabilities.

Edited by Red squirrel, 31 December 2012 - 08:20 AM.


#7 JPsi

    Member

  • PipPipPipPipPip
  • 177 posts

Posted 31 December 2012 - 08:17 AM

Ok, this is a known thing for Elo.

Yes, win/loss is based on team. However, what's the one constant factor in every game you play? YOU.
You'll get good teams and bad ones and all the in-betweens; however, scaled over a long enough number of games, your Elo based on wins/losses will very quickly approach a very good indicator of your skill level.

The problem you see appears to me to stem from a lack of understanding of how Elo works in the long run. It is at its heart a statistical analysis. The more games you play, the better it will represent your skill as a pilot. This involves all the things that contribute towards a win: the many teamwork based choices you make, not just simple stats that can be abused like kills/damage.

It has been applied to team based games many times now, with success for the most part. The fact that you may lose a game here or there due to your teammates will influence it minimally. The inverse is true also, there will be times when you win due to your teammates. Applied well, these factors should cancel each other out on average :)

There are other issues with an Elo system, however the argument above isn't really one of those issues.
http://en.wikipedia....o_rating_system if you'd like to read further.
http://leagueoflegen...o_rating_system The Elo system used by LoL in 5v5 team matches that has been very successful. Also some find the explanations a little better in the 2nd link.
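
For anyone who doesn't want to click through, here is a minimal sketch of the classic 1v1 Elo formula from the Wikipedia article. The K factor of 32 and the function names are assumptions for illustration only; nothing here is confirmed as what PGI will use.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """Shift the rating toward the result; actual is 1 (win), 0.5 (draw) or 0 (loss).

    k = 32 is an assumed value, not one taken from the dev post.
    """
    return rating + k * (actual - expected)


# Two equally rated players: each is expected to score 0.5.
e = expected_score(1500, 1500)
print(e)                             # 0.5
print(updated_rating(1500, e, 1.0))  # 1516.0 after a win
print(updated_rating(1500, e, 0.0))  # 1484.0 after a loss
```

The point is only that the rating moves by the difference between what happened and what was expected, so an upset moves it more than a predictable result.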

Edited by JPsi, 31 December 2012 - 08:23 AM.


#8 Lyteros

    Member

  • PipPipPipPipPipPip
  • 456 posts
  • LocationGermany

Posted 31 December 2012 - 08:26 AM

View PostJPsi, on 31 December 2012 - 08:17 AM, said:

Ok, this is a known thing for Elo.

Yes, win/loss is based on team. However, what's the one constant factor in every game you play? YOU.
You'll get good teams and bad ones and all the in-betweens; however, scaled over a long enough number of games, your Elo based on wins/losses will very quickly approach a very good indicator of your skill level.

The problem you see appears to me to stem from a lack of understanding of how Elo works in the long run. It is at its heart a statistical analysis. The more games you play, the better it will represent your skill as a pilot. This involves all the things that contribute towards a win: the many teamwork based choices you make, not just simple stats that can be abused like kills/damage.

It has been applied to team based games many times now, with success for the most part. The fact that you may lose a game here or there due to your teammates will influence it minimally. The inverse is true also, there will be times when you win due to your teammates. Applied well, these factors should cancel each other out on average :)

http://en.wikipedia....o_rating_system if you'd like to read further.
http://leagueoflegen...o_rating_system The Elo system used by LoL in 5v5 team matches that has been very successful. Also some find the explanations a little better in the 2nd link.


Boil it down to this: Applied well. This is what I'm concerned about, because it is not applied well.

Read your own links: the example (chess) is a 1 on 1 game, so you influence the whole game by yourself and it is all in your hands.

The system here is setting up your rating based on the teams you play with; for PUGs it's entirely different each time. So you get balanced on something you can influence with 1/16 of the total. This is not an improvement on the current system, it's the same gamble on who you drop with. Also the system ignores the mechs and setups completely, among other useful information.

In other words: This system continues its win/loss roulette.

Edited by Lyteros, 31 December 2012 - 08:27 AM.


#9 Jonathan Paine

    Member

  • PipPipPipPipPipPipPipPip
  • Survivor
  • 1,197 posts

Posted 31 December 2012 - 08:27 AM

I completely agree with JPsi, and would just like to add my view on ELO and teams. Yes, a mediocre player like myself would get an inflated ELO while running with good team mates - however, the combined ELO rating of the team would make perfect sense. The totality of ELO for the team will be a good predictor of the strength of the team and how likely their side is to win. When the mediocre player decides to play outside the team, and face off with players who got their ELO rating playing in PUGS, the mediocre player will be more likely to lose, thus losing ELO.
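
A rough sketch of that idea, assuming a team's strength is simply the average of its members' ratings (the dev post does not say how ratings are combined, so the averaging and the numbers below are hypothetical):

```python
from statistics import mean


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of side A against side B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def team_win_probability(team_a, team_b):
    """Assumption: team strength is the plain average of the members' ratings."""
    return expected_score(mean(team_a), mean(team_b))


# Hypothetical numbers: seven strong pilots plus one mediocre one,
# facing eight evenly rated PUG players.
premade = [1700] * 7 + [1300]
pug = [1600] * 8
print(round(team_win_probability(premade, pug), 3))
```

The mediocre pilot pulls the premade's average down from 1700 to 1650, so the prediction already accounts for him as long as his own rating is roughly right.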

#10 Lyteros

    Member

  • PipPipPipPipPipPip
  • 456 posts
  • LocationGermany

Posted 31 December 2012 - 08:33 AM

View PostJonathan Paine, on 31 December 2012 - 08:27 AM, said:

I completely agree with JPsi, and would just like to add my view on ELO and teams. Yes, a mediocre player like myself would get an inflated ELO while running with good team mates - however, the combined ELO rating of the team would make perfect sense. The totality of ELO for the team will be a good predictor of the strength of the team and how likely their side is to win. When the mediocre player decides to play outside the team, and face off with players who got their ELO rating playing in PUGS, the mediocre player will be more likely to lose, thus losing ELO.


Have you thought about the other players on your team, who get losses because your ELO was wrong, which in turn will modify the ELO of everyone on your team? So your wrong ELO negatively impacts and falsifies the ELO of everyone on your team (and actually even the opposing team).

This will be all over the place, breaking pretty much every ELO.

#11 JPsi

    Member

  • PipPipPipPipPip
  • 177 posts

Posted 31 December 2012 - 08:34 AM

I also gave you an example using LoL. A 5v5 team based game.

Also, if you'd read further down the link, it goes on to show that it's currently used in football, basketball and many other sports worldwide.

Also, it's not such a gamble who you drop with. An Elo system will bring in players of approximately the same Elo as you, i.e. you should be playing with players who are approximately as skilled as you are.

You keep pulling the 1/16 thing out. Strangely enough, 1/16th of an influence is more than enough to gather an assessment when you play more than 200 games.
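
To put a rough number on "more than 200 games", here is a back-of-the-envelope sketch. It assumes every game is an independent coin flip with a fixed win probability, which is a simplification (a real matchmaker shifts your odds as your rating moves), and the function name is made up for illustration.

```python
import math


def win_rate_standard_error(true_win_rate: float, games: int) -> float:
    """Standard error of an observed win rate after `games` independent games."""
    return math.sqrt(true_win_rate * (1.0 - true_win_rate) / games)


for n in (20, 200, 2000):
    se = win_rate_standard_error(0.5, n)
    # Two standard errors is the usual rule-of-thumb band around the true value.
    print(f"{n:5d} games: observed win rate typically within +/-{2 * se:.3f} of the true one")
```

Under those assumptions the band shrinks with the square root of the number of games, so ten times as many games only narrows it by a factor of about three.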

About the only analogy this has to roulette, is people applying failed logic in probabilities and statistics.

Edit: just saw the notes on Elo stacking in teams. That has been brought up in other games already. It's easily detectable and dealt with. For example: it's known that one massively worse player in these games will generally lead to a loss. Countering it is simple. If the team has a much lower ranked player, the lower ranked player will not gain from a win and the rest of the team will get a zero change to their Elo upon a loss. A simple measure such as checking if one player's Elo is more than 100 points lower than the rest of the team will account for it. Generally the exact details of the mechanics for checking/dealing with Elo falsification are hidden, just to make potential loopholes in the checking mechanism harder to find.
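
A minimal sketch of that kind of check, just to show how little it takes. The function name, the 100 point gap and the choice to let the low-rated player still lose points on a loss are all assumptions for illustration, not how any particular game actually does it.

```python
def adjust_for_stacking(team, delta, gap=100):
    """Return the rating change each player on one team receives.

    team  -- list of current ratings
    delta -- change everyone would normally get (+ for a win, - for a loss)
    gap   -- assumed threshold: a player more than `gap` points below
             every teammate is treated as an outlier
    """
    outliers = []
    for i, rating in enumerate(team):
        others = team[:i] + team[i + 1:]
        outliers.append(bool(others) and rating < min(others) - gap)

    changes = []
    for is_outlier in outliers:
        if is_outlier and delta > 0:
            changes.append(0)      # carried player gains nothing from the win
        elif any(outliers) and not is_outlier and delta < 0:
            changes.append(0)      # teammates are not punished for the loss
        else:
            changes.append(delta)  # assumption: everything else is a normal update
    return changes


team = [1700, 1680, 1720, 1400]
print(adjust_for_stacking(team, +12))  # [12, 12, 12, 0]
print(adjust_for_stacking(team, -12))  # [0, 0, 0, -12]
```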

Edited by JPsi, 31 December 2012 - 08:44 AM.


#12 Lyteros

    Member

  • PipPipPipPipPipPip
  • 456 posts
  • LocationGermany

Posted 31 December 2012 - 08:42 AM

Failed logic?
You are implying things that support your point but do not exist / are not mentioned. You apply selection criteria that were not stated in the dev post, thus making wrong assumptions.

The Super Bowl example also does not fit, because there you have a set team that works under a constant (at least quite constant) leadership with (at least seasonally) constant members, who in turn stay in the same positions. The team is quite constant, so it is possible and reasonable to balance teamwise.

The teams here are randomly drawn, mechs / equipment change constantly, the ELO per random draw is summed up and then your personal ELO is modified according to the result of the game.
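
As I read the dev post, the update would look roughly like the sketch below. Averaging the ratings (instead of summing them), the K factor of 32 and the function names are my assumptions; the dev post gives no such details.

```python
from statistics import mean


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of side A against side B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def apply_team_result(winners, losers, k=32.0):
    """Give every pilot on a side the same change, based only on the match result."""
    expected_win = expected_score(mean(winners), mean(losers))
    delta = k * (1.0 - expected_win)  # assumed K factor, identical for all pilots
    return [r + delta for r in winners], [r - delta for r in losers]


new_winners, new_losers = apply_team_result([1500] * 8, [1500] * 8)
print(new_winners[0], new_losers[0])  # 1516.0 1484.0
```

Note that nothing in it looks at what an individual pilot did in the match: the pilot who carried and the one who disconnected get exactly the same change.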

#13 JPsi

    Member

  • PipPipPipPipPip
  • 177 posts

Posted 31 December 2012 - 08:48 AM

View PostLyteros, on 31 December 2012 - 08:42 AM, said:

Failed logic?
You are implying things that support your point but do not exist / are not mentioned. You apply selection criteria that were not stated in the dev post, thus making wrong assumptions.

The Super Bowl example also does not fit, because there you have a set team that works under a constant (at least quite constant) leadership with (at least seasonally) constant members, who in turn stay in the same positions. The team is quite constant, so it is possible and reasonable to balance teamwise.

The teams here are randomly drawn, mechs / equipment change constantly, the ELO per random draw is summed up and then your personal ELO is modified according to the result of the game.


Well then again, go back to LoL, the closest thing to this. 5v5, a very large number of single players and random teams. A large variety of different characters and it all still works.

The only thing I implied was that it would try to get players of approximately equal Elo into the same game. Is that really a stretch? It's kinda the whole point of having the Elo matchmaker system.

As for how they will deal with any kind of Elo falsification, I didn't say how they would do it, but merely tried to give example on how easy it is to deal with. How they do it is up to them.

Considering it's already matched on weight class, and that Elo will (if there's any point to it) match similar Elos, the random factor isn't as big as you make it out to be. There's definitely an argument for including a BV system, but whether it's absolutely necessary is unclear.

Edited by JPsi, 31 December 2012 - 08:52 AM.


#14 p4r4g0n

    Member

  • PipPipPipPipPipPipPipPip
  • Knight Errant
  • 1,511 posts
  • LocationMalaysia

Posted 31 December 2012 - 08:51 AM

@OP I generally agree with your analysis and suspect the same outcome. However, unless I missed something, Bryan did not specifically mention in his Matchmaking post that ELO is calculated based on kill / win stats only. In fact, it appears that he very carefully avoided mentioning how this particular score is to be derived.

I believe the upcoming feature "Combat Score" mentioned in the Feature Roadmap will likely be the basis for ELO.

@JPsi My individual performance is relatively meaningless when aggregated with 7 other people to create an AVERAGE ELO for matchmaking. I believe this is the main thrust of OP's analysis even leaving aside the other factors that are ignored. You can still drown in a pool that has an average depth of 2 inches if you fall in at the deep end and can't swim.

Anyhow, Phase 3 is not too far off if the Feature Roadmap is correct so let's just see which of our expectations pan out or don't as the case may be.

#15 Lyteros

    Member

  • PipPipPipPipPipPip
  • 456 posts
  • LocationGermany

Posted 31 December 2012 - 08:55 AM

View PostJPsi, on 31 December 2012 - 08:48 AM, said:


Well then again, go back to LoL, the closest thing to this. 5v5, a very large number of single players and random teams. A large variety of different characters and it all still works.

The only thing I implied was that it would try to get players of approximately equal Elo into the same game. Is that really a stretch? It's kinda the whole point of having the Elo matchmaker system.

As for how they will deal with any kind of Elo falsification, I didn't say how they would do it, but merely tried to give example on how easy it is to deal with. How they do it is up to them.


And the LoL ELO system features things the MWO ELO post does not even mention: decay rates over time and queues (now we've arrived at a night vs day difference). LoL also fights with a maximum of 5 players per side, which makes random deviations way less frequent than in MWO with almost double the number.

You keep talking about flawed logic but ignore facts, make assumptions and continue comparing apples with cucumbers.

Still, what is the problem with improving things further?


@p4r4g0n:

Posted Image

This is what makes me assume win/loss is the only factor, with the calculation following this picture in the dev post.

Edited by Lyteros, 31 December 2012 - 08:59 AM.


#16 Kousagi

    Member

  • PipPipPipPipPipPipPip
  • 676 posts

Posted 31 December 2012 - 08:55 AM

View PostQuantumButler, on 31 December 2012 - 07:35 AM, said:


Yes, let's not point out obvious flaws until a system is in game, this sort of thing worked so well for Diablo 3 and TOR.

OH WAIT.


The problem D3 had was Blizzard.... Blizzard has never put out a good game, much less even made their own IP. They just steal everything.

TOR's problem was twofold: EA, and they didn't listen to a single thing any beta tester said. They didn't even bother fixing the bugs that were found. Plus we all know, anything EA touches turns to crap. RIP Bioware.

#17 JPsi

    Member

  • PipPipPipPipPip
  • 177 posts

Posted 31 December 2012 - 08:58 AM

@p4r4g0n This is where I'd call it wrong. Your individual performance is not relatively meaningless even when aggregated into an average. There's a little more to it than that. A well balanced Elo system doesn't just build the teams so that the averages balance; it also, for example, refuses matchups where any player's Elo differs from the rest by more than a set amount.

To quote your analogy, a well made Elo system with say an average of 2 inches, would damn well make sure the deep end is no deeper than 3 inches.
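
In code the "no deep end" rule could be as small as the sketch below. The function name and the 50 point limit are hypothetical, and measuring the deviation from the match average is just one possible choice.

```python
from statistics import mean


def match_allowed(ratings, max_deviation):
    """True if no player is further than max_deviation from the match average."""
    average = mean(ratings)
    return all(abs(r - average) <= max_deviation for r in ratings)


print(match_allowed([1500, 1520, 1480, 1510], max_deviation=50))  # True
print(match_allowed([1500, 1520, 1480, 1900], max_deviation=50))  # False
```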

@Lyteros I have no problem with seeking improvement. However, your assertion from your first post was "I think the system will not solve the mess we have with the matchmaking.". That's what I've been arguing against. It may not solve it completely, but it will make headway.

As for comparisons, well, obviously every game has slight differences. That's sorta why I pointed to many. You've picked at how each one is individually different, but the point I've been trying to make is that it's managed to work for nearly all of them, some of which have many of the same or similar issues you've brought up. No, they aren't identical, but that's sorta why they are different games. There's nothing wrong with a comparison if it's made with good reason.

Yes, there are more random factors here, but I consider them balanced out by just how quickly games go. It's a lot easier to balance out random factors when you can play a large number of games. Again by comparison, the maximum time an MWO game goes for is 15 minutes, and it's rather rare that one takes that long.
You play 2-4 games in the time it would ordinarily take to complete one game of most other competitive games. The random factor is in part countered by this.

As has been pointed out, competitive games generally don't give exact details of how their Elo system works; the more detail given, the easier they are to abuse. In this regard it's about the only thing I can't fault the devs for being uncommunicative about.

Now back to the win/loss thing. I think I'll have to just quote the Wikipedia article: Performance can't be measured absolutely; it can only be inferred from wins, losses, and draws against other players.

Edited by JPsi, 31 December 2012 - 09:17 AM.


#18 p4r4g0n

    Member

  • PipPipPipPipPipPipPipPip
  • Knight Errant
  • 1,511 posts
  • LocationMalaysia

Posted 31 December 2012 - 09:21 AM

View PostJPsi, on 31 December 2012 - 08:58 AM, said:

To quote your analogy, a well made Elo system with say an average of 2 inches, would damn well make sure the deep end is no deeper than 3 inches.


Although I agree that a well made ELO system should, I'm wondering to what extent such a system would result in a lot of "failed to find match" results, considering the mix of solo, 2, 3 & 4 man groups, or even in 8v8.

@Lyteros Thanks for the clarification, I misinterpreted what you meant when you said win / loss statistics in your OP.

#19 Lyteros

    Member

  • PipPipPipPipPipPip
  • 456 posts
  • LocationGermany

Posted 31 December 2012 - 09:23 AM

@JPsi
Not gonna quote this time, dunno if you're done after your 10th edit or not.

So you're picking on this one sentence and going at it with assumptions and implied facts? Like the maximum deviation from the median for the match setup that you implied in your last post. You said it yourself: "Well made ELO system". This is what I'm all about: going by how it looks now, the information we have and the trend PGI has shown us, I think it is NOT well made, and I'm trying to point out flaws and possible improvements.

And partially comparing with context is fine, but you're ignoring context and the core issue I'm pointing at. You paint this ELO system nicely with lots of things which would actually help the issue, but are not in place and not even mentioned.

My main problem is that the basic idea of how ELO is calculated here is flawed in itself: player ELO is adjusted when players are put in random teams that change every game, with mechs they can change, with equipment they can change, and THEN they get ELO rated purely on the outcome of that game.

For once I agree, though, on "no specifics on the ELO". Yet there is no need for specifics; simply mentioning that at some point something happens (like your suggested maximum deviation from the median, or trying to set up queues and draw players with close ELO into the same game if possible) would make a lot of difference here.

#20 LarkinOmega

    Member

  • PipPipPipPipPip
  • 188 posts

Posted 31 December 2012 - 09:38 AM

View Postp4r4g0n, on 31 December 2012 - 08:51 AM, said:

You can still drown in a pool that has an average depth of 2 inches if you fall in at the deep end and can't swim.

This analogy is so funny that I cannot resist replying. 5% of the time, sure, but the other 95% of the time you'd be sitting on your rear in 1 inch of water!




