Can PGI Finally Scrap The Matchmaker?

#21 Fabe

Member

FP Veteran - Beta 1
1,041 posts

Posted 16 August 2013 - 05:33 AM

Another problem with Elo is that, if I recall correctly, it only looks at whether or not you were on the winning team, which isn't a very good indication of individual skill. Put a poor player like me on a team good enough to carry me and, even if I don't actually improve, my Elo score will still go up just for being on the winning team.

#22 DeadlyNerd

Member

Ace Of Spades
1,452 posts

Posted 16 August 2013 - 05:40 AM

Fabe, on 16 August 2013 - 05:33 AM, said:

Liked for honesty.

#23 Hauser

Member

Legendary Founder
976 posts

Posted 16 August 2013 - 05:42 AM

DeadlyNerd, on 16 August 2013 - 03:53 AM, said:

Look, read up on Elo rating. It's made for chess, where it's 1v1, strict playing rules, unchanged playing environment and the player solely relies on himself to win.

It actually works amazingly well for teams of random players. When looking at all possible matchups, good players are in more winning matches than bad players. You can program your own simulation for it if you're into that, but here is a short example:

Imagine four players, A, B, C and D, playing a game of tug of war. The players have strengths of 1, 2, 2 and 3 respectively.

The possible match-ups are:

AB | CD = 3 | 5
AC | BD = 3 | 5
AD | BC = 4 | 4

When both sides are equally strong, each has a 50% chance of winning.

Player A wins 16% of his games.
Player B wins 50% of his games.
Player C wins 50% of his games.
Player D wins 83% of his games.
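If you want to check those percentages yourself, here is a minimal sketch in Python. It assumes exactly the rules of the example: team strength is the sum of its members, the stronger side always wins, and an even match is a coin flip. The function names are made up for illustration.

```python
from fractions import Fraction

# Strengths from the tug-of-war example above.
STRENGTH = {"A": 1, "B": 2, "C": 2, "D": 3}


def matchups(players):
    """Yield each distinct 2v2 split of four players exactly once."""
    first = players[0]
    for partner in players[1:]:
        left = (first, partner)
        right = tuple(p for p in players if p not in left)
        yield left, right


def win_rates(strength):
    """Average win chance per player over all matchups; even matches count as 50%."""
    players = sorted(strength)
    totals = {p: Fraction(0) for p in players}
    games = 0
    for left, right in matchups(players):
        games += 1
        s_left = sum(strength[p] for p in left)
        s_right = sum(strength[p] for p in right)
        p_left = Fraction(1, 2) if s_left == s_right else Fraction(int(s_left > s_right))
        for p in left:
            totals[p] += p_left
        for p in right:
            totals[p] += 1 - p_left
    return {p: totals[p] / games for p in players}


rates = win_rates(STRENGTH)
for player, rate in rates.items():
    print(player, f"{float(rate):.1%}")
```

This gives 1/6 for A, 1/2 for B and C, and 5/6 for D, which are the rounded percentages listed above.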

Now if you apply Elo twice to the matches, in the order they are listed above, starting all players at 1300 Elo:

A(1275) B(1275) | C(1325) D(1325)
A(1250) C(1300) | B(1300) D(1350)

If won by left side: A(1275) D(1375) | B(1275) C(1275)

    A(1275) B(1275) | C(1275) D(1375) no change because right team is expected to win and wins.
    A(1275) C(1275) | B(1275) D(1375) no change because right team is expected to win and wins.
    If won by left side: A(1275) D(1375) | B(1275) C(1275) no change because left team is expected to win and wins.
    If won by right side: A(1254) D(1354) | B(1308) C(1308)

If won by right side: A(1225) D(1325) | B(1325) C(1325)

    A(1225) B(1325) | C(1325) D(1325) no change because right team is expected to win and wins.
    A(1225) C(1325) | B(1325) D(1325) no change because right team is expected to win and wins.
    If won by left side: A(1253) D(1353) | C(1297) B(1297)
    If won by right side: A(1225) D(1325) | C(1325) B(1325) no change because right team is expected to win and wins.

The possible outcomes are now

A(1275) B(1275) C(1275) D(1375)
A(1254) B(1308) C(1308) D(1354)
A(1253) B(1297) C(1297) D(1353)
A(1225) B(1325) C(1325) D(1325)

The second and third results have the players nicely sorted out. The first and the last one don't distinguish everyone yet because, based on the evidence (the matches), there is no reason to assume B and C are better than A (first result) or worse than D (last result). Each further round through the system will reduce the chance of this happening by 50%.

Now this is a simple system, with a simple mechanism to compute who wins, but that is beside the point. It shows that Elo works fine when comparing teams.
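For anyone who does want to program the simulation, a rough sketch in Python follows. Everything beyond the example itself is an assumption: a K-factor of 50 (chosen so that an even match moves each player 25 points, as in the first two lines of the tree), the standard Elo expectation formula applied to average team ratings, and a seeded coin flip for the even AD | BC match. Unlike the hand-worked tree, textbook Elo still moves ratings slightly when the favorite wins, so later numbers will not match the tree exactly.

```python
import random

K = 50           # assumed K-factor: an even match then moves each player 25 points
START = 1300     # starting rating used in the example
STRENGTH = {"A": 1, "B": 2, "C": 2, "D": 3}
MATCHES = [("AB", "CD"), ("AC", "BD"), ("AD", "BC")]


def expected_left(ratings, left, right):
    """Standard Elo expectation, applied to average team ratings (an assumption)."""
    avg_left = sum(ratings[p] for p in left) / len(left)
    avg_right = sum(ratings[p] for p in right) / len(right)
    return 1 / (1 + 10 ** ((avg_right - avg_left) / 400))


def left_wins(left, right, rng):
    """Stronger side always wins; equal sides flip a coin."""
    s_left = sum(STRENGTH[p] for p in left)
    s_right = sum(STRENGTH[p] for p in right)
    if s_left == s_right:
        return rng.random() < 0.5
    return s_left > s_right


def simulate(rounds, seed=0):
    """Play every matchup once per round and return the final ratings."""
    rng = random.Random(seed)
    ratings = {p: float(START) for p in STRENGTH}
    for _ in range(rounds):
        for left, right in MATCHES:
            score = 1.0 if left_wins(left, right, rng) else 0.0
            delta = K * (score - expected_left(ratings, left, right))
            for p in left:
                ratings[p] += delta
            for p in right:
                ratings[p] -= delta
    return ratings


final = simulate(rounds=20)
for player, rating in sorted(final.items()):
    print(player, round(rating))
```

Because A and D are on opposite sides of two lopsided matches and on the same side of the third, the gap between D and A grows every round regardless of the coin flips. Where B and C end up depends on the seed and the number of rounds.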

----

What you are probably experiencing are problems with the matchmaker being unable to find enough players near your skill level. This definitely needs tweaking, but it's not Elo.

Edited by Hauser, 16 August 2013 - 05:43 AM.

#24 DeadlyNerd

Member

Ace Of Spades
1,452 posts

Posted 16 August 2013 - 06:05 AM

Hauser, on 16 August 2013 - 05:42 AM, said:

-snip-

First, I said matchmaker needs scrapping, not Elo. It's a proven 1v1 player rating system...

Second, you can calculate all you want, but Elo is not made for pairing up groups of individual players because of one basic principle: a 50% chance doesn't necessarily mean that every second try will be a hit. Over a longer period of time the hit rate might seem to be 1 out of 2, but over short intervals it's not.
Combine that with the fact that the current matchmaker takes different combinations of these ratings every time, you'll very rarely get a balanced matchup.
No calculation can prove current matchmaker works fine because no calculation is, or can be practically, done on a large scale.
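The point about short intervals can be put in numbers with the binomial distribution. This is only an illustrative sketch in Python: it assumes every game is an independent 50/50 coin flip, which a real match is not, and the function name is made up.

```python
from math import comb


def prob_exact_wins(games, wins, p=0.5):
    """Binomial probability of exactly `wins` wins in `games` independent games."""
    return comb(games, wins) * p**wins * (1 - p) ** (games - wins)


# Chance of exactly 5 wins in 10 even games.
exactly_half = prob_exact_wins(10, 5)
# Chance of a lopsided run: 2 wins or fewer, or 8 wins or more.
lopsided = sum(prob_exact_wins(10, w) for w in (0, 1, 2, 8, 9, 10))

print(f"{exactly_half:.3f}")  # 0.246
print(f"{lopsided:.3f}")      # 0.109
```

So even with perfectly even odds, only about one ten-game stretch in four ends 5-5, and roughly one in nine ends 8-2 or worse in either direction.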

This was the theoretical part.

Practical part generally disqualifies any theoretical calculations right off the bat. Aside from that, as Fabe stated, what if bad players are carried by good teams too often and then end up in high tiers.
Sure, they'll drop in rating with time, but so will those that lose alongside them, and the whole process starts over.
Using Elo rating for multiple player matchups is a bad concept and, if not tweaked to the point where it doesn't resemble Elo at all, will keep causing problems.

Edited by DeadlyNerd, 16 August 2013 - 06:12 AM.

#25 Hauser

Member

Legendary Founder
976 posts

Posted 16 August 2013 - 07:16 AM

DeadlyNerd, on 16 August 2013 - 06:05 AM, said:

First, I said matchmaker needs scrapping, not Elo. It's a proven 1v1 player rating system...

Actually, your first post says that Elo cannot be applied to anything but 1v1. This is flatly wrong. As I just showed, it works just fine for teams.

DeadlyNerd, on 16 August 2013 - 06:05 AM, said:

Second, you can calculate all you want, but Elo is not made for pairing up groups of individual players because of one basic principle: a 50% chance doesn't necessarily mean that every second try will be a hit. Over a longer period of time the hit rate might seem to be 1 out of 2, but over short intervals it's not.

Combine that with the fact that the current matchmaker takes different combinations of these ratings every time, you'll very rarely get a balanced matchup.
No calculation can prove current matchmaker works fine because no calculation is, or can be practically, done on a large scale.

This was the theoretical part.

I wrote down the whole possibility tree for those 6 matches. I have described every possible outcome in this system for the first 6 matches. There are no other possible outcomes.

If you repeat this a few more times you'll see that on the next branch you'll have 8 outcomes, 6 of which provide the correct ordering. On the next iteration you'll have 16 outcomes, 14 of which provide the correct ordering. On those two branches that don't provide the correct ordering there is no empirically observable difference between either A, B and C or B, C and D. They deserve the same rating.

It may not provide the correct result in the short run, but that is to be expected of any measuring system. Without a certain number of comparisons you cannot actually rank anybody properly. To be more precise, to rank n players you need on the order of n log n comparisons.
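One way to read the n log n figure is as the usual lower bound for sorting by pairwise comparisons: there are n! possible orderings, and each win/lose result can at best halve the candidates, so you need at least log2(n!) results. The Python sketch below just evaluates that bound. It assumes every comparison is perfectly reliable, so noisy team matches would need more, and the function name is made up.

```python
from math import ceil, factorial, log2


def min_comparisons(n):
    """Information-theoretic lower bound: ceil(log2(n!)) yes/no comparisons."""
    return ceil(log2(factorial(n)))


for n in (4, 24, 100):
    print(n, min_comparisons(n))
```

For the four players in the example the bound is 5 comparisons, which is why a single pass through three matches cannot separate everyone.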

I'm not talking about balance here, and I haven't really gone into how the matchups are assembled. Most matches I showed are decidedly unbalanced, but for the argument you make that doesn't matter. If you want to talk about how the matchmaker is creating unbalanced matches you are welcome to, but you should be clear that you are talking about unbalanced matches. Right now it seems you are complaining about Elo.

DeadlyNerd, on 16 August 2013 - 06:05 AM, said:

Practical part generally disqualifies any theoretical calculations right off the bat. Aside from that, as Fabe stated, what if bad players are carried by good teams too often and then end up in high tiers.
Sure, they'll drop in rating with time, but so will those that lose alongside them, and the whole process starts over.
Using Elo rating for multiple player matchups is a bad concept and, if not tweaked to the point where it doesn't resemble Elo at all, will keep causing problems.

Yes. A bad player can be carried by a good team. But let me break that down into two situations.

If the bad player is playing with 3 friends who are good enough to carry him, then Elo is no longer relevant as a personal rating. It simply reflects the rating of the group of 4 players.

If a bad player by sheer luck always ends up on the winning team, he can indeed end up with an inflated Elo. Yet the chance that this happens, that a single player always ends up on the winning team without actually contributing to the win, is very small. You can see this in the example I gave: the worst player is A, who ends up winning only 16% of his games.

The best part of Elo is that if a player with an inflated Elo isn't able to win a game he is expected to win, the system will take it as strong evidence that he has been overrated and remove a significant amount of his score, quickly setting him back to his proper rating. So even if you have a lucky streak, you'll be put in your place the first moment you don't live up to it.
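To put a number on that correction, here is a small Python sketch using the textbook Elo update. The K-factor of 50 and the ratings 1500 and 1300 are made-up values for illustration, not anything taken from MWO.

```python
def expected_score(rating, opponent):
    """Standard Elo expectation of winning against `opponent`."""
    return 1 / (1 + 10 ** ((opponent - rating) / 400))


def updated(rating, opponent, won, k=50):
    """New rating after one game; k=50 is an assumed K-factor."""
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent))


inflated, field = 1500, 1300  # hypothetical overrated player and his opposition

# Points lost when the overrated player loses a game he was favored to win.
loss_as_favorite = inflated - updated(inflated, field, won=False)
# Points lost by a correctly rated player losing an even game.
loss_as_even = field - updated(field, field, won=False)

print(round(loss_as_favorite, 1), round(loss_as_even, 1))
```

With these numbers the overrated player drops about 38 points for the upset loss, against 25 for losing an even game, so the further the rating has drifted above reality the harder each failure hits.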

#26 IceSerpent

Member

3,044 posts

Posted 16 August 2013 - 07:17 AM

Hauser, on 16 August 2013 - 05:42 AM, said:

Imagine four players, A, B, C and D, playing a game of tug of war. The players have strengths of 1, 2, 2 and 3 respectively.

The possible match-ups are:

AB | CD = 3 | 5
AC | BD = 3 | 5
AD | BC = 4 | 4

When both sides are equally strong, each has a 50% chance of winning.

Hauser, this only works for things like tug of war where players' "abilities" (physical strength in this case) are added together. In MWO the AD team will almost always lose to BC team because player A is completely useless and player D is unable to take out the whole team of average players B and C singlehandedly.
This is pretty much the root of the current problem with Elo matching - MM should never, ever put players with vastly different ratings on the same team.

Edited by IceSerpent, 16 August 2013 - 07:18 AM.

#27 Hauser

Member

Legendary Founder
976 posts

Posted 16 August 2013 - 10:27 AM

IceSerpent, on 16 August 2013 - 07:17 AM, said:

The actual mechanism is irrelevant. It's only there so I can run a simulation with consistent results. It's also not an example of good match making, it is terrible. It's just there to show that even with the influence of other players you can run a rating system for individual players. As long as your individual performance has some influence on the outcome you can use Elo. It just takes a while to converge.

IceSerpent, on 16 August 2013 - 07:17 AM, said:

This is pretty much the root of the current problem with Elo matching - MM should never, ever put players with vastly different ratings on the same team.

The matchmaker is putting people with a significant difference in Elo on the same team, but that is not what it sets out to do. It tries to find 24 players near a given value. As time goes by and those players aren't available, it will start accepting players that are further and further from the target value.

Since this isn't working it appears that there are not enough players around a certain Elo online at the same time. This can either be a complete lack of the right players or the players can be sucked into other less ideal matches just to fill them out.
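The widening search described above can be sketched in a few lines of Python. To be clear, this is not PGI's code: the function name, the window sizes and the queue contents are all invented, and each pass through the loop stands in for "more time has gone by".

```python
def pick_players(queue, target, needed, start_window=50, step=50, max_window=400):
    """Pick the `needed` players closest to `target`, widening the allowed gap.

    queue: list of (name, elo) tuples. Returns (chosen, window_used), or
    (None, max_window) if even the widest window cannot fill the match.
    All parameters are hypothetical tuning values.
    """
    window = start_window
    while window <= max_window:
        in_range = [p for p in queue if abs(p[1] - target) <= window]
        if len(in_range) >= needed:
            # Prefer the players closest to the target value.
            in_range.sort(key=lambda p: abs(p[1] - target))
            return in_range[:needed], window
        window += step  # nobody suitable yet: accept players further away
    return None, max_window


queue = [("p1", 1290), ("p2", 1310), ("p3", 1340), ("p4", 1420), ("p5", 1600)]
chosen, window = pick_players(queue, target=1300, needed=4)
print(window, [name for name, _ in chosen])
```

With this thin queue the search has to widen to 150 points before it can fill four slots, so the 1420 player lands in a match aimed at 1300. That is the kind of spread you get when too few players near the target are online.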

http://mwomercs.com/...-making-update/

Quote

How does the match maker compose a team's Elo rating, is it average rating or closest to a target?

It's closest to a target value, so the match maker starts trying to make a match for an Elo of say 1300 and will pull in players to those teams closest to those values; however, as mentioned earlier within growing thresholds and those curves will be tuned. Currently it may be a bit 'sloppy' about how it's filling those buckets but over time it will be tuned to be much more precise.

We need to do this carefully over time as generally the cost of precision is time to find a match we want to slowly find a very nice balance between time to find a match and the number of matches that are correctly composed.