Grits N Gravy, on 13 November 2013 - 12:19 PM, said:
Elo works better for people who primarily run in 4 man groups. It works decently if you're a mediocre and below skilled solo dropper. It's painful if you're an above average skilled solo dropper.
If you run primarily 4 mans, you will get to a point where you can expect to run into mostly teams with other groups of 4 and the occasional pug. Elo works well here because it keeps groups of 4 out of too many PUG stomps. This is the primary reason why people reported better matchmaking after Elo implementation. Elo works decently if you're mediocre and below because you see far fewer groups of coordinated players. These two groups will tend to see more stable results and not be as streaky. Over a given period of time the standard deviation of their Elo score will tend to be smaller.
Skilled players dropping solo will tend to run more streaky. The range of their Elo scores over a period of time will have a larger standard deviation. The top range of their possible matches is the lower match range of the 4 man sets. Since this area of convergence has fewer of the 4 man groups in it, only one team ends up with a coordinated 4 man.
Thus the skilled solo dropper ends up losing a bunch more up here and yet ends up with the same Elo score. Losses to higher ranked opponents do not drive Elo scores down rapidly, even with a K factor of 50. It is possible that a skilled solo player ends up getting beaten 6 times, then winning 4, and arrives at the same Elo as where he started. And that's conservative; it's possible to maintain an Elo score while only winning 33% of your games. This is what I call Elo hell: getting stuck in a cycle where winning 1/3 keeps you in the same place, going 1/3 over and over.
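The 33% figure can be checked against the standard logistic Elo formula. Below is a minimal sketch, assuming the textbook 400-point scale and a single opponent rating standing in for the enemy team; the function names are made up for illustration and nothing here is taken from MWO's actual matchmaker.

```python
import math

def expected_score(rating, opponent):
    """Logistic Elo expected score (win probability) of `rating` vs `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def update(rating, opponent, won, k=50):
    """Rating after one game. k=50 is the K factor mentioned in the post."""
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent))

def break_even_gap(win_rate):
    """Rating gap (opponent minus player) at which `win_rate` keeps Elo steady."""
    return 400.0 * math.log10(1.0 / win_rate - 1.0)

# Gap at which winning one game in three leaves the rating unchanged on average.
gap = break_even_gap(1.0 / 3.0)
print(round(gap, 1))
```

Under these assumptions the gap comes out to roughly 120 points. At that gap a win is worth about +33 and a loss about -17 with K = 50, so one win roughly cancels two losses.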
Elo hell is a function of wide variance in matchmaking parameters, i.e. allowing opponents with vastly different Elo scores to play each other, and of large K factors. The issue can largely be mitigated in two ways: switching from the logistic formulation of Elo that MWO uses to the Gaussian formulation, and tightening the K factor. The result will be greater population density at scores within 1 standard deviation of the mean, which will allow you to tighten your matchmaking criteria without increasing wait times.
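For reference, the two formulations differ only in the curve that turns a rating gap into an expected score. The sketch below compares them, assuming the textbook 400-point logistic scale and a per-player performance spread of 200 points for the Gaussian version; both constants are assumptions, not values confirmed for MWO.

```python
import math

def logistic_expected(diff):
    """Expected score under logistic Elo for a rating advantage of `diff`."""
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))

def gaussian_expected(diff, sigma=200.0):
    """Expected score under Gaussian Elo.

    Assumes each player's performance is normal with standard deviation
    `sigma`, so the difference of two performances has sd sigma * sqrt(2).
    """
    z = diff / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for diff in (0, 200, 400, 600):
    print(diff, round(logistic_expected(diff), 3), round(gaussian_expected(diff), 3))
```

With these constants the curves nearly coincide for small gaps, and the Gaussian curve is more decisive for large ones. This only compares the prediction curves; it does not by itself demonstrate the population density claim.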
Okay. I am game with that. I would also strongly, STRONGLY recommend splitting premade and pug Elo. There's just no way you can mix the two without killing variance. It's not just about the 4man player who is pugging being thrown into a match he's not going to win - it's that he's effectively throwing off the prediction for his whole team's balance and skewing Elo for the whole match. You get 3 or 4 solid premade players who are pugging in a match and you could have a margin of error for the total value of their team that could be off by 200 points or more.
For the predictions Elo makes to be accurate, it needs to recognize the difference in performance between a player working in a premade team and the same player working as a pug. The absolute worst case scenario is that people who play in premades a lot will have to play more pug matches to get their pug Elo settled correctly. Their premade Elo, which is an aggregate, will only be as significant as their fraction of the team, i.e. 25% relevant for a 4man team. That buffers any disadvantage they might get from only generating premade Elo when dropping in a premade team.
This would be largely transparent to premade players, aside from making their pugging less onerous. It would also create less variance for the matchmaker for everyone, by preventing poor estimates of a player's value to his team that are based on premade success that doesn't translate into pug performance.
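To make the 200 point margin of error concrete, here is a rough sketch of split ratings. Everything in it is hypothetical: the `Player` fields, the ratings, and the simplification that a single combined rating would sit at the premade value for someone who mostly drops in groups.

```python
from dataclasses import dataclass

@dataclass
class Player:
    pug_elo: float       # hypothetical rating earned in solo drops
    premade_elo: float   # hypothetical rating earned in group drops
    in_group: bool       # True if this player dropped as part of a premade

def rating_for_match(player, split=True):
    """Rating the matchmaker would use for this drop."""
    if split:
        return player.premade_elo if player.in_group else player.pug_elo
    # Assumption: with one combined rating, a premade regular's number
    # mostly reflects premade results.
    return player.premade_elo

def team_total(players, split=True):
    return sum(rating_for_match(p, split) for p in players)

# Four premade regulars dropping solo, each 50 points weaker as a pug.
solo_regulars = [Player(pug_elo=1650, premade_elo=1700, in_group=False)
                 for _ in range(4)]
error = team_total(solo_regulars, split=False) - team_total(solo_regulars)
print(error)
```

With these made-up numbers the combined rating overstates the team's total by 200 points, and split ratings remove that error.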
I would also disagree that it's a problem for skilled pugs. I find that in the weights where I've got a 1.3 or better win/loss while pugging I have pretty good games - most of my teammates are premades and truly 'challenged' people are pretty rare. You just follow whoever Alpha lance is, they probably have a plan and do your best to contribute.
Where it becomes hellish is when you get into the lower end of the 'mostly premade' spectrum - because those players pug too, and when they do they're dropped as a pug in a match full of very predatory players, and many premade teams treat pugs like meat shields. It's a bitter irony, but at the end of the day it's not good for the matchmaker.
Edited to add -
I also absolutely don't get how this wouldn't extend wait times. There is a limit to the number of players hitting 'launch' within any 120 second interval. The tighter you make the criteria for finding that match, the less likely you are to hit your goals. The three search criteria are
1. Match Elo by band, as you discussed. The less deviation between Elo scores among all players the better.
2. In the absence of 1 it matches total team values by widening the bands width, low with high and such to reach a common value.
3. Tonnage between all mechs on both teams.
3 severely impacts 1's ability to find 24 matches within 120 seconds. They've said as much. That's why 2 exists as a criterion, since it's better than nothing at all. How are you proposing that tightening 1's criteria and reducing 2's implementation, without considerably widening 3, won't extend search times or, more to the point, cause the 'drop however you can' trigger that hits at 120 seconds?
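The trade-off can be pictured with a toy band-widening rule. This is only a sketch: the 120 second cutoff comes from the discussion above, while `base_gap`, `widen_per_s` and the linear widening are invented for illustration and are not how the real matchmaker is known to work.

```python
def allowed_elo_gap(waited_s, base_gap=50.0, widen_per_s=5.0, timeout_s=120.0):
    """Largest Elo gap accepted after waiting `waited_s` seconds.

    base_gap and widen_per_s are hypothetical tuning values. At timeout_s
    the 'drop however you can' trigger accepts any gap.
    """
    if waited_s >= timeout_s:
        return float("inf")
    return base_gap + widen_per_s * waited_s

def can_match(elo_a, elo_b, waited_s, **params):
    """True if the two ratings fall inside the current band."""
    return abs(elo_a - elo_b) <= allowed_elo_gap(waited_s, **params)

print(can_match(1500, 1600, waited_s=0))    # outside the starting band
print(can_match(1500, 1600, waited_s=10))   # band has widened enough
print(can_match(1000, 2000, waited_s=120))  # timeout accepts anything
```

Shrinking `base_gap` or `widen_per_s` in this toy model means a given pair of players waits longer before becoming eligible, which is the concern raised above.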
I'd also understood that the k-value changed from 50 to 5 a while ago. We started at 50 because nobody had any Elo data to speak of and it gave depth to the pool, accurate or not, from which to begin sorting ranks without creating too gradual of a curve. I could be wrong though. If so I absolutely agree that 5 is a way, way better k-value than 50.
Edited by MischiefSC, 13 November 2013 - 01:40 PM.