Roland, on 23 January 2014 - 04:51 PM, said:
Dude, just because it uses wins and losses as a basis doesn't mean it's the same as Elo.
The point of the system is that they IMPROVED Elo (or rather, Glicko, which was itself an improvement of Elo) to be better able to determine player skill from team results.
I would agree that in the given matchmaking situation, TrueSkill would not somehow magically fix things. The matchmaker has other fundamental problems.
However, at the same time, simple Elo OR TrueSkill may be insufficient for this game anyway.
Both Elo and TrueSkill are essentially just generalized rating algorithms... For MWO, since you aren't trying to generalize across all possible games, you could come up with a BETTER rating system which actually took into account specific aspects of the game itself.
For instance, as others have pointed out... The match score at the end of the game. THAT could be leveraged in a rating system... because the folks topping the scoreboard are generally better players.
There's really no reason to limit ourselves to only using simple win/loss for rating.
Roland, I suspect we're missing each other here. Let me clarify something.
TrueSkill or a system like that does not include a metric like match score, or your damage, or anything like that. Neither does Glicko. They use your win/loss. That's it, that's the basis.
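For reference, here is a minimal sketch of the plain Elo update, which takes nothing but the match result as input. The K of 32 and the 400-point scale are the conventional chess values, assumed here purely for illustration; they are not anything PGI has published, and the function names are made up for this example.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Win probability under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def elo_update(rating: float, opponent: float, won: bool, k: float = 32.0) -> float:
    """Return the new rating. The only match input is win or loss."""
    score = 1.0 if won else 0.0  # no match score, damage, or kills involved
    return rating + k * (score - expected_score(rating, opponent))
```

Two evenly rated players have an expected score of 0.5 each, so with K = 32 the winner gains 16 points and the loser drops 16.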
Where they expand upon that is with the ability to leverage massive sample sizes (millions of players across countless games); they can then drill down to look at how likely you are to win in a specific sort of situation with specific team compositions and specific environments. That lets the system seat players into the best possible matches and compare them more accurately to other players, so you can say pretty much exactly where you fit in among these 1 million players. They can say that *in this specific match* your impact is going to be X amount, as opposed to saying that in general you're about X good.
There are some tweaks to how Elo is tracked and implemented in MW:O (split pug/premade, match range not target, Gaussian distribution) that will make a big difference.
If PGI has the people with the chops to do it and they've been collecting the telemetry for it, I'm all for slicing Elo more finely: track win/loss with specific players and against specific players, track performance by chassis and loadout, not just weight class, keep a variable rating based on those, and use that to more quickly seat all further Elo rankings.
For example, I love me some AC20. I got good with Orions pretty quickly because I ran them like a mini-Atlas. A more comprehensive Elo system that tracks me by chassis and loadout wouldn't need hundreds of games to seat me for my new Orion. It could take how I perform with similar loadouts, adjust for what proficiencies I've got unlocked, and stick me in reasonably balanced matches after 20 or 50 rounds instead of the 300 or 500 it took to generally seat me for heavies.
That's a hell of a lot of work though, Roland. It would need a team of very competent people (we're not talking 1 or 2), and it would take a pretty sleepless 12-18 months to roll live. You'd be better off paying to license the closest TrueSkill-type model and then painstakingly tweaking variables for the MW:O environment. Your eyes would bleed from matching charts trying to do that. You'd need 5 or 6 dedicated terminals just for running reports all day long.
Hence why I say it's outside the scope of MW:O right now. I get the value of it - I don't think you're wrong at all. Realistically, though, Elo is the basis of all those systems, and with a player base that isn't many hundreds of thousands strong, the ability to make precisely seeded matches is missing anyway, so why musclefuck something of that caliber into place?
Make the tweaks I recommended. Then, when UI 2.0 is done and CW is close enough to at least dream about (if it's 2016 I'll be surprised), seed Elo by chassis and game mode. Start each chassis at the weight class value minus 20%. Halve the k-value (how much your rating moves on a win or loss against relatively ranked opponents) for the first 10 matches to give people shake-out time. Then give them a +20% k-value for the next 40 matches, so by match 50 in a chassis the rating should be pretty accurate.
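A rough sketch of that seeding and k-value schedule, to make the numbers concrete. The base K of 32, the standard Elo expected-score curve, and every function name here are assumptions for illustration; only the minus 20% seed and the 10-match and 40-match windows come from the proposal above.

```python
def seed_chassis_rating(weight_class_rating: float) -> float:
    """Start a new chassis at the weight class rating minus 20%."""
    return weight_class_rating * 0.80


def k_for_match(matches_played: int, base_k: float = 32.0) -> float:
    """K for the next match, given matches already played in this chassis."""
    if matches_played < 10:   # matches 1-10: halved K for shake-out time
        return base_k * 0.5
    if matches_played < 50:   # matches 11-50: K boosted by 20%
        return base_k * 1.2
    return base_k             # match 51 onward: normal K


def update_chassis_rating(rating: float, opponent: float, won: bool,
                          matches_played: int, base_k: float = 32.0) -> float:
    """One Elo step using the scheduled K (opponent = assumed enemy rating)."""
    expected = 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))
    score = 1.0 if won else 0.0
    return rating + k_for_match(matches_played, base_k) * (score - expected)
```

With a weight class rating of 1500, a fresh chassis would start at 1200, move by at most 16 points per match for the first 10 matches, then by up to 38.4 per match until match 50.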
We've essentially already got win/loss tracking for game mode regardless of chassis. Make a 'dumb' (no k-value at all) Elo score for game modes and take 20% of that plus 80% of your chassis Elo to give you an actual Elo for dropping in a specific game mode with a specific chassis.
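The blend itself is a one-liner. This is only a sketch: the function and parameter names are made up, and the 0.20 default is just the placeholder weight from this post.

```python
def drop_rating(chassis_elo: float, mode_elo: float,
                mode_weight: float = 0.20) -> float:
    """Blend game-mode Elo and chassis Elo into one matchmaking rating."""
    if not 0.0 <= mode_weight <= 1.0:
        raise ValueError("mode_weight must be between 0 and 1")
    # 20% game mode + 80% chassis by default
    return mode_weight * mode_elo + (1.0 - mode_weight) * chassis_elo
```

For example, a chassis Elo of 1600 and a game-mode Elo of 1400 blend to 0.2 * 1400 + 0.8 * 1600 = 1560.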
The 20/80 split is arbitrary. You could determine a more accurate weighting by looking at ~100 matches from players with tons of matches who've dropped often in the same mechs with the same loadout and seeing how their performance varies by game mode. This would give you a rough percentage of game-mode impact on overall performance in the same chassis.
All that aside, though, the difference that sort of fine hair-splitting in matchmaking would make is minimal - there aren't enough people to give the matchmaker a huge range of options to fill matches.