Posted 08 October 2014 - 11:28 PM
				
				
				
Actually, ELO works quite well in this game - perhaps too well* (more on that later).
 
Every few days I seem to see someone suggesting a different way of calculating player skill, rather than basing it off of wins and losses.
 
The most common argument seems to be of the form:
"I did 900 damage, but we lost - why should my ELO go down?"
or the converse:
"Someone does 10 damage but their team wins, why should their ELO go up?"
 
 
 
Well, the matchmaker is trying to make a game where each team has a fairly equal chance of winning.
Guess what predicts future wins best: Past wins. That's right, I said it.
 
Given a large enough sample size, actual wins will predict future wins better than any ridiculously complex formula that you could even dream of building.
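For anyone who hasn't seen how a win/loss-only rating works, here is a minimal sketch of the textbook Elo update. This is an assumption-laden illustration, not the game's actual matchmaker code: the K value of 32 and the 400-point scale are the usual chess defaults, and the function names are made up for this post.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo: estimated chance that side A beats side B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update(rating: float, opponent: float, won: bool, k: float = 32.0) -> float:
    """Move a rating toward the observed result; k (assumed) sets the step size."""
    actual = 1.0 if won else 0.0
    return rating + k * (actual - expected_score(rating, opponent))
```

Note that damage, kills and assists never appear. The only inputs are who was expected to win and who actually did, so an upset moves the rating a lot and an expected result barely moves it.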
 
Consider the badass NARC mech, who can turn the tide of a battle while dealing a mere 100 damage. But it's not just the NARC assists, it's more. This little badass NARCs the biggest threat. Not just the most dangerous mech, or the most dangerous pilot - but a true reading of all the factors. It's not the (NARC assist damage) * (Enemy Player ELO rating) product, it's better than that. He knows that even the most notorious badass of a pilot may not be the biggest threat when they are in a build not suited to the map. He knows enemy psychology too. Enemy players fall off the hill in fear, almost as if the NARC was stuck directly to their fragile egos.
Now tell me, how the hell are you going to calculate this based on his average damage of a couple hundred per match?
I'll tell you how - he wins all the time. He's so damn good that LRM noobs break their own personal best just by virtue of his presence.
You can tell because whenever he is around, the red team seems to crumble and fail.
 
What about the pilot that boldly charges in and causes chaos behind enemy lines? Sure, her Jenner often gets toasted early on, and she is seldom the last one alive... but how many mechs get shot in the back by the assault mechs on her team, because they were fixated on the Jenner instead of the 500 tons of mechs just behind a ridge?
Is she the same as the other pilot who averages the same 350 a game by hiding until the end and cleaning up damaged mechs? How do you quantify her skill?
Maybe it's the fact that any team with her on it is that much more likely to win.
 
What about that badass sniper who does clean damage, and only clean damage? You know this guy. He's the one who shoots the leg off your Stormcrow with 3 back-to-back gauss shots. Funny thing is, he doesn't finish you. He's above that, and he knows it. There are plenty of scrubs on the team with bad aim that will be totally sufficient to finish off a 'crow with a broken wing. He's never killed a stick mech unless it's the only target in sight, either, except that ******* that was spotting him. Is he not as good as the bullet-hose wielding pilot next to him, who spreads damage all over and overheats to kill an unarmed 'mech while another enemy decimates his nearby teammate? They score similar damage, and have similar numbers of kills and deaths, even though the other player steals his kills and hides to protect his KDR. How can you mathematically separate these pilots?
You can tell easily. The Sniper... HE WINS!
 
 
 
 
 
Now granted, if your first match is a 900 pointer and you lose, it would probably be safe to say that you are better than someone who won with 100. But this is an extremely temporary phenomenon which basically disappears once there is a sufficient sample size.
 
Something like this could probably be implemented to help slightly speed the convergence of a newly joined player to their actual ELO score. But it could only make things worse for us seasoned vets with 100s of games in each weight class.
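A rough sketch of what that could look like: blend match score into the result for the first handful of games, then fade it out so that vets are rated on wins and losses alone. Everything here is hypothetical, including the function name, the 20-game fade, the 0.8 starting weight and the 1000-point score cap. It only shows the shape of the idea.

```python
def blended_result(won: bool, match_score: float, games_played: int,
                   fade_games: int = 20, start_weight: float = 0.8,
                   score_cap: float = 1000.0) -> float:
    """Return a value in [0, 1] to use in place of a plain win (1) or loss (0)."""
    win_part = 1.0 if won else 0.0
    # Clamp the match score into [0, 1] so one huge game cannot dominate.
    score_part = min(max(match_score, 0.0), score_cap) / score_cap
    # Weight on match score fades linearly to zero over the first fade_games.
    weight = start_weight * max(0.0, 1.0 - games_played / fade_games)
    return (1.0 - weight) * win_part + weight * score_part
```

With these made-up numbers, a brand new player who loses with 900 gets credit for about 0.72 of a win and one who wins with 100 gets about 0.28. After 20 games the match score is ignored entirely and the result is just 1 or 0 again.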
 
Same thing for weight classes - nothing predicts assault performance like assault performance, but for the 2.0 win/loss ratio heavy pilot, it would probably be safe to assume that his first match in an assault 'mech will be a little more impressive than the average pilot's first foray into the 80+ ton range.
 
But get 100 assault samples under the pilot's belt, and you will have an even better idea. Certainly it is doubtful that the assault 'Mech tourney champion will be an inept noob in any chassis, but light mechs might not be his thing.
 
 
 
 
 
 
 
Unfortunately, unified ELO across a weight class is the cause of the issue where good players feel that they NEED to take "carry mechs".
ELO is expecting them to be a major force in the game... unfortunately, when your ELO is based on a Banshee, bad things might happen to your team when you step down to an Awesome.
 
I know I feel this one when I try to pilot my X5. ELO, let me assure you that your expectations for me are perhaps a bit high.
 
There is, however, a reason that ELO is tracked for 4 classes rather than for each chassis. It helps resolve the problem of needing a large-ish sample size for a bunch of different mechs, which would add unwanted variation. Also, consider the case of a new pilot who starts off in a Shadowhawk, and then puts it away for a while to grind his X5. By the time he goes back to the Shadowhawk, not only is he far more skilled, but his Shadowhawk actually has a lower ELO than his X5. All of a sudden he thinks he has found the greatest Shadowhawk build ever... until ELO catches up.
 
The best solution to this would perhaps be to implement some sort of modifier to ELO based on chassis weight relative to the weight class.
 
The purpose is to acknowledge that the following is not actually a fair matchup:
 
Team A: (Heavy ELO 1300 pilot in a Cataphract + Heavy ELO 500 pilot in a Dragon) total 130 tons, 1800 ELO
vs.
Team B: (Heavy ELO 500 pilot in a Cataphract + Heavy ELO 1300 pilot in a Dragon) total 130 tons, 1800 ELO
 
Team A is going to win more often, because the Cataphract is more likely to be an influential mech.
 
 
But what if the Dragon multiplied ELO by 0.8, since it is a lightweight in its size bracket?
 
Team A: (Heavy ELO 1300 pilot in a Cataphract + Heavy ELO 500 pilot in a Dragon, adjusted to 400) total 130 tons, 1700 ADJUSTED ELO
vs.
Team B: (Heavy ELO 500 pilot in a Cataphract + Heavy ELO 1300 pilot in a Dragon, adjusted to 1040) total 130 tons, 1540 ADJUSTED ELO
 
Now that the matchmaker is making a better guess, it may realize that there is actually a closer match available.
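The arithmetic in that example, as a minimal sketch. The modifier table is hypothetical: 0.8 for the Dragon is the number used above, and 1.0 for the Cataphract is simply assumed. Real values would have to be tuned per chassis.

```python
# Hypothetical per-chassis multipliers; any chassis not listed counts as 1.0.
CHASSIS_MODIFIER = {"Cataphract": 1.0, "Dragon": 0.8}


def adjusted_elo(elo: float, chassis: str) -> float:
    """Scale a pilot's class ELO by how heavy the chassis is within its class."""
    return elo * CHASSIS_MODIFIER.get(chassis, 1.0)


def team_total(team) -> float:
    """Sum adjusted ELO over (elo, chassis) pairs."""
    return sum(adjusted_elo(elo, chassis) for elo, chassis in team)


team_a = [(1300, "Cataphract"), (500, "Dragon")]
team_b = [(500, "Cataphract"), (1300, "Dragon")]
print(round(team_total(team_a)), round(team_total(team_b)))  # 1700 1540
```

Unadjusted, both teams sum to 1800 and look identical to the matchmaker. With the modifier there is a 160 point gap that it can try to close.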
 
 
 
 
 
 
 
Another thing is that people view 12-0 games as being a blatant indicator of a matchmaker failure.
I just want to point out: I have been in a 12 man and wiped another team 12-0, then got matched against them again in the very next match and had it come down to a 1v1 duel at the end. MWO has no respawns, so failures cascade.
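To illustrate the cascade, here is a toy simulation. It is a deliberately crude model and every part of it is an assumption: all mechs are identical, each kill goes to a side with probability proportional to its living mechs times a skill factor, and nothing about maps, builds or tactics is modeled.

```python
import random


def simulate_match(size=12, skill_a=1.0, skill_b=1.0, rng=None):
    """Return (survivors_a, survivors_b) for one no-respawn match."""
    rng = rng or random.Random()
    alive_a, alive_b = size, size
    while alive_a > 0 and alive_b > 0:
        fire_a = alive_a * skill_a
        fire_b = alive_b * skill_b
        # The side with more guns still firing is more likely to score next.
        if rng.random() < fire_a / (fire_a + fire_b):
            alive_b -= 1
        else:
            alive_a -= 1
    return alive_a, alive_b


rng = random.Random(2014)  # fixed seed so runs are repeatable
results = [simulate_match(rng=rng) for _ in range(1000)]
blowouts = sum(1 for a, b in results if max(a, b) >= 8)
print(f"{blowouts} of {len(results)} evenly matched games ended with 8+ survivors")
```

Both teams here are exactly equal by construction, so any lopsided score it produces comes purely from an early lead snowballing, not from bad matchmaking.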
 
Also, in the group queue there is the additional concern that the "closest match" may not be that close at all.
 
 
 
 
 
 
 
And here comes the *
 
ELO has actually been bouncing around in my head a lot in the last few days.
 
I signed up for the last tournament, with no intention of trying to win. I had a lot of stuff to do over the weekend, but I signed up just for the chance to earn 10 million cbills.
Getting 20 qualifying wins actually took me quite some time. Well, I was toying around in my Cicada X5, which is by no means a medium class tournament winner... and I tried some CTF-4X builds that I wanted to test... jumped over to a Dire for a few games and then played a decent amount in a Timberwolf just so I could be sure to get my 20 qualifying games.
It wasn't terribly difficult to get qualifying wins but it took maybe 10 total hours of gaming over the weekend... I had a lot of good games that ended in a loss...
 
 
 
On Monday night I decided to try and use the tourney to grind some cbills to spend on a blood officer account (which had a total of 0 games so far). FYI, if you see a blood officer account, they are basically used by clans that conduct anonymous trial duels, etc., because MWO does not allow you to hide names, even in private lobbies. There are actually a lot of these floating around... I know because the first one I tried to sign up for was actually taken! (And no, it wasn't something obvious like Blood Officer 007)
 
Here is a timeline of events for that night as best as I can recall:
Game 1: Trial Nova - WIN - see a bunch of players I never heard of. Get the Hat Trick Achievement.
Game 2: Trial Nova - WIN - see a bunch of players I never heard of
Game 3: Trial Nova - WIN - see a bunch of players I never heard of
Game 4: Trial Nova - WIN - see a bunch of players I never heard of. I actually got blowed up on this one. Spectate an Ember pilot who doesn't seem to know how to use his arms. He machine guns a Kit Fox, but he's lasering the dirt. Again. And again. LOL.
Game 5: Trial Nova - WIN - see maybe a few players I have seen with my main account, but no-one of 'legendary' status
Game 6: Trial Nova - WIN - see maybe a few players I have seen with my main account, but no-one of 'legendary' status
Game 7: Trial Nova - WIN - see maybe a few players I have seen with my main account, but no-one of 'legendary' status
Game 8: Trial Nova - WIN - see maybe a few players I have seen with my main account, but no-one of 'legendary' status
Game 9: Trial Nova - WIN - see maybe a few players I have seen with my main account, but no-one of 'legendary' status
Game 10: Trial Nova - WIN - see a player who actually got 1st in this tourney
Game 11: Trial Nova - WIN - by this point I have gotten 7 qualifying wins, see first Lord
Game 12: Trial Nova - I think this was the first loss. Abandon my attempt to get knight errant (25 games) with all wins. Resume BANZAI style of gameplay. (True story about Fire and Salt... if the team doesn't push, I get bored and go and die alone. Save 5 minute standoffs for league play. Yea, I'm undisciplined, sue me.)
...
Game ~15: See a few players that are epically skilled. See first SJR. Feel a lot better about the guy I saw in game 10. Actually, the player set looks like one that I may actually see with my main account.
Game ~16: More Noobs - OK, so I clearly don't have as high of an ELO as I do on my main account, but I at least have as high of an ELO as someone I might actually see from my main account.
...
Game ~18: Get the bad company achievement. Yep, still facing occasional teams of noobs who can't aim, but I am now occasionally seeing high ELO players (Who are probably thinking: oh, look, scrubs! upon looking at the player list. Scrubs being relative, of course.)
...
Game ~21: Trial Nova - WIN - by this point I have gotten around 10 qualifying wins, and my total losses are now up to 3 or 4
Game ~22: Buy A Nova
Game ~25: Pimp Nova out, at this point I probably have 11ish qualifying wins.
...
Game ~27: Forget that I switched back to the trial Nova. Alphastrike on my very first shot of the game. Die before powering up with about 75 damage (which was all done to a Summoner's CT, who was noobishly standing still. But who am I to judge LOL.)
...
Game ~29: Remember mid-game that I am supposed to use an ERLL to farm assists
Game ~30: Qualifying win. Use ERLL to farm assists. 2k 10a or something like that.
Game ~31: Qualifying win. Use ERLL to farm assists.
...
Game ~33: Qualifying win. Use ERLL to farm assists.
...
Game ~35: Qualifying win. Use ERLL to farm assists. Realize that I cracked the top 100 in mediums in slightly less than 3 hours of gameplay. Apologies to anyone I knocked down a slot... Vow to complain about ELO's impact on tournaments. Decide that I will start farming cbills with a different mech if I get too close to the top 15. Don't want to knock someone off the board who isn't in easy mode.
Game ~35: Make noob mistake. Get pwned. Screw it, go to bed, feeling just a little dirty. Ended up with 15 qualifying wins... mostly in a trial mech. Finished in 65th place for a measly 3:00 of gameplay (in game time) with a score of just over 1900
 
 
 
 
 
My conclusion is that ELO actually converges reasonably quickly. For the case of tourneys, however, it is unreasonably slow. I knew that tourneys were slightly unfair to high ELO players, but I didn't realize it was so blatant.
 
For the sake of tourneys, all players that opted in need to get thrown in one big bin together. Form the matches such that the best players are mixed in with the noobs as far as the 24 total players go and THEN try to balance by ELO.
 
Basically what we have now is:
Total ELO 2000 team vs Total ELO 2000, noobs all around
and
Total ELO 8000 team vs Total ELO 8000, vets all around
This is BAD
 
For tournaments those same 48 players need to be shuffled so that it's:
Total ELO 5000 team vs Total ELO 5000, each team is a mix of noobs alongside vets
and
Total ELO 5000 team vs Total ELO 5000, each team is a mix of noobs alongside vets
This is GOOD
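One simple way to get that kind of mix is a snake draft: sort the opted-in pool by ELO and deal players out to the teams in alternating order. This is only a sketch of one possible approach, not a claim about how the matchmaker works or should be built. The pool, the pilot names and the ELO values below are invented.

```python
def snake_draft(players, team_count=4):
    """Deal (name, elo) pairs, best first, into teams in snake order."""
    ranked = sorted(players, key=lambda p: p[1], reverse=True)
    teams = [[] for _ in range(team_count)]
    for i, player in enumerate(ranked):
        round_no, slot = divmod(i, team_count)
        # Reverse direction every round so no team always gets first pick.
        idx = slot if round_no % 2 == 0 else team_count - 1 - slot
        teams[idx].append(player)
    return teams


# Invented pool of 48 tournament entrants, from noob to vet.
pool = [(f"pilot{i}", 200 + 25 * i) for i in range(48)]
teams = snake_draft(pool)
totals = [sum(elo for _, elo in team) for team in teams]
print(totals)
```

Every team ends up with 12 players spread across the whole skill range, and the team totals come out close to each other, so the two matches formed from these four teams each have vets and noobs on both sides.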
 
I think it would be WAAAAY too easy to use a new account to win a tourney. Especially if you bought one of the mastery packs and a small cbill package. Imagine starting that first game with a fully kitted out machine with an MC arty strike (so as to be equal to the cbill one that requires grinding).
 
Also, I tried to win games even when I knew I wasn't going to get a good match score. I won my first 11 games, but surely I could have slowed my ELO progression if I wanted to be sheisty, by deliberately doing poorly as soon as I realized I wasn't going to output a tournament winning game.
 
 
 
 
 
 
 
 
Tournaments aside, ELO seems to work quite well. I think the matchmaker fails are frequently cascading failures due to a no respawn game, as well as simple cases of not enough suitable players available at a given time. ELO is not to blame as much as people think it is.