Xavori, on 03 December 2017 - 03:19 PM, said:
vandalhooch,
In reading your various responses, you're either misreading what I'm saying, or you're trolling. In the interests of discussion, I'll assume positive intent and take it that you're just misreading me.
When I said Elo-ish, it wasn't to use -ish to mean something weakened or watered down. ELO is a specific player rating system created for chess, and even there it has been modified by the various federations. When I say ELO-ish, I mean other ranking systems that work in a similar fashion.
Define "similar." How different does it have to be before you stop calling it Elo-ish?
Quote
You keep insisting that we somehow magically rate players' skill levels.
No, I most certainly did not. I keep pointing out that you claim your proposed system will create teams with matching "skill levels." I keep pointing out that you have not actually shown how you are going to measure a player's skill level. It's you who keeps claiming that your system can do it.
Quote
I never once suggested any such thing. You are correct in that it'd be impossible, simply because so many variables go into MWO piloting, and they change based on the situation. So it'd be a fool's errand to try to create a number and call it "player skill". I've actually made the point about how arbitrary and useless such a number would be in a number of other threads...
So why do you keep claiming that your Elo-ish system will create matches with teams of equal skill level? All your system would do is create matches with teams with equal Elo-ish rankings. And as you yourself have pointed out, that's not a measurement of skill level.
Quote
Instead, what you look for in ELO-ish (by which I mean, ELO, WHR, TrueSkill, or some other rating system) is a way to match up players so that when two players have the same rating, you'd assume that they have a 50/50 chance of beating each other.
Which is perfectly fine for a 1 v 1 game or contest. MWO is not such a thing.
Quote
That's about as strong an approximation of equivalent skill as you are going to get, but it is important to remember that it's a comparative, not absolute, skill rating.
More importantly, having your rating adjusted by the actions of other players in the match (teammates and opponents) makes your proxy metric exactly the same as what we have now.
Quote
Now, in head to head, it's pretty easy to rate players. You just keep having people with similar current ranks play each other, with the winner going up in rating and the loser going down. The actual math for most rating systems gets a bit more complicated simply because they build in the quality of the match (i.e., the difference in the players' current ratings, if any) in order to determine how much to move each of them. Repeat this often enough, and you get a rating that ultimately leads to that 50/50 condition (or a W/L ratio of ~1).
With teams, it's a bit more work, and takes a lot more matches, but it ultimately functions the same way.
No, it doesn't. I don't think you quite comprehend what "a lot more matches" actually means with regard to such a system. The minimum number of matches necessary for a reasonable alpha level increases exponentially as team size increases.
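For reference, the head-to-head update in that quote is trivial to write down; this is just the textbook Elo formula (a generic sketch, not anything PGI actually runs):

```python
def elo_update(rating_a, rating_b, a_won, k=32):
    """One head-to-head Elo update: the winner gains, the loser loses,
    scaled by how surprising the result was given the rating gap."""
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b - k * (score_a - expected_a)
    return new_a, new_b

# Two equally rated players: the winner takes exactly k/2 points from the loser.
print(elo_update(1500, 1500, a_won=True))   # (1516.0, 1484.0)
```

Nothing in that formula tells you how to split one team result across 24 players who all contributed to it differently.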
Quote
You just move all the winners up and all the losers down, and then reshuffle and have new teams of equivalent rating playing each other. And when I say it takes a bit more work, that's just in that the formulas tend to be a bit more complex and need quite a few more iterations before the rating is "solid." For example, using TrueSkill, it takes 12 head to head matches between individual players to get a solid rating. For 8v8 team play, though, it takes 91 matches.
MWO has 12v12; what's the minimum there? Does that minimum require the same 24 players be reshuffled each match? How do you account for mech class in your matchmaking?
Quote
If we applied TrueSkill to MWO 12v12, it'd take hundreds of matches. But it would get there.
How many hundreds? Mech classes?
Edit: Since it seems like you don't want to do the math, I went ahead and did it for you.
Last season (17) saw 28,078 different players play at least ten matches.
According to the TrueSkill system, under ideal circumstances, that means each player must play 354 matches in order for the system to accurately place them in their appropriate rank, using the default 50 levels.
However, a small pool of comparable players to draw from, compounded by the mech classes, would create a non-idealized situation where many matches would provide little or no usable information for the calculation. Their own data runs indicate that non-idealized situations typically increase the minimum number of matches necessary by a factor of 2-3.
So, implementing a TrueSkill system for a two-team, twelve-players-per-team game with 28,000 players will only require that every player play between 708 and 1,062 matches to be properly rated.
No problem. Sounds like a piece of cake. You've got me convinced . . . or not.
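If anyone wants to check the arithmetic, it's one multiplication; the 354-match baseline and the 2-3x non-ideal penalty are the inputs quoted above:

```python
ideal_matches_12v12 = 354            # idealized 12v12 baseline from the TrueSkill figures
penalty_low, penalty_high = 2, 3     # non-ideal factor: small pool, mech classes, groups

print(ideal_matches_12v12 * penalty_low,    # 708
      ideal_matches_12v12 * penalty_high)   # 1062
```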
Quote
Pretty much any ranking system is going to face the same challenge in terms of needing a bunch of matches, but the great thing about computers is that they're totally cool doing the same math problem over and over and over again.
Now I personally would prefer team ELO or WHR to TrueSkill, simply because TrueSkill only has 50 ranks and so lacks precision both in leaderboards (TrueSkill matchmaking almost always uses other stats for its leaderboards because each rank holds tons of players) and in matchmaking. A match is a lot more likely to be a quality one when you use a wider range of possible values, because each rating step represents a smaller difference: the best rank-50 TrueSkill player might be measurably more likely than 50% to beat the worst rank-50 player, but a 2800 ELO player is almost certainly at 50% against another 2800 ELO, and just above 50% against a 2799 ELO. And if you want to get way bogged down in complicated math, WHR rankings tend to be really good at prediction because they aren't incremental (i.e., moving up and down after each match) but instead recalculate the rating from the player's entire history.
Group vs solo queue? How's your calculation going to handle that?
Note that non-random teams increase the minimum number of matches needed to properly rate players.
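And just to put numbers on the granularity claim in that quote, here's the standard Elo win expectancy with made-up ratings (a generic illustration, nothing to do with PGI's matchmaker):

```python
def elo_expected(rating_a, rating_b):
    """Standard Elo win expectancy for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

print(elo_expected(2800, 2800))   # 0.5
print(elo_expected(2800, 2799))   # ~0.5014 -- a one-point gap is basically noise
print(elo_expected(2800, 2750))   # ~0.5715
```

None of which answers the group queue question.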
Quote
But the key point here is that the goal is to maximize the number of quality matches. To do that, you have to get teams put together that have similar total skill. Since you can't really assign an absolute number to MWO pilot skill (or lots of other games for that matter), you instead rely on a comparative skill where the assumption is that two players with the same rating have a 50/50 chance of beating each other,
You don't face one opponent at a time.
Quote
and then you build teams where the combined skill level is as close to equal as you can get (because we don't want people waiting for days for perfectly equal matches) using the currently queued players.
So, two aces and ten rookies vs twelve average players is a "good match" in your mind?
Edit: It is in Microsoft's view.
But how does the TrueSkill ranking system incorporate the game outcome of a team match? In this case, the team’s skill is assumed to be the sum of the skills of the players.
That's the problem with calculating a proxy value for a player's skill based on their individual history in the game. You can't simply add up the various teammates' values to make up a team value.
The average of several averages is a useless metric.
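To put that in concrete terms, here are two hypothetical twelve-player teams with completely made-up ratings (purely illustrative, not MWO data):

```python
import statistics

# Identical summed "team skill," wildly different composition.
aces_and_rookies = [2400, 2400] + [1200] * 10   # two aces carrying ten rookies
all_average      = [1400] * 12                  # twelve middle-of-the-pack players

print(sum(aces_and_rookies), sum(all_average))  # 16800 16800 -- the sum can't tell them apart
print(statistics.pstdev(aces_and_rookies),      # ~447 -- huge spread
      statistics.pstdev(all_average))           # 0.0 -- no spread at all
```

The sum says those two teams are a perfect match; the spread says otherwise.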
Quote
That is definitely something that could be implemented, and it wouldn't even be that complicated because all the theory and math and formulas for doing it are already fully developed.
And they create the same level of "matchmaker broken" whining that we already get in MWO.
BTW: I'm still waiting for you to show your evidence that 80-90% of matches result in stomps, necessitating this overhaul.
Edited by vandalhooch, 03 December 2017 - 08:27 PM.