
Please Implement Elo Or Trueskill Matchmaking


184 replies to this topic

#161 A Headless Chicken

    Member

  • The Hungry
  • 273 posts
  • LocationImmersed in Stupid.

Posted 04 December 2017 - 08:03 PM

You made a terrible assumption: that I had to think critically to talk to you.

EDIT: It seems that everything to you is 'answering nothing' and 'dodging', both here and in your other posts. Don't think you'd realize if a Gauss projectile hit you in the empty space between your ears.

Edited by A Headless Chicken, 04 December 2017 - 08:13 PM.


#162 WarHippy

    Member

  • Ace Of Spades
  • 3,836 posts

Posted 04 December 2017 - 08:15 PM

View PostWintersdark, on 02 December 2017 - 01:57 PM, said:

No, it doesn't. As someone who played a LOT in 8v8, and particularly around the transition time to 12v12:
I played a lot of it as well and it really wasn't any different.

View PostWintersdark, on 02 December 2017 - 01:57 PM, said:

Early complaints by the "Stay at 8v8" crowd were, in particular, that 12v12 leads to more stomps and is harder on new players. I argued against this, and was wrong. See, what happens - what they said would happen - is that small mistakes are actually harder on you. Peek around a corner into the OpFor? Now there's 50% more mechs firing at you while you try to backpedal. In 8v8, you just take less punishment when you make a mistake. New players make lots of mistakes, but they tend to survive them more.
That really isn't true. While there are potentially 50% more mechs firing at you, there are also 50% more potential targets on your team for the enemy to shoot at instead of you. Mistakes are costly, period, regardless of the number of players, in a game where a single mech can take out another in a few seconds/volleys. If two mechs or 10 are firing at you because you made a mistake, the end result is the same more often than not.

View PostWintersdark, on 02 December 2017 - 01:57 PM, said:

Yes, each player is a larger part of the whole team - each player "matters" more - in 8v8, but as good and poor players tend to be fairly evenly distributed, overall team effectiveness is comparable either way. Just that in 8v8, mistakes are less immediately lethal.
Pros and cons for both versions when it comes to the impact of a lost player. Not really a convincing argument for either camp.

View PostWintersdark, on 02 December 2017 - 01:57 PM, said:

This is what they warned would happen, and it is exactly what happened. You can even look back in the forum threads from that time and see it spelled out verbatim. Along with my own posts of "but in 12v12 each player matters less, so a poor player dying tanks his team less" - except more people die due to single mistakes in 12v12 than 8v8, and it's death that starts the snowball effect leading to stomps.
That was the case in 8v8 and in 12v12; it really isn't any different.

View PostWintersdark, on 02 December 2017 - 01:57 PM, said:

See, this is due to thresholds, how much damage a mech can do in a brief period of time vs how much damage a mech can take. The reality is that you're basically never exposed to an entire team at once, but in 12v12 you're going to take fire from more mechs simultaneously. When this fire in a single oops costs you just armor, you still fight with full effectiveness (8v8). When it costs you armor, structure and weapons, suddenly you're less effective... or just dead (12v12). 12v12 essentially means any time you're taking damage, you're taking 50% more.
Already addressed but it needs to be repeated because everyone seems to forget it. Your team has 50% more targets/firepower as well. It is really all a wash in the end.

I really don't care if we go back to 8v8, but people need to stop thinking it's going to change anything because it will not.

#163 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 04 December 2017 - 11:38 PM

View PostA Headless Chicken, on 04 December 2017 - 06:21 PM, said:

More importantly, what makes you so sure that the people who are "potatoes", "good" and "legendary" will not end up in the same game after implementation of tr00sk1ll?


I have been asking this with no solid answer, and all I keep hearing is TrUeSkIlL wIlL sOlVe EvErYtHInG.


There are 2 facets to this:

1. Population available

You'll see the same names because your tiers are XP bars and not related to actual player skill. An Elo system will move people both up and down. You will, based on population, still likely see a lot of the same names at your playtime though. However an Elo style system breaks the need for tiers while still providing better balanced matches with the same resources (which in this case is the players).

2. Division of population among teams

It's not just who's available but how they are split between teams. By identifying each player's direct contribution to their team's win/loss you can better split people up into more even teams. The more matches they play, the more accurate the ratings get.

If we move back to an 8 v 8 environment and have a player Elo, mech Elo and some sort of loadout modifier you'll get even better balanced matches, as me in a laservomit MAD IIC isn't viewed the same as me in a 3xAC10 Victor, in which I really don't make the same contribution.

That help?

#164 Brain Cancer

    Member

  • The 1 Percent
  • 3,851 posts

Posted 05 December 2017 - 12:34 AM

View PostKhobai, on 04 December 2017 - 07:48 PM, said:


LRMs are stupid mostly because of the lack of destructible terrain


Sad note here: I was on Incursion, and some poor Locust hid behind a blast wall.

And then my missiles plowed into the wall for a bit, blew it apart on the second salvo, and killed him.

The sad thing is I can't do that on any other map but Incursion ones. Honestly, destructible terrain should have been built into the game from beta. Walls, buildings, rocks, trees, the whole nine yards.

#165 UnofficialOperator

    Member

  • Ace Of Spades
  • 1,493 posts
  • LocationIn your head

Posted 05 December 2017 - 01:12 AM

View PostA Headless Chicken, on 04 December 2017 - 08:03 PM, said:

You made a terrible assumption: that I had to think critically to talk to you.

EDIT: It seems that everything to you is 'answering nothing' and 'dodging', both here and in your other posts. Don't think you'd realize if a Gauss projectile hit you in the empty space between your ears.


Shrug, you sound like some sjw hipster **** anyway

#166 A Headless Chicken

    Member

  • The Hungry
  • 273 posts
  • LocationImmersed in Stupid.

Posted 05 December 2017 - 08:51 AM

View PostUnofficialOperator, on 05 December 2017 - 01:12 AM, said:


Shrug, you sound like some sjw hipster **** anyway


I find it ironic that a person who cannot understand some snark and sarcasm, who instead chooses to take offense at general comments, resort to name calling, and hand-wave away arguments thrown at him, has the gall to label another a "sjw hipster ****".

EDIT: I probably checked all the points off this little list with your mature behavior.

Posted Image

I find the irony rich seeing you begging all over the forums for attention just like a Tumblr fanboy.

It really doesn't help that you cannot seem to comprehend that you're not the smartest person in the world.

Edited by A Headless Chicken, 05 December 2017 - 09:11 AM.


#167 Wintersdark

    Member

  • 13,375 posts
  • LocationCalgary, AB

Posted 05 December 2017 - 09:03 AM

Let's assume 2000 players online at a time. This seems like a fairly reasonable/generous amount, given Steam often showing ~1200 players online. It's a pretty nice round number, though, so it's easy to work with, and it's very unlikely to be low - if anything, it's probably overly generous.

Now, let's pretend all of them are in Quickplay Solo Queue. They're not - some are playing Faction Play, others are in the Group queue. But let's pretend they're all in the Solo Queue. Note that again we're being very generous here: the actual number of Solo Queue players is much, much smaller than 2000 concurrent players.

Some more assumptions - what I've found to be very reasonable over many years of play -

Average quickplay match length: 7.5 minutes.
Down time between matches: 1.5 minutes (including loading times).
Matchmaking time: 1 minute.

Note: This means we're assuming people are playing a full match, counting downtime between matches, in rapid fire succession, one match per 10 minute period. That's pretty hardcore play. We're assuming ALL 2000 players are ALL in quick play solo queue ALL banging off match after match. Clearly extremely generous.

So, in this highly generous example, players are spending 10% of their time in the matchmaking queue. Thus, at any given time, the matchmaker has 200 players to choose from. That's just 8 matches of 24 players per match.

Even at this level, that's not nearly enough players to build what people would consider balanced matches. Let's assume we've got a Perfect Skill Tiering System, that accurately ranks players by actual skill into 5 tiers. You've got 200 players to choose from, split them into 5 tiers. If there's an even distribution, that's 40 players per tier - not even two matches keeping players inside one tier.

And again, that's with grossly over-estimated numbers.

And we've not even considered mech tonnage, weight classes.

If it turns out there's 1000 players rapid-fire playing Solo Queue matches, that takes us to a pool of 100 players total, of all skill levels and all weight classes, to build matches with at any given time; 20 players of any given 1/5th of the skill range (not even one match worth).
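For anyone who wants to poke at the assumptions, the arithmetic above can be sketched in a few lines of Python. The function name `queue_pool` and its parameters are made up for illustration; the defaults are just the figures from this post (7.5 minute matches, 1.5 minutes of downtime, 1 minute of matchmaking, 24 players per match, 5 evenly filled tiers).

```python
def queue_pool(concurrent_players, match_min=7.5, downtime_min=1.5,
               queue_min=1.0, match_size=24, tiers=5):
    """Rough steady-state estimate of who the matchmaker can pick from.

    Assumes every player loops match -> downtime -> queue without breaks
    and that skill is spread evenly across tiers (both generous).
    """
    cycle_min = match_min + downtime_min + queue_min
    # Players sitting in the queue at any given moment.
    in_queue = concurrent_players * queue_min / cycle_min
    per_tier = in_queue / tiers
    return {
        "in_queue": in_queue,
        "matches_available": in_queue / match_size,
        "per_tier": per_tier,
        "matches_per_tier": per_tier / match_size,
    }


for population in (2000, 1000):
    print(population, queue_pool(population))
```

With 2000 players that works out to 200 in queue (about 8 matches' worth) and 40 per tier; with 1000 it is 100 in queue and 20 per tier, matching the figures above.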




This is why the matchmaking method used just doesn't really matter. We may as well have flat out random matchmaking. PSR is a silly XP bar, but it does serve to at least keep experienced players and brand new players apart - if nothing else than that - and otherwise is essentially just random.

The above is all so "perfect world" too. There's non-peak times, times where people are actually playing Faction Play, others in the group queue, etc. Players rarely bang off match after match after match, and take small breaks between matches either to change mechs, builds, go to the bathroom, chat with friends, etc.

Yet even assuming EVERYONE is in the solo queue banging off match after match with nearly double peak Steam numbers, there's still not enough players to reliably make good matches, even assuming the matchmaker is awesome.

#168 Brain Cancer

    Member

  • The 1 Percent
  • 3,851 posts

Posted 05 December 2017 - 10:28 AM

Even if the MM is squeezing poorer players into matches, this should be tracked.

And compensated for. If a player outside the proper range is being squeezed into a game with much higher "tiers", at the least there should be a bonus for xp/Cbills for the fact that he's being placed into a game the system doesn't think he's good enough to be in to begin with. And a lower PSR loss/higher PSR gain for doing poorly/well, considering he's a seal in a shark tank.

#169 Xavori

    Member

  • The God of Death
  • 792 posts

Posted 05 December 2017 - 11:56 AM

Wintersdark,

You can get balanced matches (meaning 50% chance for either team to win or lose) with around 50-60 players being available to choose from. This is history speaking. If you go to your example of 200 people, it's easy unless it's 1 great player and 199 potatoes. Then you're going to get a lot of 50% matches, and 1 match that's totally broken. So still way better than the current system.

I think the people who are against having better player ratings are demanding perfect, and anything less than that means don't bother doing anything. That's a terrible proposition.

I don't think for even a moment that ELO ratings for MWO will make perfect matches. Instead, I think it will have a much better chance of producing quality matches meaning that both teams have ~50% chance to win, even tho most of the time the odds will more likely be 55% to 45%. I'm okay with that because it's so very much better than what we have now where I can often look at the two teams before the match starts and already know which team is going to win (especially in group queue which is where the players who truly care about their leaderboard stats live).

I want a game where matches are competitive. I want to know that my team has a decent chance to win, and that every time I drop, I have that same decent chance because the matchmaker is never going to load my team with potatoes and the other team with comp players.

As for building a per-mech ELO for players. Sure. Why not? Computers are fantastic for record-keeping and data processing, and that's really all ELO is, so even if every player ends up with 100 ratings which would be way too much for a human running things, the computer won't care at all. Heck, it can even start doing fun things like taking your per-mech ELO, and a per-weight class ELO, and your overall ELO and come up with a much more meaningful overall player rating versus the rest of the pilots in MWO. And I'd be totally cool with that.

#170 vandalhooch

    Member

  • Ace Of Spades
  • 891 posts

Posted 05 December 2017 - 09:45 PM

View PostXavori, on 03 December 2017 - 09:36 PM, said:

vandalhooch,

Obviously you don't understand how ELO, WHR, TrueSkill, etc. work. That's the difficulty we're having. Google them. Read up on them. All of your arguments against my suggestion seem to be born in ignorance, and ignorance is easily correctable.


Starting to realize that you are all bluster. I actually used the TrueSkill equations to calculate the minimum matches necessary to "accurately" rate a player's skill in MWO.

I've yet to see you do anything beyond making vague claims about what these systems are supposed to be capable of doing.

Quote

The short version, you absolutely can use ELO-like rating systems in team games that will ultimately end up giving a rating that allows for comparison between individual skills. The evidence in support of this is overwhelming as all kinds of team based games do exactly that and have been since even before computer games.


Going to need you to actually cite some of that support.

Note: Using an Elo-ish system to rank individuals relative to one another is not the same thing as measuring an individual's skills. A relative system can not give you absolute data.

Quote

Actually, you can't use random teams. If you use random teams, you get random results, and random numbers are random.

Here's what could happen if you had random teams. And for the purposes of this example, I'm going to use player skills of just 1-10 with 1 being mashed potato and 10 being l33tzorz. Let's say you had 8 players with ratings: 10,10,8,7,3,2,2,1. Now let's say the matchmaker randomly made these two teams:
Team A: 10,10,8,7
Team B: 3,2,2,1
Exactly what do you think is going to happen? And how then do you think you're getting any new meaningful data with which to adjust player ratings based on this massacre?


How about these two teams -
A: 10, 10, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1
B: 5, 5, 5, 5, 4, 4, 2, 1, 1, 1, 1, 1

Does that look like an even matchup? Because according to your and TrueSkill's assumptions this should end up in an even match. Do you even play this game at all? Do you really not see that Team A is going to get decimated every time?
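A quick check of the two rosters above, as a Python sketch. It assumes balance is judged by team totals only, which is a simplification made here for illustration; the helper name `summarize` is hypothetical.

```python
team_a = [10, 10, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1]
team_b = [5, 5, 5, 5, 4, 4, 2, 1, 1, 1, 1, 1]


def summarize(team):
    """Return the team total and the share of it held by the top two players."""
    ranked = sorted(team, reverse=True)
    total = sum(ranked)
    return total, sum(ranked[:2]) / total


for name, team in (("A", team_a), ("B", team_b)):
    total, top_two_share = summarize(team)
    print(f"Team {name}: total={total}, top-two share={top_two_share:.0%}")
```

Both teams total 35, so a totals-only comparison calls this even. The difference is in how the rating is distributed: Team A's top two hold 20 of its 35 points, Team B's top two hold 10 of 35.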

Quote

It doesn't matter how likely such a scenario is. It just matters that it's possible. It's why I keep calling our current leaderboard garbage. A W/L ratio above 1 only means that that player has played on winning teams more often than losing teams. You cannot read anything else into it because you cannot isolate down to just that one player's skill being responsible.


If you would have read my writing a little more closely you would see that I at no time defended the current leaderboard nor did I ever claim that the leaderboard is in the least bit useful.

Quote

Yes, it's possible that in the current environment that a W/L ratio above 1 means you are a good pilot, but it's not anywhere near a certainty. You could be a bad pilot who just happens to only log on when even worse players are online. You could be getting carried by your teammates (because solo and group aren't split on the leaderboards). You could just be lucky in that you end up with good teammates in solo queue more often than not. It doesn't matter how likely any of those possibilities are. All that matters for rendering the leaderboard data meaningless is that there are so very many possibilities other than just "good pilot".

Conversely, if you have a matchmaker that does not build random, but instead tries to match team skill ratings, you'd get:
Team A 10, 8, 2, 2 (22)
Team B 10, 7, 3, 1 (21)
Now you have a quality match.


You will also get 10, 10, 1, 1 vs 7, 5, 5, 5. Is that a quality match?

Quote

Team A has a very slight advantage which will end up reflected in the adjustments, but it's close enough that both teams have a good chance to win, and hence, you can expect that the winning team should move up in ratings. Team A would move up say 3 points with a win or 4 points down with a loss, and Team B would move up 4 points with a win and only 3 points down with a loss (these are totally made up numbers just to illustrate the point...badly since you don't want negative numbers in player ratings *smirk*) that reflect the small advantage Team A had. After the match, you'd adjust everyone's ratings immediately so that next time the match will reflect the most up to date rating possible. Repeat this 40-50 times, and you'll effectively shuffle the individuals within the teams to a rating that likely reflects their comparable skill (or 90-100 for 8v8, or hundreds for 12v12)


I am more and more convinced that you don't actually understand how those systems actually work. Every example you pull up for a possible match is an idealized version. You completely ignore the fact that those systems don't try to match players one for one on each team. They try to match team totals, which is a completely different process and will not result in "good matches" all of the time.

Quote

So the goal of a better rating system is to produce better matches. This in turn has the effect of giving us a much better indication of who are the good, average, and still learning pilots. It won't be perfect because MWO has so many moving pieces and variables, but it'll be light years beyond what we have now.


Still waiting for you to show some sort of support for your claim that stomps happen in 80-90% of matches with the current system.

I'd also like to see you explain how your dream system will actually create your idealized matches from a limited pool of players available to be slotted at any given moment.

Team A: 8, 8, 5, 3
Team B: 7, 7, 6, ?

How long is the matchmaker supposed to wait around for a 4 to hit launch? Oh, and need to make sure that that 4 is piloting the correct mech class. If the only pilots available and in the right mech class are all 9's, is that close enough to launch? You do know that TrueSkill systems would actually slot in that 9 and launch creating an unbalanced match, right? You do know that that is how they actually work, right? What the TrueSkill system will do is limit how much the ratings for Team B would adjust if they win and increase the movement of Team A if they win. But here's the thing you don't seem to grasp . . . that unbalanced match will absolutely, positively be launched.

A limited available player pool is going to end up with a larger proportion of unbalanced teams regardless of what system you use to rate the players. Always!

#171 vandalhooch

    Member

  • Ace Of Spades
  • 891 posts

Posted 05 December 2017 - 09:54 PM

View PostXavori, on 03 December 2017 - 10:58 PM, said:


It doesn't matter if you have ever been convinced. You are not reality. Your opinion doesn't supersede history.

Reality is on my side. There are any number of team games that use individual rankings to make the team that show that it really does work.

A great example is a small dart league I used to play in. The total player base for the league was about 60 players. The ratings went from 0 (maybe hits the board with all three darts) to 20 (will hit 501 in 12 darts or less every single time they play). We had 4 player teams, and each team was assembled with a 30 point cap and only one 18+ player allowed per team. Every team was competitive, and at the end of each season, the ratings were redone based on performance. The truly ironic part was that even at the start of the league, the teams whose 0's played the best (because this league had so many 0's every team had one) were the teams that won, and by the end of the season, the 0's that improved the most basically won the end of season tournaments.

And it's not just darts where this works. Practically every game that uses individual ratings within teams does it the way I described precisely because it's been shown to work. The math has been done and redone and refined for decades. When Microsoft put together their TrueSkill system (or really any of the quality matchmakers out there), they weren't making something totally new. They were simply taking existing work and ideas and putting them together in a way that best fit the kinds of matches they were handling.

So again, it doesn't matter if you are convinced or not. Reality has shown that it works.


Oh my gosh, you really don't understand what happened in your league at all do you?

#172 vandalhooch

    Member

  • Ace Of Spades
  • 891 posts

Posted 05 December 2017 - 10:02 PM

View PostXavori, on 04 December 2017 - 12:19 PM, said:

Trust me. This exact type of matchmaking and team building predates computer games, and we know it works. I'm not suggesting anything earth shattering or amazingly new. I'm suggesting using the same old system that people have been using since they first started trying to put together teams of differently skilled players and still have them play competitive matches.

You are a broken record. You don't seem to know how to do any of the calculations yourself. You don't know or understand the underlying assumptions of the different systems. And you refuse to acknowledge that MWO in its current state violates those underlying assumptions. Violating the assumptions prevents the system from actually delivering the results you keep claiming would come. Any of those systems would give us exactly the same type of matches we get already.

#173 vandalhooch

    Member

  • Ace Of Spades
  • 891 posts

Posted 05 December 2017 - 10:27 PM

View PostMischiefSC, on 04 December 2017 - 02:55 PM, said:


You absolutely can and do have matches both in Elo and TrueSkill that result in a win with 0 change. In fact that's usually the product for both sides when the match plays exactly as predicted when it's unable to make a balanced match.


Wait, Xavori keeps trying to tell me and others that a TrueSkill system would not make unbalanced matches. How could that be?

I dug around a little deeper into the TrueSkill system and found that you are correct. Some unbalanced matches could result in no adjustment made to a player's rating. But note, that's only for some extremely unbalanced matches. Your more run-of-the-mill mismatches will result in small adjustments.

1 - Do all matches end up in movement like I stated? No.

2 - Is grinding up your rating by playing at certain times of the day still possible? Yes.

Quote

A variable of about 400 pts between the averages will usually result in a 0 win/loss if the match goes as predicted. It's going to depend on what you use for the K factor (which both have, though the TrueSkill one is far more complicated) but both of them come to 0 on a wide enough spread. So with a team with a 1400 average vs a team with a 1,000 average, if the 1400 team wins its players will get, depending on K factor, 1 or 0 points and the losing team changes 0 points.
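For reference, the textbook Elo update can be sketched as below. Applying the two-player formula to team averages is an assumption made here for illustration, the K values are arbitrary, and nothing in this thread says what PGI or a TrueSkill-style system would actually use. The function names are hypothetical.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expected score (win probability) for `rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def elo_delta(rating, opponent_rating, won, k=32):
    """Rating change after one result; k is the K factor."""
    actual = 1.0 if won else 0.0
    return k * (actual - expected_score(rating, opponent_rating))


favorite, underdog = 1400, 1000  # team averages from the quoted example
print(f"favorite expected score: {expected_score(favorite, underdog):.3f}")
for k in (10, 16, 32):
    gain = elo_delta(favorite, underdog, won=True, k=k)
    loss = elo_delta(underdog, favorite, won=False, k=k)
    print(f"K={k}: favorite {gain:+.2f}, underdog {loss:+.2f}")
```

With a 400-point gap the favorite's expected score is about 0.91, so an expected win moves each side by roughly 9% of K. Whether that shows up as 1 or 0 depends on K and on how the result is rounded; in plain Elo the favorite's gain and the underdog's loss are the same size.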

If you don't get that the leaderboards showing consistent stats for each player month after month means anything then you need an education on math and statistics that's beyond the scope of what I can do on the forums for you other than give you google links to both.


Downright hilarious considering what I do for a living.

Note - you have now moved your goalposts from claiming that the leaderboard results mean a specific thing to leaderboard results mean some vague thing.

Quote

Also Law of Averages, Law of Large Numbers. Then again if you were willing to do the research to understand what you're talking about we wouldn't be having this discussion so it comes to reason that you won't, you'll just continue to argue erroneous points and then keep refusing to actually take the steps to understand what you're discussing.


Bwaaaaa, haaaaa, haaaaa.

Quote

It's not a 'proxy' measure of skill. It's a performance average in the same environment with the same options available.


It most certainly is not performance in the "same environment." The available player pool to draw opponents and teammates from is not the same through time. The variables (rules of the game) are not the same through time (new mechs, balance passes, new maps . . . ).

Or are you really going to try and claim that someone who earned a 3.0 W/L ratio in season 1 played in exactly the same environment with exactly the same options as someone who earned that same 3.0 ratio in season 17? Is that really your claim?

Quote

So you're taking the average performance of the team and generating an average for the team. Ideally you want each team to be composed of players within 150 to 200 pts (on an Elo scale) of each other but periodically you'll end up with unbalanced teams - which will, absolutely, affect the K factor for each player for the match. In part that facet of how your scores change on matches that are outside of the desired range is what TrueSkill does that makes it special.


Ideally . . . care to explain in detail what those ideal conditions are for these systems to be able to produce the results you want?

Care to explain how MWO actually meets all of those ideal conditions?

Care to explain what happens to these systems when ideal conditions are consistently not met?

Quote

You are 8.333% of your team, every single game you play. The only thing having group queue in the mix does is skew results out of scope UPWARD on the top of the scale and for people who play a mix of group and pug queue (which is a minority of players, per PGI the last time they gave results it was less than 6% of the player population.


Is 6% of the player population the same as 6% of the matches played? What proportion of matches does a typical player in that 6% pool play in each mode?

Does your complete inability to answer those questions give you the slightest reason to maybe rethink your original claim?

Quote

Less than those who play FW even) it just increases the number of samples required to give accurate results. Your W/L averaged over 3 months at over 100 matches a month, for example, would account for group queue play that's less than 50% of the players time.


How many of those players on the top of the leaderboard play less than 50% of their time in group queue? How do you know?

Since you can't possibly know if that assumption is true, why should I even bother to read this drivel?

Quote

If/when we do get group/pug queue split the only thing it's going to do is shrink the numbers, not the players, on the top few pages and shift a small percentage of players (less than 10%, again, most never touch group queue) by a few percentage points spread over the results.


And again, you can't possibly know that. Six percent of players is not the same thing as 6% of matches.

BTW: Exactly which metric were you using to determine "top of the leaderboard?"

Quote

Again. It's 0 sum. Not hard to figure each players value.

Ideally it wants to build teams within a range. Such matches would give a better K factor (how much you change scores by based on win/loss) than matches with a high-low to average makeup but it's still going to work out. Just takes more matches to get players seated correctly.


And neither of you "experts" bothered to even attempt to calculate how many matches would be needed to get players seated correctly.

I did.

#174 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 06 December 2017 - 09:32 AM

Except that I went over all of this, with total matches required, twice in two prior threads. The seating TrueSkill does is ladder based - we don't need that. We're not trying to ladder rank, just seat within a reasonable margin of error.

You keep trying to equate a margin of error with being totally unreliable. You're also trying to pretend that since some players play group queue a lot that everyone's stats are meaningless - save that for the vast majority of players the handful of players playing each other in group queue are irrelevant.

Small point changes up and down are how the system balances. Law of Large Numbers. The random swings of chance are equal for everyone and balance out. I thought we had covered that.

It's absolutely possible that a literal handful of players who are actually mediocre performers are getting consistently carried by top performers in group queue and their personal stats are inflated. We're talking a fraction of a fraction of a percent out of 40k players in any season. You're going to have way more disparity caused by alt accounts.

Go look yourself up, track your stats for 4 seasons. Then pick people at random from the same page you're on and do the same for them. See how it's all within a narrow range?

If you want the math broken out for you I'll take the time tomorrow or the day after. Having had this discussion a hundred times though I'm not convinced it will make any difference. I get that you want to pretend that the mix of group queue and QP stats is a big deal - it's not. It only impacts the people in group queue because, again, it's all zero sum. So all that's off for almost all players in group queue is scale, not direction. So if/when group queue gets removed it'll just pull the gap down, not reshuffle completely. Matches played in group vs pug is largely irrelevant in that wins for the players in group queue don't impact anyone save others in group queue, save in that the larger total QP matches played make group queue even less impactful.

Players change in performance all the time. I've said that repeatedly. They develop new skills or they pick up bad habits or their investment slips or they take a break and lose their edge or whatever. However this all impacts their win loss, which is correct for predicting their average performance. Stats within a season are relative as it is the same, as are balance passes. Again, zero sum. So whoever adapts to the changes better will go up, which inherently pushes others down.

Please provide any evidence in the leaderboard stats of inconsistent swings in a statistically relevant population sample. The only arguments you've made so far are disingenuous questions and implications that because the results are not exact and have a margin of error as all human related metrics do that they're meaningless.

Again, look at the leaderboard stats. It's pretty consistent when you get off the obvious outliers, which any serious analytics would do anyway. Any system you create will get inflation on the top and bottom as populations thin and atypical examples will skew the end results up or down regardless.

#175 Xavori

    Member

  • The God of Death
  • 792 posts

Posted 06 December 2017 - 10:51 AM

vandalhooch,

I'm going to skip most of your stuff (might get back to it later) and just point out the problem in your matchmaker example.

Quote

How about these two teams -
A: 10, 10, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1
B: 5, 5, 5, 5, 4, 4, 2, 1, 1, 1, 1, 1


An Elo-based matchmaker would never make those two teams. I explained this.

If you take 10,10,5,5,5,5,4,4,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1 (your player list) and put them into teams using standard rating algorithms, you get:

Team A: 10, 5, 5, 4, 2, 2, 2, 1, 1, 1, 1, 1
Team B: 10, 5, 5, 4, 2, 2, 2, 1, 1, 1, 1, 1

In other words, your example would have produced a perfectly balanced match (which makes it a terrible example because that's not going to happen very often)

The algorithm takes the best player and puts them on Team A, then takes the worst player and also puts them on Team A.
Team A: 10, 1
Then it takes the second best player and second worst player and puts them on Team B
Team B: 10, 1
Next it compares the two teams to see who gets next pick. Since it's 11 to 11, it defaults to Team A getting the 3rd best player and Team B getting the 4th.
Team A: 10, 5, 1
Team B: 10, 5, 1
Again, it compares the scores and gives the worst remaining player to the team with the higher score...and you can see how this is going to work out now, right? Just alternating between best and worst available by which team needs to boost or reduce score is straightforward, fast, and produces the best quality matches you can hope for.
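
For anyone who wants to try it, here is a minimal Python sketch of one way to read the steps above. The function name balance_teams and the tie-break (Team A goes first on ties) are my own assumptions for illustration, not anything PGI or a particular rating library uses.

```python
from collections import deque


def balance_teams(ratings):
    """Split an even-length list of ratings into two equal-size teams.

    Alternates between a "best" round and a "worst" round. In a best round
    the trailing team gets the stronger of the two top players; in a worst
    round the leading team gets the weaker of the two bottom players.
    Ties go to Team A (an assumption).
    """
    if len(ratings) % 2:
        raise ValueError("need an even number of players")

    pool = deque(sorted(ratings, reverse=True))  # best on the left, worst on the right
    team_a, team_b = [], []
    take_best = True

    while pool:
        total_a, total_b = sum(team_a), sum(team_b)
        if take_best:
            stronger, weaker = pool.popleft(), pool.popleft()
            if total_a <= total_b:      # A is behind or tied: A gets the stronger pick
                team_a.append(stronger)
                team_b.append(weaker)
            else:
                team_b.append(stronger)
                team_a.append(weaker)
        else:
            weakest, next_weakest = pool.pop(), pool.pop()
            if total_a >= total_b:      # A is ahead or tied: A gets the weakest pick
                team_a.append(weakest)
                team_b.append(next_weakest)
            else:
                team_b.append(weakest)
                team_a.append(next_weakest)
        take_best = not take_best

    return team_a, team_b


players = [10, 10, 5, 5, 5, 5, 4, 4, 2, 2, 2, 2, 2, 2,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
team_a, team_b = balance_teams(players)
print(sorted(team_a, reverse=True), sum(team_a))
print(sorted(team_b, reverse=True), sum(team_b))
```

Run on the 24-player list from the quoted example, this reading gives both teams a total of 35.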

And this isn't even the only algorithm available that will do this. But the short version reply to your concern is that it won't happen if PGI uses any standard player rating based algorithm for matchmaking.

#176 Xmith

    Member

  • The Ironclad
  • 1,101 posts
  • LocationUSA

Posted 06 December 2017 - 05:24 PM

View PostXavori, on 29 November 2017 - 09:48 AM, said:

I think I can safely say, with little disagreement, that our current matchmaker sucks. There are far too many 12-0 matches, and not enough 12-11 matches. It doesn't have to be this way.

Even with a smaller player population, it's entirely possible to create matches that are more likely to produce balanced gameplay. I've been in a number of long-running small dart leagues that had teams with players who had skill from 6-7 dart outs to lucky to hit the board with all 3 darts. But because teams had balanced ratings, the matches still were competitive.

MWO could have that. There are many options available for creating player ratings, and then using those ratings to assemble roughly equivalent teams. This would dramatically improve the quality of matches, and should make the game more enjoyable as well.

This is the kind of talk you would get from casual spectators of sporting events. They may have no idea what it may take to win whatever sporting event they may be viewing. They are clueless to the strategies involved and don't realize exactly what could happen if it fails. They think a close match is a better match. My competitive nature begs to differ.

Real competitors always want to stomp the opponent every chance they get. I like 12-0 matches. I have no problem beating down the red team.

I also have been on the losing end of a 12-0 stomp. Most of the time, I'm not bothered by my team losing 12-0. I realize it's part of the game. Getting pissed about it is unproductive. It's best not to repeat whatever mistake I most likely made that hurt the team's chances of winning the match. Any good player will do the same.

Now, I too have been to Jarl's. I checked a few players' stats from several recorded matches. Honestly, it's about right. Everyone, for the most part, is where they should be. It's probably true for the other tiers as well.

Edited by Xmith, 06 December 2017 - 05:31 PM.


#177 Khobai

    Member

  • Elite Founder
  • 23,969 posts

Posted 06 December 2017 - 05:33 PM

Elo is for 1v1 games or organized teams vs organized teams. It shouldn't be used for individuals randomly getting thrown into group games. Elo is silly and pointless for quickplay.

if you have a player list where player skill follows a standard bell curve

for example: 10, 9, 9, 8, 8, 7, 7, 7, 6, 6, 6, 6, 5, 5, 5, 5, 4, 4, 4, 3, 3, 2, 2, 1

and you randomly distribute those players between both teams

over thousands of games the skill level of both teams will average out to be roughly equal

matchmaker doesn't need to actively match skill levels as long as you draw from a large enough population of players that you get a good sampling of all the different skill levels.
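
A quick way to check this is to simulate it. The sketch below is Python, and the helper name random_split_gaps and the fixed seed are assumptions for illustration. It randomly splits the example list many times and reports both the average signed gap between team totals and the average absolute gap. The first shows whether things even out over many games; the second shows how lopsided a single match typically is.

```python
import random
from statistics import mean

PLAYERS = [10, 9, 9, 8, 8, 7, 7, 7, 6, 6, 6, 6,
           5, 5, 5, 5, 4, 4, 4, 3, 3, 2, 2, 1]


def random_split_gaps(players, trials=10000, seed=0):
    """Return (Team A total - Team B total) for many random 12v12 splits."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    half = len(players) // 2
    gaps = []
    for _ in range(trials):
        shuffled = players[:]
        rng.shuffle(shuffled)
        gaps.append(sum(shuffled[:half]) - sum(shuffled[half:]))
    return gaps


gaps = random_split_gaps(PLAYERS)
print("average signed gap:  ", mean(gaps))
print("average absolute gap:", mean(abs(g) for g in gaps))
```

The signed average should sit near zero, which is the "averages out" part. The absolute average tends to be well above zero, and that is the number that matters for any one match.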


what matchmaker DOES need to match more than anything is the tonnage and quality of mechs, because that doesn't follow any sort of predictable bell curve. I mean some mechs will always be more popular than others but the loadouts of mechs are so varied that it's basically random what mechs people are going to be playing.

matchmaker needs some kind of rudimentary battle value system to determine how good or bad a mech is. And it needs to try and even out the quality of mechs on both teams.

Edited by Khobai, 06 December 2017 - 05:42 PM.


#178 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 06 December 2017 - 06:35 PM

View PostKhobai, on 06 December 2017 - 05:33 PM, said:

Elo is for 1v1 games or organized teams vs organized teams. It shouldn't be used for individuals randomly getting thrown into group games. Elo is silly and pointless for quickplay.

if you have a player list where player skill follows a standard bell curve

for example: 10, 9, 9, 8, 8, 7, 7, 7, 6, 6, 6, 6, 5, 5, 5, 5, 4, 4, 4, 3, 3, 2, 2, 1

and you randomly distribute those players between both teams

over thousands of games the skill level of both teams will average out to be roughly equal

matchmaker doesn't need to actively match skill levels as long as you draw from a large enough population of players that you get a good sampling of all the different skill levels.


what matchmaker DOES need to match more than anything is the tonnage and quality of mechs, because that doesn't follow any sort of predictable bell curve. I mean some mechs will always be more popular than others but the loadouts of mechs are so varied that it's basically random what mechs people are going to be playing.

matchmaker needs some kind of rudimentary battle value system to determine how good or bad a mech is. And it needs to try and even out the quality of mechs on both teams.


If there were only 24 players you might be right. However, with 40k players you're constantly mixed with different teams and as such your value comes to the surface.

Again, go look at the leaderboard. Is everyone right at 1.0? Do scores randomly swing from 10.0 to 0.01? No? That's because it pretty accurately reflects your skill relative to others and creates a consistent distribution.

You're 8.333% of your team every match and the value of that can be solved for.
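
A rough way to see the idea is a toy simulation. Everything here is an assumption for illustration: the 200-player pool, the logistic win model and its scale, and the helper name simulate_win_rates. None of it is PGI's matchmaker or leaderboard math. Players are thrown into random 12v12 teams, the stronger team is more likely to win, and each player's win rate is tallied over many matches.

```python
import math
import random
from statistics import mean


def simulate_win_rates(skills, matches=5000, scale=10.0, seed=1):
    """Return each player's win rate after many randomly mixed 12v12 matches."""
    rng = random.Random(seed)
    ids = list(range(len(skills)))
    wins = [0] * len(skills)
    played = [0] * len(skills)

    for _ in range(matches):
        lobby = rng.sample(ids, 24)             # 24 random players per match
        team_a, team_b = lobby[:12], lobby[12:]
        diff = sum(skills[i] for i in team_a) - sum(skills[i] for i in team_b)
        # Assumed model: chance of Team A winning grows with the skill gap.
        p_a = 1.0 / (1.0 + math.exp(-diff / scale))
        a_wins = rng.random() < p_a
        for i in team_a:
            played[i] += 1
            wins[i] += 1 if a_wins else 0
        for i in team_b:
            played[i] += 1
            wins[i] += 0 if a_wins else 1

    return [w / p if p else 0.0 for w, p in zip(wins, played)]


# Hypothetical pool: 20 players at each skill level from 1 to 10.
skills = [s for s in range(1, 11) for _ in range(20)]
rates = simulate_win_rates(skills)
for level in (1, 5, 10):
    avg = mean(r for r, s in zip(rates, skills) if s == level)
    print(f"skill {level:2d}: average win rate {avg:.3f}")
```

Under these assumptions, one player is only a twelfth of the team, but over enough randomly mixed matches the higher-skill players should still end up with visibly higher win rates than the lower-skill ones.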

#179 vandalhooch

    Member

  • Ace Of Spades
  • 891 posts

Posted 06 December 2017 - 07:10 PM

View PostXavori, on 06 December 2017 - 10:51 AM, said:

vandalhooch,

I'm going to skip most of your stuff (might get back to it later) and just point out the problem in your matchmaker example.


I seriously doubt you'll ever get back to any of it.

Quote

An Elo-based matchmaker would never make those two teams. I explained this.

If you take 10,10,5,5,5,5,4,4,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1 (your player list) and put them into teams using standard rating algorithms, you get:

Team A: 10, 5, 5, 4, 2, 2, 2, 1, 1, 1, 1, 1
Team B: 10, 5, 5, 4, 2, 2, 2, 1, 1, 1, 1, 1


Mech classes. Oh, and you clearly didn't read what the TrueSkill description actually said.

Quote

In other words, your example would have produced a perfectly balanced match (which makes it a terrible example because that's not going to happen very often)


And here you go again, assuming everything is ideal. MECH CLASSES!

Quote

The algorithm takes the best player and puts them on Team A, then takes the worst player and also puts them on Team A.
Team A: 10, 1
Then it takes the second best player and second worst player and puts them on Team B
Team B: 10, 1
Next it compares the two teams to see who gets next pick. Since it's 11 to 11, it defaults to Team A getting the 3rd best player and Team B getting the 4th.
Team A: 10, 5, 1
Team B: 10, 5, 1
Again, it compares the scores and gives the worst remaining player to the team with the higher score...and you can see how this is going to work out now, right? Just alternating between best and worst available by which team needs to boost or reduce score is straightforward, fast, and produces the best quality matches you can hope for.


MECH CLASSES

Quote

And this isn't even the only algorithm available that will do this. But the short version reply to your concern is that it won't happen if PGI uses any standard player rating based algorithm for matchmaking.


MECH CLASSES

#180 Xavori

    Member

  • The God of Death
  • 792 posts

Posted 06 December 2017 - 08:53 PM

View Postvandalhooch, on 06 December 2017 - 07:10 PM, said:


I seriously doubt you'll ever get back to any of it.


You're right. I won't. You're not worth the effort because you're being intentionally obtuse. I explained your example, so now you're screaming MECH CLASSES!

They work the same way.




