
Elo Worthless


298 replies to this topic

#161 SgtKinCaiD

    Member

  • Overlord
  • 1,096 posts
  • Location: Bordeaux

Posted 15 November 2013 - 05:31 AM

That's another flaw: Elo is based only on your win/loss record. I think it should be based on your in-game performance, i.e. your match score: if your score is higher than your team's average score, you gain some Elo, and vice versa, regardless of win/loss.

Currently, a player doing nothing in a match can still gain some Elo thanks to his team, and conversely a player with lots of kills/assists/damage can still lose some Elo because of his team going full ******.

#162 -Muta-

    Member

  • The 1 Percent
  • 749 posts
  • Location: still remains a mistery.

Posted 15 November 2013 - 05:36 AM

Malino, on 11 November 2013 - 02:47 PM, said:

Hi,

Been playing a while and now we have more variety in mechs I'm seeing more and more games worthless because tonnage balance is so far out.

What's the point of ELO when you've got a team of half lights with the remainder mediums and heavies facing off against half a team of Assaults, then heavies and mids.

Regularly I'm seeing matches with 2-300 tonnes difference between the teams. This leads to the inevitable steamrollering going on lately.

ELO, nice idea for one on one matches. Sucks for 12 -v- 12.

Mal


From reading this post I can assure you that you have more posts than matches played.

Malino
  • Members
  • 7 posts

Edited by Mutaroc, 15 November 2013 - 05:37 AM.


#163 KinLuu

    Member

  • Ace Of Spades
  • 1,917 posts

Posted 15 November 2013 - 05:41 AM

Roland, on 15 November 2013 - 05:01 AM, said:

I don't think this is actually true. It shouldn't take hundreds of games for elo to account for your skill. It doesn't in other games. If elo is working, then your initial games should boost you up fairly significantly.

I suspect that in MWO it does take hundreds of games, but this is merely illustrating that the system does not in fact work correctly. The reason it is taking so many matches is because your individual contribution is being heavily hidden by the noise generated by a 12v12 match.


One never reaches his true elo - because the skill of a player is not constant, but constantly improving.

After a thousand games you should be near your true elo, though.

#164 Duncan Aravain

    Member

  • Legendary Founder
  • 416 posts
  • Location: Behind you with a sharp tool...er,mech

Posted 15 November 2013 - 05:41 AM

Come on, liking your own post? Really? What would Roadbeer say?

#165 Kunae

    Member

  • 4,303 posts

Posted 15 November 2013 - 06:02 AM

Duncan Aravain, on 15 November 2013 - 05:41 AM, said:

Come on, liking your own post? Really? What would Roadbeer say?

WWRBS?

Roadbeer, on 11 November 2013 - 01:26 PM, said:

Welcome to K Town!

have some Premade Cake?





#166 Roadbeer

    Member

  • Elite Founder
  • 8,160 posts
  • Location: Wazan, Zion Cluster

Posted 15 November 2013 - 07:32 AM

You guys... :P

#167 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 15 November 2013 - 03:50 PM

Roland, on 15 November 2013 - 05:01 AM, said:

I don't think this is actually true. It shouldn't take hundreds of games for elo to account for your skill. It doesn't in other games. If elo is working, then your initial games should boost you up fairly significantly.

I suspect that in MWO it does take hundreds of games, but this is merely illustrating that the system does not in fact work correctly. The reason it is taking so many matches is because your individual contribution is being heavily hidden by the noise generated by a 12v12 match.


It works, but again, remember - you gain points for wins and lose points for losses. This means that over a set of 10 games, if you win 7 and lose 3 and are gaining or losing 5 points per match, you've only gained 20 points. The higher your Elo score, the fewer people of comparable skill there are to play with/against, so you're likely to be playing opponents with a lower mean Elo than yours. Your wins then start gaining you 4 points instead of 5 and your losses cost you 6 instead of 5, so that same 10 matches, winning 7 and losing 3, would only net you 10 points instead of 20.
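A minimal sketch of that arithmetic, assuming the textbook Elo update `K * (score - expected)`. The K-factor of 10 and the helper names are assumptions, picked only so that an even match pays 5 points either way; PGI's actual values are not given in this thread.

```python
def expected_score(own: float, opponent: float) -> float:
    """Textbook Elo win expectancy for `own` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - own) / 400.0))


def net_points(wins: int, losses: int, expected: float, k: float = 10.0) -> float:
    """Net rating change over a run of games at a fixed win expectancy."""
    gain_per_win = k * (1.0 - expected)   # a win scores 1
    cost_per_loss = k * expected          # a loss scores 0
    return wins * gain_per_win - losses * cost_per_loss


# Even match-ups: +5 per win, -5 per loss.
even = net_points(7, 3, expected=0.5)

# Favoured at a 60% expectancy: +4 per win, -6 per loss.
favoured = net_points(7, 3, expected=0.6)

print(f"even: {even:+.0f}, favoured: {favoured:+.0f}")
# A gap of about 70 rating points is enough to give a 60% expectancy.
print(f"{expected_score(1570, 1500):.2f}")
```

Under those assumptions a 7-3 run nets 20 points at even odds and 10 points once you are the favourite, which is why climbing slows down the higher you get.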

Your performance accounts for ~8.333% of your team's success or failure, so yeah, it takes more matches to sort you out than if you're on a 5-man team accounting for 20%. It's still possible, it just takes longer. Also, as Grits N Gravy pointed out earlier, the Elo population curve and k-score award exacerbate the problem. I'd vote in favor of a Gaussian distribution (essentially you make the middle ~87% of the population distribution more gradual when awarding or removing points; this makes the population across Elo bands a bit thicker and more even, save for the most extreme scores) and a reduced k-score (the modifier on points awarded for winning/losing) to fix the issue.
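To put a rough number on the 1-in-12 point, here is a sketch that assumes a team's strength is simply the mean of its members' ratings and uses the textbook Elo expectancy formula. Both are assumptions for illustration, as are the function names and the 200-point edge; none of this is PGI's published matchmaker logic.

```python
def expected_score(own: float, opponent: float) -> float:
    """Textbook Elo win expectancy for `own` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - own) / 400.0))


def team_win_expectancy(team_size: int, edge: float, base: float = 1500.0) -> float:
    """Win expectancy when one player is `edge` points above everyone else.

    Assumes team strength is the plain mean of its members' ratings.
    """
    team_mean = base + edge / team_size
    return expected_score(team_mean, base)


for size in (1, 5, 12):
    share = 1.0 / size
    p = team_win_expectancy(size, edge=200.0)
    print(f"{size:>2} per side: share {share:.1%}, win expectancy {p:.1%}")
```

With these assumptions, a player 200 points better than everyone else in the match moves a 12-man team only a couple of percentage points above a coin flip, so it takes many more results for that edge to show up in the rating.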

Population fixes a lot as well. If there were a million players online at any given time you'd see a far, far more balanced match. The idea above helps fudge this a bit with lower player populations and IMO is a solid idea. I agree that we want a narrower skill band for matchmaking - the question is how to do that and match weight and do it within 120 seconds with 24 people.

#168 Kunae

    Member

  • 4,303 posts

Posted 15 November 2013 - 05:38 PM

@MischiefSC

I respect the concept of what you're asking for, but I really don't feel like accumulating what you suggest.

I am not some little ADD kiddie who has one bad match, or a couple, and comes onto the forums to vomit my teen angst here. My concerns are based on the trend in matches that I see, and you can take that for what it is. I know what matches were like, in general, before August, and I know where they're at now, in general.

The trend is not promising.

#169 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 15 November 2013 - 07:35 PM

Kunae, on 15 November 2013 - 05:38 PM, said:

@MischiefSC

I respect the concept of what you're asking for, but I really don't feel like accumulating what you suggest.

I am not some little ADD kiddie who has one bad match, or a couple, and comes onto the forums to vomit my teen angst here. My concerns are based on the trend in matches that I see, and you can take that for what it is. I know what matches were like, in general, before August, and I know where they're at now, in general.

The trend is not promising.



I know that, which is why I recommended you accumulate that data. Roadbeer already does and can confirm it for you if you'd like.

Here's the killer issue though for trying to solve the problem -

Suppose MW:O only has 1000 people online right now wanting to get into a match. Average match length, for simplicity of math, we'll say is 8 minutes. So every time someone gets into a match they can't be tapped for a new one for 8 minutes. We do know the matchmaker works in two-minute intervals - it tries for 2 minutes to get people into a match, after that it just puts you in whatever is closest.

You're pugging, and you hit launch. Here is what happens:

Step 1 - Of that 1,000 people, out of every 10 minutes they are in a match for 8 of them. So in the two-minute interval to find you a match 80% of players, or 800 of them, are already in a match.

So it's you and 200 people to choose from to pack a 24 person match. That can't be too bad, right?

Step 2 - Suppose you're in the top 20% of players, so it tries to find people relatively close to you. That doesn't mean 20% and up! It means people within a certain range of you. We do know however that the higher the Elo, the fewer people are around. So statistically about 70% of players (at least according to the last Elo score map PGI released) are in the 40-60% range. It scales down at the lower end at the same rate it scales down at the higher end. That means that only about 15% of players are within 10% of your skill level.

So all of a sudden we're down to 20 total potential players for a 24-man match.

Step 3 - You're also not the only match drawing right now. With weight-matching considerations your 20 player clutch is now closer to 10. So at this point some command decisions have to be made. The players in the even higher Elo group than you have to match as well, does it throw them all to the wind or does it pull you guys in to help fill their matches? This is a problem with the steep curve as you gain Elo, there are fewer people above you to draw from so it's easier to populate matches by pulling people from below you.

Step 4 - Polishing weight matching. There are literally not enough players at your Elo to fill a match so you're parsed into higher or lower Elo games but kept at least as close as possible but with the best weight-matching possible.


Result? In the current model, tighter Elo matching means looser weight restrictions. Based on the player population hitting 'launch' in any 120 second interval, sometimes a tonnage or skill mismatched game is the best option possible.
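The funnel in steps 1 to 3 can be sketched as plain arithmetic. The function names are hypothetical, and the 50% weight-matching share is an assumption read off the "20 ... closer to 10" figure above; the other inputs are the ones quoted in the steps.

```python
def available_pool(online: int, match_minutes: float, window_minutes: float) -> float:
    """Players not already in a match during one matchmaking window."""
    cycle = match_minutes + window_minutes
    return online * window_minutes / cycle


def candidates(online, match_minutes, window_minutes, skill_share, weight_share):
    """Return (free players, players in your skill band, players left after weight matching)."""
    pool = available_pool(online, match_minutes, window_minutes)
    in_band = pool * skill_share
    usable = in_band * weight_share
    return pool, in_band, usable


# 1,000 online, 8-minute matches, 2-minute matchmaking window.
for skill_share in (0.15, 0.10):
    pool, in_band, usable = candidates(1000, 8, 2, skill_share, weight_share=0.5)
    print(f"skill share {skill_share:.0%}: pool {pool:.0f}, "
          f"in band {in_band:.0f}, after weight matching {usable:.0f}")
```

Taken literally, a 15% skill band leaves 30 candidates out of the 200 free players; the figure of 20 corresponds to a 10% band. Either way the pool is at or barely above the 24 needed for a single match before weight matching is even considered.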

The only thing that's changed since August is that you're statistically more likely to drop with and against skilled players than you were before. The higher your Elo, though, the more likely you are to end up getting pulled to offset a match above or below your personal Elo. This is exacerbated by mixed premade and pug Elo: the higher the Elo, the more likely a player premades more than they pug, so when they do pug their Elo score is further out of sync with their actual skill, and ergo so is the Elo for the whole match.

You probably drop in premades most of the time. This means your Elo is based largely on how you perform in a group, so when you pug your experience will be less well balanced. See, if you're in a 4-man it treats you like one person - suppose your averaged Elo comes out to 2000. So 4 people, estimated Elo of 2,000. The matchmaker can't fill 24 slots with people at ~2,000, so you drop in a match at Elo 1700. It pulls one 1400-Elo newbie to offset the 4 of you. Everyone else is about 1700. Probably feels like a pretty good game to you.

THEN YOU PUG. It sees your personal Elo at something like 1900. Again, not enough 1900s around to fill a balanced match, so it puts you in a 1700 game. HOWEVER. It again pulls 1 player with Elo 1500 to match you. Same thing happens with every other premade who pugs - when you're in a premade the 4 of you are offset with 1 sub-par player. When you pug, it's 1 for 1.

Does that make sense? So if you're pugging and so are 2 or 3 other players like you, you're going to end up with 3 1900 Elo players (who without coms and friends they trust play at a skill closer to 1700, still great but not the same meatgrinder caliber) and 3 Elo 1500 players who haven't had 100 total matches yet. A quarter of your team is sub-par and the 3 'ringers' the matchmaker put in to offset them are not on coms together and playing solo can't pull the same weight, so you feel like your whole team is full of incompetents. Half of them may be great players just lacking focus and direction, half of them are playing out of their league.
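One way to read the premade-versus-pug example is as two different averages. The sketch below assumes the matchmaker counts each group as a single entry when balancing toward the 1700 target; that is an interpretation of the description above, not confirmed behaviour, and the function names are made up for illustration.

```python
from statistics import mean


def entry_average(entries):
    """Average as described above: every group counts as one entry."""
    return mean(rating for rating, _size in entries)


def per_head_average(entries):
    """Average weighted by how many players each entry really contains."""
    total = sum(rating * size for rating, size in entries)
    heads = sum(size for _rating, size in entries)
    return total / heads


# (rating, number of players in the entry)
premade_case = [(2000, 4), (1400, 1)]   # 4-man offset by one newbie
pug_case = [(1900, 1), (1500, 1)]       # solo player offset by one newbie

for name, case in (("premade", premade_case), ("pug", pug_case)):
    print(f"{name}: per entry {entry_average(case)}, per head {per_head_average(case)}")
```

Both cases balance to 1700 per entry, but per head the premade case sits at 1880 because four strong players are paid for with one weak one. In the pug case every strong player brings a weak one along, which is the one-for-one effect described above.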

It's fixable but the solution isn't simple. It comes down to population. You can try slicing a pie differently but end of the day it can only feed so many people. What you really need is more pie. Splitting premade/pug Elo and switching to a gaussian distribution will help - I absolutely believe that - but more players and more experienced players is the only long term solution.

So I absolutely believe you Kunae. You are exactly the segment of the population getting the shaft right now. When you premade you're often pulled into matches higher or lower than your Elo, when you pug it swings even more violently - either you don't feel like you matter or you feel like half your team is brand new. The solution isn't removing matchmaking though, it's fixing it.

#170 Nightfire

    Member

  • Legendary Founder
  • 226 posts
  • Location: Australia

Posted 15 November 2013 - 07:57 PM

The current Elo system is designed to give you a personal ability rating based on team performance.
Despite anyone trying to rationalise how Elo really does work at a mathematical level (hint: I really do understand how it works) there is only one mathematical principle that ultimately governs when (or if) you get to your Elo point.

The Principle of Randomness.

Eventually, through no fault of your own, and regardless of your ability or agency, you will hit a streak of games and get a series of teammates that work in a similar manner to you, which will put you at a point where the current Elo can keep putting you with players of similar abilities. It is pure chance.

If you look at how this works, the reverse also applies. Once you are at this "sweet spot" you will at some point be dropped in a series of games that will kick you out of your "sweet spot" and welcome to the roller coaster again.

Using team performance to rank individual ability is asinine. It was mentioned before it was implemented. It was mentioned when it was implemented. I'll say it again now, long after it has been implemented. There is a serious discrepancy between the vision of how this system is supposed to function and how it functions in reality.

For all of you who claim Elo works just fine, save your barbs. I do not deny your reality, I am sure it works well for you. What I would ask is that you look beyond your experience and at least acknowledge that what others say, when they speak of how bad Elo is for them, could be true.

For those that point out that mixing playing with groups and pugs messes up your Elo rating and go on to mention strategies on how to avoid this effect, I say WTF man!? Seriously!? You think that those measures are ok? That moving between pug and groups is something the system should be designed to DISCOURAGE? We deserve better than this!

Instead of rationalising how the system works by creating other systems to make it work with behaviors it should not only allow but ENCOURAGE, how about condemning the system for its flaws and asking for something better? We should fix things such as:
  • Individual ranking based on team performance rather than individual performance
  • Further isolating team and pug play
  • Poor matchmaking (personally, I find Elo WORSE than the pure randomness that was before it)
  • Poor rewarding of extended personal excellence
Why not acknowledge the flaws that you know exist rather than defend a broken system as "better than it used to be"?

Edited by Nightfire, 15 November 2013 - 08:00 PM.


#171 Abivard

    Member

  • Shredder
  • 1,935 posts
  • Location: Free Rasalhague Republic

Posted 16 November 2013 - 12:55 AM

PGI is trying to pound a square peg into a round hole.

Elo is for individuals playing one on one games, Chess for example ( which it was created for by Arpad Elo).

While an Elo system can be adapted for team play, it does require that the Teams not change composition from one match to another to work as intended!

If you in turn want to use an Elo rating to randomly match players together, you also need a sufficient pool of players. At 24 players to a match MWO needs a very large player base indeed.

The system PGI is using is better off being scrapped. It seems as if simple weight matching would be a vast improvement over what we have now.

PGI has created a host of catch-22 situations for itself, Elo and MM being just one of them.

#172 Ghogiel

    Member

  • CS 2021 Gold Champ
  • 6,852 posts

Posted 16 November 2013 - 01:02 AM

Roland, on 14 November 2013 - 07:47 PM, said:

But something must be amiss, or else the elo system would pretty quickly raise you up to the same level, since it is supposed to represent your individual skill.

If elo worked well, then your alt account elo would be the same as your main, unless you are just intentionally taking it down for some reason.

Don't get me wrong, I consistently see the same folks in the game I play too, but you also tend to see folks who clearly have no idea what they are doing.

No it wouldn't.

On both my alts I still have a 200 point reduction due to cadet status.

On both accounts I still run trial mechs

I don't want them to be high elo either, that would miss the point of having them and I'd just have to create a new one anyway.

#173 Ghogiel

    Member

  • CS 2021 Gold Champ
  • 6,852 posts

Posted 16 November 2013 - 01:15 AM

Roland, on 15 November 2013 - 05:01 AM, said:

I don't think this is actually true. It shouldn't take hundreds of games for elo to account for your skill. It doesn't in other games. If elo is working, then your initial games should boost you up fairly significantly.

Check out the LoL Elo calculator to get a rough idea of where it should be, which is a lot of games needed.

If MWO isn't taking a lot of games to score a player, it's kind of borked.

#174 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 16 November 2013 - 01:29 AM

Nightfire, on 15 November 2013 - 07:57 PM, said:

The current Elo system is designed to give you a personal ability rating based on team performance.
Despite anyone trying to rationalise how Elo really does work at a mathematical level (hint: I really do understand how it works) there is only one mathematical principle that ultimately governs when (or if) you get to your Elo point.

The Principle of Randomness.

Eventually, through no fault of your own, and regardless of your ability or agency, you will hit a streak of games and get a series of teammates that work in a similar manner to you, which will put you at a point where the current Elo can keep putting you with players of similar abilities. It is pure chance.

If you look at how this works, the reverse also applies. Once you are at this "sweet spot" you will at some point be dropped in a series of games that will kick you out of your "sweet spot" and welcome to the roller coaster again.

Using team performance to rank individual ability is asinine. It was mentioned before it was implemented. It was mentioned when it was implemented. I'll say it again now, long after it has been implemented. There is a serious discrepancy between the vision of how this system is supposed to function and how it functions in reality.

For all of you who claim Elo works just fine, save your barbs. I do not deny your reality, I am sure it works well for you. What I would ask is that you look beyond your experience and at least acknowledge that what others say, when they speak of how bad Elo is for them, could be true.

For those that point out that mixing playing with groups and pugs messes up your Elo rating and go on to mention strategies on how to avoid this effect, I say WTF man!? Seriously!? You think that those measures are ok? That moving between pug and groups is something the system should be designed to DISCOURAGE? We deserve better than this!

Instead of rationalising how the system works by creating other systems to make it work with behaviors it should not only allow but ENCOURAGE, how about condemning the system for its flaws and asking for something better? We should fix things such as:
  • Individual ranking based on team performance rather than individual performance
  • Further isolating team and pug play
  • Poor matchmaking (personally, I find Elo WORSE than the pure randomness that was before it)
  • Poor rewarding of extended personal excellence
Why not acknowledge the flaws that you know exist rather than defend a broken system as "better than it used to be"?



So if your premise is correct, can you help me understand the difference between an actual pure random matchmaker and Elo then? By your premise probability theory is a scam or at least deeply flawed and as such not a viable mathematical tool for balancing out variables to solve for a single consistent data point.

So how then is Elo inferior to random matchmaking? Statistically there are more average/below average players than there are great players so in a randomly matched game you are more likely to end up with and against less skilled players.

Nobody is talking about how to avoid any particular effects, simply explaining why the current system (using a logistic model and mixed pug and premade Elo) can provide matchmaking that's balanced in the long run but provide wider swings in experience in the short run. The average of 2,000 and 1,000 is 1500 - if both scores are accurate then the result will be reasonably accurate as well just less consistently enjoyable to play. 1600 and 1400 also balance out to 1500 and provide a better experience in the game.
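The two pairings can be compared directly. This sketch uses the textbook Elo expectancy formula as a stand-in for how lopsided a head-to-head between the two players would be; treating it that way is an assumption for illustration, not something the matchmaker is known to compute.

```python
from statistics import mean, pstdev


def expected_score(own: float, opponent: float) -> float:
    """Textbook Elo win expectancy for `own` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - own) / 400.0))


wide = [2000, 1000]
narrow = [1600, 1400]

for name, pair in (("wide", wide), ("narrow", narrow)):
    stronger, weaker = pair
    print(f"{name}: mean {mean(pair)}, spread {pstdev(pair):.0f}, "
          f"stronger player's expectancy {expected_score(stronger, weaker):.1%}")
```

Both pairs average 1500, so the match as a whole balances, but the spread is five times larger in the first pairing and the stronger player is close to a guaranteed winner in any one-on-one. That is the "balanced in the long run, rougher in the short run" effect.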

What, exactly, did you enjoy better before Elo? Because in terms of game results the only difference from then to now is that experienced players are less likely to drop in matches with inexperienced players and while there are absolutely exceptions (the polite term is 'variance') the number of players in a match with wildly differing skill levels is way less than it used to be. The result being more challenge match after match regardless of skill level.

It really isn't possible anymore for someone to consistently win 80+% of their games and keep a KDR of 6, 8, even 12. It used to be. Stopping that is exactly what the matchmaker is supposed to do.

So if Elo is worse than random because your Elo score is... random.... please explain what the difference is and why. I'd like to know.

#175 Tahribator

    Member

  • Fire
  • 1,565 posts

Posted 16 November 2013 - 03:52 AM

If we had proper weight balancing half of the Elo complaints would be gone. I have no idea how hard it is to sort 24 players into equal tonnage teams during the search phase, but apparently it is too hard.

Actually, the matchmaker does a good job balancing lone-wolf tonnage, but it doesn't even attempt to balance the tonnage of premades. So you get one side with an even mech distribution and the other with one Atlas and one 733C premade lance totally face-smashing the other team.

#176 Nightfire

    Member

  • Legendary Founder
  • 226 posts
  • Location: Australia

Posted 16 November 2013 - 11:00 AM

MischiefSC, on 16 November 2013 - 01:29 AM, said:


So if your premise is correct, can you help me understand the difference between an actual pure random matchmaker and Elo then? By your premise probability theory is a scam or at least deeply flawed and as such not a viable mathematical tool for balancing out variables to solve for a single consistent data point.


I've read a few of your posts and you seem to have a grasp on some research methodologies and mathematics so I have to wonder if this simply escapes you or if you are being deliberately obtuse to cause argument and confrontation. You are equating randomness of being able to be given a team that matches your skill as opposed to pure randomness of every player and inferring that because one is not completely random that randomness plays no large part in it whatever. It is a false equivalence and a logical fallacy designed to silence rather than consider the argument as presented.

So no, probability theory isn't a scam but when you change a control variable based on its assumed partial influence (rather than one that displays a consistent, direct and proportional influence) on a correlating variable you have a flawed methodology. In fact your control variable (player skill indicated by Elo) can have 0 (or even a negative) impact on the variable you assume to be related (Match won).

That is, people who perform well (even exceptionally well) but whose team loses, or the opposite. I've seen people go AFK at the start of a match and the team win (absolutely no effect on outcome) and I have seen players single-handedly kill 6 of the enemy and still lose (no effect on outcome). I will also accept that these examples are single samples.

I know the argument; that in extended samples the player's skill should more often correlate to the influenced variable, the outcome of the game. That probability would dictate over time that the pattern of better play emerges.

Here is the sticking point, it doesn't! ... necessarily. An average individual player has no statistically significant influence on the outcome of a MWO match. The team as a collective whole performs against another team as a collective whole. There are far more variables that are not measured that will define if they can work together better than the other team. This is the Achilles heel of the whole Elo system applied to dynamically formed teams.

You are attempting to create one variable (and use this as a control variable) by the outcome of the interaction of a host of other unmeasured variables opposed by the accumulation of similar, unmeasured variables that change from match to match. That is, your control variable (player skill) is not directly linked to your measured variable (win/loss) which is influenced by your variable (player skill) and a host of other unmeasured variables.
If you proposed this as a research methodology I'd reject your proposal and send you back to the statisticians so you can work out where you went wrong!

Quote

So how then is Elo inferior to random matchmaking? Statistically there are more average/below average players than there are great players so in a randomly matched game you are more likely to end up with and against less skilled players.


How is it inferior? Let me start by making it clear I am not advocating a truly random system.

1) In a truly random system, at least each round is either good or bad. In the current system, you are waiting for not just one, but a string of fortuitous matches (wins or losses) that will put you close enough to your rating where you start playing with people enough to learn how they play. Then those other unmeasured variables can stabilise and you will probably start evening out since everyone in that pool gets rotated through both sides. Will skill (which Elo is supposed to measure) factor into this? Yeah, to a degree, but not to the degree it is intended to. You will end up in the "right" place by pure chance, not by any design.

2) Elo doesn't allow any achievement. This type of game, especially once community warfare kicks in, attracts the competitive crowd. These types of people like to win. You get better to win more and winning more is a sign of progress towards getting better. I'm not necessarily recommending open slather but even brackets allow improving win/loss. The brackets in turn keep newer/less skilled players away from those who are better; movement between brackets becomes an indicator of achievement. In this specific implementation Elo is hidden and as mentioned before, means nothing because it is tied to win/loss rather than directly to player performance.

I have another point but I will address it after this point as it leads in.

Quote

Nobody is talking about how to avoid any particular effects, simply explaining why the current system (using a logistic model and mixed pug and premade Elo) can provide matchmaking that's balanced in the long run but provide wider swings in experience in the short run. The average of 2,000 and 1,000 is 1500 - if both scores are accurate then the result will be reasonably accurate as well just less consistently enjoyable to play. 1600 and 1400 also balance out to 1500 and provide a better experience in the game.


Actually, some people are. I'm not sure if you just missed or chose to ignore those posts of people who have separate "Pub" accounts or Pub in weight classes different from those they Group in. Both of these are techniques designed to work around the fact that a group, being a larger part of a whole (and often more coordinated) has a larger, more direct influence on the outcome of the match. More of those unmeasured variables (such as cohesiveness) are controlled within a group and the effect on the outcome can be more easily observed. What are the obvious issues with this?

A) This metric has the obvious effect of discouraging movement between Pub and Group play. The evidence of this is in the statements of those informing others how to avoid or overcome those Elo alterations while doing both. The fact that it discourages people from doing this starts to create divisions. You are either a "useless pub" or a "pub stomping group", a needless division.

B) The fact that these differences exist is evidence that Elo doesn't measure what it should measure, Player Skill! Your control variable is not directly connected to your measured variable. This is such a fundamental failure of design that it raises the question of whether the implementers actually understood the problem.

Now to the next point of how Elo is worse:

3) Elo discourages aspects of play that should be encouraged! If you don't get this one, I'm not sure it's worth continuing the discussion.

Quote

What, exactly, did you enjoy better before Elo? Because in terms of game results the only difference from then to now is that experienced players are less likely to drop in matches with inexperienced players and while there are absolutely exceptions (the polite term is 'variance') the number of players in a match with wildly differing skill levels is way less than it used to be. The result being more challenge match after match regardless of skill level.


There are several points in there but I'll just cut to the chase, I'm going to bed shortly.
In the random system everyone knew what the deal was: it was random. I wasn't hobbled with a team that routinely did things I thought were insane, and even if I was, I could influence that by controlling up to 8 members of my team. I liked playing with all of my friends and that has been a huge loss to the community. The removal of the ability to casually play with up to a full team of players has hurt more than most will know. Teamspeak is a ghost town compared to what it used to be. For a team game, a lot of the teams seem to have quietly slipped away.

(As a side note, I'd really like to know how many of the people who were in closed beta still log in. I doubt we'll ever know but I really am curious.)

Now I know the immediate response to this will be something along the lines of "You just miss pug stomping" to which I will simply point out, that is projection of your own pain. Deal with it elsewhere because it's not helpful and I'm not your scapegoat.

I'll admit we needed some means of regulating play, I've always been in favour of doing so but this wasn't it. Elo wasn't it. Removing larger groups wasn't it. I'll also state right now that separate Pub and Group queues won't be helpful either.
Some things that will? In game communication will. Better and more intuitive in game interfaces will. Why do I have to look where I want my artillery strike? Why can't I open up the Battlemap and click where I want it? Why is there not a dedicated commander role? Why is that not integrated in with the command module? Why can I not easily indicate targets for priority attack? etc.

Quote

It really isn't possible anymore for someone to consistently win 80+% of their games and keep a KDR of 6, 8, even 12. It used to be. Stopping that is exactly what the matchmaker is supposed to do.


I know what it was supposed to do. Removing those indicators of achievement, forcing everyone to appear mediocre and putting in a system designed to do these things that really doesn't do so consistently is incredibly bad design in my opinion. You speak of no one having an 80+% win rate as if having any significant positive win rate is bad. Competitive players like to win, if they are not winning they will figure out how to be better. The answer is not to eliminate positive win rates (the reason being to eliminate negative win rates) but rather to control the deviation.

The trend in competitive play is to provide more statistics, not less. Finally, tying your control variable to measures that do not directly reflect what you are attempting to measure is why Elo provides erratic and inconsistent results.

Quote

So if Elo is worse than random because your Elo score is... random.... please explain what the difference is and why. I'd like to know.


Again, the false equivalence! We both know the Elo score isn't random; it is simply irrelevant to what it is supposed to measure. What is random is finding an Elo point that provides you with enjoyable play. The movement towards this point is random because the influence any individual has on any match is not significant. The only reason that after "enough" games a player will move into an Elo that matches their play style (not simply skill) is that, given an infinite series, one will eventually find a pattern that matches any given criteria.

Do we need some mechanism for managing matchmaking? Certainly! It is my opinion that Elo is not it. Whatever we have it should be tied directly to player metrics, not to metrics to which the player has no direct control or influence.

#177 MischiefSC

    Member

  • PipPipPipPipPipPipPipPipPipPipPipPip
  • The Benefactor
  • The Benefactor
  • 16,697 posts

Posted 16 November 2013 - 11:34 AM

View PostNightfire, on 16 November 2013 - 11:00 AM, said:

I've read a few of your posts and you seem to have a grasp on some research methodologies and mathematics so I have to wonder if this simply escapes you or if you are being deliberately obtuse to cause argument and confrontation. You are equating randomness of being able to be given a team that matches your skill as opposed to pure randomness of every player and inferring that because one is not completely random that randomness plays no large part in it whatever. It is a false equivalence and a logical fallacy designed to silence rather than consider the argument as presented.

So no, probability theory isn't a scam but when you change a control variable based on its assumed partial influence (rather than one that displays a consistent, direct and proportional influence) on a correlating variable you have a flawed methodology. In fact your control variable (player skill indicated by Elo) can have 0 (or even a negative) impact on the variable you assume to be related (Match won).

That is, there are people who perform (even exceptionally) well but whose team loses, or the opposite. I've seen people go AFK at the start of a match and the team win (absolutely no effect on outcome) and I have seen players single-handedly kill 6 of the enemy and still lose (no effect on outcome). I will also accept that these examples are single samples.

I know the argument; that in extended samples the player's skill should more often correlate to the influenced variable, the outcome of the game. That probability would dictate over time that the pattern of better play emerges.

Here is the sticking point: it doesn't! ... necessarily. An average individual player has no statistically significant influence on the outcome of an MWO match. The team as a collective whole performs against another team as a collective whole. There are far more variables that are not measured that will define if they can work together better than the other team. This is the Achilles' heel of the whole Elo system applied to dynamically formed teams.

You are attempting to derive one variable (and use it as a control variable) from the outcome of the interaction of a host of other unmeasured variables, opposed by an accumulation of similar unmeasured variables that change from match to match. That is, your control variable (player skill) is not directly linked to your measured variable (win/loss), which is influenced by your variable (player skill) and a host of other unmeasured variables.
If you proposed this as a research methodology I'd reject your proposal and send you back to the statisticians so you can work out where you went wrong!



How is it inferior? Let me start by making it clear I am not advocating a truly random system.

1) In a truly random system, at least each round is either good or bad. In the current system, you are waiting for not just one, but a string of fortuitous matches (wins or losses) that will put you close enough to your rating that you start playing with the same people often enough to learn how they play. Then those other unmeasured variables can stabilise and you will probably start evening out since everyone in that pool gets rotated through both sides. Will skill, which Elo is supposed to measure, factor into this? Yeah, to a degree, but not to the degree it is intended to. You will end up in the "right" place by pure chance, not by any design.

2) Elo doesn't allow any achievement. This type of game, especially once community warfare kicks in, attracts the competitive crowd. These types of people like to win. You get better to win more and winning more is a sign of progress towards getting better. I'm not necessarily recommending open slather but even brackets allow improving win/loss. The brackets in turn keep newer/less skilled players away from those who are better; movement between brackets becomes an indicator of achievement. In this specific implementation Elo is hidden and as mentioned before, means nothing because it is tied to win/loss rather than directly to player performance.

I have another point but I will address it after this point as it leads in.



Actually, some people are. I'm not sure if you just missed or chose to ignore those posts of people who have separate "Pub" accounts or Pub in weight classes different from those they Group in. Both of these are techniques designed to work around the fact that a group, being a larger part of a whole (and often more coordinated) has a larger, more direct influence on the outcome of the match. More of those unmeasured variables (such as cohesiveness) are controlled within a group and the effect on the outcome can be more easily observed. What are the obvious issues with this?

A) This metric has the obvious effect of discouraging movement between Pub and Group play. The evidence is in the statements of those informing others how to avoid or overcome those Elo alterations while doing both. The fact that it discourages people from doing this starts to create divisions. You are either a "useless pub" or a "pub stomping group", a needless division.

B) The fact that these differences exist is evidence that Elo doesn't measure what it should measure, Player Skill! Your control variable is not directly connected to your measured variable. This is such a fundamental failure of design that one has to ask whether the implementers actually understood the problem.

Now to the next point of how Elo is worse:

3) Elo discourages aspects of play that should be encouraged! If you don't get this one, I'm not sure it's worth continuing the discussion.



There are several points in there but I'll just cut to the chase, I'm going to bed shortly.
In the random system everyone knew what the deal was: it was random. I wasn't hobbled with a team that routinely did things I thought were insane, and even if I was, I could influence that by controlling up to 8 members of my team. I liked playing with all of my friends, and losing that has been a huge loss to the community. The removal of the ability to casually play with up to a full team of players has hurt more than most will know. Teamspeak is a ghost town compared to what it used to be. For a team game, a lot of the teams seem to have quietly slipped away.

(As a side note, I'd really like to know how many of the people who were in closed beta still log in. I doubt we'll ever know but I really am curious.)

Now I know the immediate response to this will be something along the lines of "You just miss pug stomping" to which I will simply point out, that is projection of your own pain. Deal with it elsewhere because it's not helpful and I'm not your scapegoat.

I'll admit we needed some means of regulating play, I've always been in favour of doing so but this wasn't it. Elo wasn't it. Removing larger groups wasn't it. I'll also state right now that separate Pub and Group queues won't be helpful either.
Some things that will? In game communication will. Better and more intuitive in game interfaces will. Why do I have to look where I want my artillery strike? Why can't I open up the Battlemap and click where I want it? Why is there not a dedicated commander role? Why is that not integrated in with the command module? Why can I not easily indicate targets for priority attack? etc.



I know what it was supposed to do. Removing those indicators of achievement, forcing everyone to appear mediocre and putting in a system designed to do these things that really doesn't do so consistently is incredibly bad design in my opinion. You speak of no one having an 80+% win rate as if having any significant positive win rate is bad. Competitive players like to win, if they are not winning they will figure out how to be better. The answer is not to eliminate positive win rates (the reason being to eliminate negative win rates) but rather to control the deviation.

The trend in competitive play is to provide more statistics, not less. Finally, tying your control variable to measures that do not directly reflect what you are attempting to measure is why Elo provides erratic and inconsistent results.



Again, the false equivalence! We both know the Elo score isn't random; it is simply irrelevant to what it is supposed to measure. What is random is finding an Elo point that provides you with enjoyable play. The movement towards this point is random because the influence any individual has on any match is not significant. The only reason that after "enough" games a player will move into an Elo that matches their play style (not simply skill) is that, given an infinite series, one will eventually find a pattern that matches any given criteria.

Do we need some mechanism for managing matchmaking? Certainly! It is my opinion that Elo is not it. Whatever we have it should be tied directly to player metrics, not to metrics to which the player has no direct control or influence.


I really appreciate you making a well thought out response.

The gist of probability theory is that all the random variables injected by your 11 teammates will, over time, balance out. The more variables, the more time it takes. More time meaning, more precisely, more matches. More matches = more data.

While you don't know the relative Elo of the players in a match, the matchmaker itself does. Thus it's able to use that data to award or remove points from a player based on their performance as an aggregate over numerous games.
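For readers who have not seen the arithmetic, here is a minimal sketch of a win/loss-only Elo update applied to a team match. It is an illustration, not MWO's actual matchmaker: the team-average rating, the identical adjustment for every teammate, the `K = 32` factor, the 400-point scale, and the names `expected_score` and `update_team` are all assumptions borrowed from the standard chess formulation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score (win probability) of side A against side B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_team(team, opponents, won, k=32.0):
    """Return new ratings for every member of `team` after one match.

    Assumptions (not confirmed for MWO): a team is rated by the mean of its
    members' ratings, and every member receives the same adjustment.
    Only the win/loss result is used; kills and damage never enter.
    """
    team_avg = sum(team) / len(team)
    opp_avg = sum(opponents) / len(opponents)
    expected = expected_score(team_avg, opp_avg)
    actual = 1.0 if won else 0.0
    delta = k * (actual - expected)
    return [rating + delta for rating in team]


if __name__ == "__main__":
    # Hypothetical ratings: an underdog team beating a stronger one.
    underdogs = [1400.0] * 12
    favourites = [1600.0] * 12
    print(update_team(underdogs, favourites, won=True)[0])
```

In this formulation the size of the adjustment depends only on how surprising the result was given the two team averages, so beating a higher-rated team moves a rating more than beating an equal one.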

The point that you come back to and that others have come back to is that 'a player's influence on a match is insignificant'. This is absolutely and patently false. You're ~8.333% of your team's performance. In a single match the pendulum swing of probability can wash a good game out right along with a bad game. The difference is that 1 game in 12 where your specific performance was the deciding factor. Or, more to the point, over those 12 games your overall influence on the probability of winning/losing the match. The criteria by which the matchmaker pulls other people into your matches is the exact same criteria by which it does so for everyone else. Waves rise, waves fall, ocean stays the same. You're swimming in the same ocean as everyone else. In the short term it may feel like you're all swimming and going nowhere but over time those who swim harder and faster absolutely will end up further ahead while those who don't will lag further behind.

You are exactly correct that a completely average player really isn't going to affect the win/loss rate of his team. That is in fact exactly the point - their Elo, their win/loss, will remain neutral. Put more specifically, you will either win more games than you lose or, conversely, lose more games than you win, until you're routinely matched with players with and against whom you'll have a 50/50 win/loss rate. For the broad majority this is exactly what does happen. For a small percentage at the top they'll settle in at a point where they win more than they lose and stay static, at the bottom there will be a group who loses more than they win and stay static. There's a narrow band right under the 'win more' band that will actually lose more and stay static and a tiny band right above the 'lose more' that wins more and stays static.

The point of splitting pug and group Elo is for the exact reasons you've stated - you play differently and perform differently when playing in an organized group than when you pug. Thus your corresponding performance is different, which is why the Elo scores for them need to be separated. To make Elo more accurate.

As to allowing larger player groups - grouping provides a significant advantage. I'm all for it but ideally groups should play against groups. I'm all for bringing back 8mans though. Easier to find than 12. We had the matchmaking you're talking about before and most people hated it. Search 'premade' on the forums from before August. People hated it more than they hated 3PV. For a reason and a good one. That won't be making a return, nor should it.

So let me ask you this - do you win more matches than you lose? Yes or no. Do you win or lose more matches now than you did before Elo came in? I know that I, Roadbeer and several other people have been tracking that statistically and for us at least we can absolutely see the impact that Elo has had in the aggregate. If you are a statistical aberration then show it. Track some games and show how you're consistently losing more than you're winning while still getting tons of kills and damage every match. Not cherry picked but 20, 50, 100 consecutive games.

As to basing score on something other than win/loss -

Can't happen. Every single other metric is easily skewed. Make it match score? Fine, I'll boat LRMs or LBX and SRMs and while I'll win fewer games I'll do a higher average amount of damage and component destruction. KDR? No problem, PPC/AC sniping and kill stealing will push me up the charts. Win/loss is less precise a metric to measure personal performance but it's the only trustworthy one. Over enough games your impact on the performance of your team can be measured. More matches, more data, more precision.
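To put a rough number on "enough games", here is a back-of-the-envelope sketch. It assumes each match is an independent coin flip with a fixed win probability and uses the normal approximation to the binomial; the function name `matches_needed` and the default 1.96 threshold (about 95% confidence) are illustrative choices, not anything taken from the game.

```python
import math


def matches_needed(true_win_rate: float, z: float = 1.96) -> int:
    """Matches needed before a win rate is distinguishable from 50%.

    Uses the normal approximation: the standard error of an observed win
    rate after n matches is sqrt(p * (1 - p) / n), so we need
    n >= z^2 * p * (1 - p) / (p - 0.5)^2.
    """
    edge = true_win_rate - 0.5
    if edge == 0:
        raise ValueError("a 50% win rate can never be told apart from 50%")
    variance = true_win_rate * (1.0 - true_win_rate)
    return math.ceil(z * z * variance / (edge * edge))


if __name__ == "__main__":
    # Hypothetical long-run win rates for players of increasing influence.
    for rate in (0.52, 0.55, 0.60):
        print(rate, matches_needed(rate))
```

Under these assumptions a player whose presence lifts the team to a 55% win rate needs roughly 380 matches before that edge stands out from noise, while a 60% player needs fewer than 100. The smaller the individual influence, the longer the record has to be.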

I'm all for making Elo public as well. The bracket system you're talking about would require tracking people's win/loss - you're literally just talking about a clumsier version of Elo. So make Elo public, hell have the end of round screen show the Elo impact of the match and give relative Elo scores next to each player in the game.

Elo is tied to player performance and metrics - win/loss. Which you can and do impact. I absolutely get the desire to have something specific that you can play towards. Damage, score, kills. Something you can plan and build towards. The problem is that it only rewards a specific behavior as I mentioned before and is absolutely going to get abused and at the end of the day...

It's winning that matters. How much do you do, every single game, to secure the win. That's what Elo measures. It does exactly what probability theory and statistical modeling says it should do and it does it as accurately as is possible given the criteria and population density.

#178 Roadbeer

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Elite Founder
  • 8,160 posts
  • LocationWazan, Zion Cluster

Posted 16 November 2013 - 11:45 AM

All I have left to add to this discussion.

suc·cinct: adj

1. marked by brevity and clarity; concise
2. compressed into a small area

#179 Sug

    Member

  • PipPipPipPipPipPipPipPipPip
  • The People's Hero
  • The People
  • 4,629 posts
  • LocationChicago

Posted 16 November 2013 - 11:50 AM

View PostMischiefSC, on 16 November 2013 - 11:34 AM, said:

How much do you do, every single game, to secure the win. That's what Elo measures.


I came late to the conversation. Where do you explain how anything but a win or a loss matters to Elo? So my 4 kills and 600 damage actually count for something now in a game I lost? News to me.

#180 Aluminumfoiled

    Member

  • PipPipPipPipPip
  • 189 posts
  • LocationErehwon

Posted 16 November 2013 - 11:54 AM

View PostRoadbeer, on 11 November 2013 - 06:55 PM, said:

...
Armor: 1.22 (The armor benchmark is the average damage done / total armor on the mech)
Firepower: 1.13 (The Firepower benchmark is the average damage done / the firepower of the mech x10)
...

Very interesting use of stats. Going to do mine total and per mech. Might lead in neat directions. Thanks for that.
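For anyone else wanting to try this per mech, a minimal sketch of the two benchmarks as quoted. The Firepower wording is ambiguous, so this reads it as average damage divided by (firepower x 10), which is an assumption; the function names and the sample numbers are hypothetical and are not Roadbeer's data.

```python
def armor_benchmark(avg_damage: float, total_armor: float) -> float:
    """Average damage done per match divided by the mech's total armor."""
    return avg_damage / total_armor


def firepower_benchmark(avg_damage: float, firepower: float) -> float:
    """Average damage done per match divided by (firepower x 10).

    Assumption: the quoted "firepower of the mech x10" is the denominator.
    """
    return avg_damage / (firepower * 10)


if __name__ == "__main__":
    # Hypothetical mech: 450 average damage, 360 total armor, 40 firepower.
    print(armor_benchmark(450, 360))
    print(firepower_benchmark(450, 40))
```

A value above 1 on either benchmark would mean the mech is dealing more damage than its armor total or its scaled firepower, which seems to be the intended reading.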




