
What Do We 'know' About Elo..


87 replies to this topic

#81 Dahnyol

    Member

  • Moderate Giver
  • 71 posts

Posted 25 September 2014 - 01:49 PM

Agelmar, on 25 September 2014 - 01:47 PM, said:



Some brave soul took screenshots from end of match screens from people for over a month this summer. Over 18,000 unique names from his data.

So slightly more than 4,000.


Damn, kudos to that guy!

But seriously, divide that by timezones? I mean that still is a small ass sample size, and that was taken in summer for the peak population...

#82 Asyres

    Member

  • 433 posts

Posted 25 September 2014 - 01:54 PM

Dahnyol, on 25 September 2014 - 01:49 PM, said:

Damn, kudos to that guy!

But seriously, divide that by timezones? I mean that still is a small ass sample size, and that was taken in summer for the peak population...


It's probably safe to assume that you shouldn't divide by timezones, since it's unlikely that a given player would play around the clock.

#83 EgoSlayer

    Member

  • Wrath
  • 1,909 posts
  • Location[REDACTED]

Posted 25 September 2014 - 09:23 PM

Jorgandr, on 25 September 2014 - 10:32 AM, said:


Perhaps we are talking about completely different things. "Personal" failure has nothing to do with anything as far as I'm concerned, and as relates to MM. In a match with 12 vs 12, a single mech (me) is not going to make or break the game, and the chances of myself being placed on a team of nothing but new players while the other team has nothing but vets is so small that it should not even be considered.

Anyway, my personal complaint with matchmaker is that it attempts to match things based on completely arbitrary values, and is not the least bit transparent to us, the players. It also makes no attempt to take variables into account which are very quantifiable (# of ECM on each team for instance).

In my opinion, there is no way that the Elo system is accurate given the above. And in my personal experience, the MM system seems to throw people at each other randomly at best, while trying to respect the 3/3/3/3 system. I see just as many (C)'s on my screen as I do people I've watched in 12v12 team games on Twitch/YouTube. Clearly we can't ALL be at the same skill level.

Dunning-Kruger is what Bilbo is talking about:
http://en.wikipedia....93Kruger_effect

And you're wrong that your one mech out of 12 doesn't make or break the game. In the simplest terms, you are 8.33% of the fighting force. Likewise, each person on the opposing team is 8.33% of the fighting force. The team with the most players who meet or exceed that percentage is the team that wins. And that's how Elo works in team games: if one player consistently contributes 15% while the rest of the team and the opponents contribute 8% each, that person will win more. Not just in one match, but over dozens or hundreds of matches, where the law of large numbers applies.
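To put a rough number on the law-of-large-numbers point, here is a minimal sketch. It assumes every match is an independent event with a fixed win chance, which is a simplification, and the 55% figure is a made-up illustration rather than MWO data. The function name is hypothetical.

```python
from math import comb

def prob_winning_record(p_win: float, n_matches: int) -> float:
    """Exact probability of winning more than half of n_matches,
    assuming independent matches with a fixed win chance p_win."""
    needed = n_matches // 2 + 1  # smallest win count that is a strict majority
    return sum(
        comb(n_matches, k) * p_win**k * (1 - p_win) ** (n_matches - k)
        for k in range(needed, n_matches + 1)
    )

# Illustrative only: a player with an assumed 55% per-match win chance.
for n in (10, 100, 1000):
    print(n, round(prob_winning_record(0.55, n), 3))
```

Under these assumptions a small edge is barely distinguishable from a coin flip over 10 matches, but it is very likely to show up as a winning record by 1000.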

So couple Dunning-Kruger with the long run of matches required for normalization, and a lot of people think Elo doesn't work. But it does; the math has been proven for decades. And it's been used for decades for rating things like FIFA, NFL, NCAA football and basketball teams, etc.
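For reference, the textbook Elo math looks roughly like this. It is a sketch of the standard formulas, not of MWO's matchmaker: the 400-point scale and K = 32 are the conventional chess values, and nothing in this thread says which values PGI uses.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for A against B (logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating: float, expected: float, actual: float, k: float = 32.0) -> float:
    """New rating after one result; actual is 1.0 for a win and 0.0 for a loss.
    k = 32 is an assumed, conventional value."""
    return rating + k * (actual - expected)

# An underdog win moves the rating more than a win between equals.
even = update(1500, expected_score(1500, 1500), 1.0)
upset = update(1500, expected_score(1500, 1900), 1.0)
print(even, upset)
```

The update is proportional to how surprising the result was, which is why beating the odds is worth more than winning a match you were expected to win.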

Edited by EgoSlayer, 25 September 2014 - 09:25 PM.


#84 Ghogiel

    Member

  • CS 2021 Gold Champ
  • 6,852 posts

Posted 25 September 2014 - 11:45 PM

Well, there is something to be said regarding diminishing returns for carrying harder in the PUG queue. The info is incomplete, though, because we don't know everyone who has maxed Elo. But those known to have maxed or nearly maxed Elo have all played in groups. A lot. I really wonder if a player who has never played in groups has gotten near max Elo, because getting to max seems pretty easy to me when it's not uncommon to finish a 5-6 hr session with 1-2 losses. Maybe even a clean sheet. I doubt anyone is doing that in PUGs.

Edited by Ghogiel, 25 September 2014 - 11:45 PM.


#85 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 26 September 2014 - 12:56 AM

Joseph Mallan, on 25 September 2014 - 09:42 AM, said:

Elo also cannot predict the impact Attrition will have on a Match.

Team A stays together, Team B has 2-3 Leroy's. They run into the enemy and you are now down 3... HOW do you recover? Your answer determines how good Elo is.


What's funny is that Elo works best for people like you, Joe.

Those 2-3 Leroys do that all the time, in which case Team B has a lower Elo than Team A - that means the Matchmaker knew those 2-3 idiots were sandbaggers and doesn't reduce Team B's Elo much for losing. It's like getting assigned 2-3 fuckups to your squad; you're either going to carry for them or you're going to just go into it expecting to win without them. Elo is like a competent CO; he knows who the fuckups are and he assigns them to squads who can carry them or, privately, realizes that he has to put them somewhere while he either gets them demoted to some place they can't do harm or, well, if it's the Army, gets them promoted into a position matching their incompetence :P.

So while he stuck them with your squad, he's not going to come down as hard on you guys when you're slow to get your tasks done, since at the end of the day it's like you're 3 men short. He's got to stick them somewhere, however.

Conversely, Elo is *designed* to constantly put you against challenges it thinks you're not going to win. Why? To give you a chance to win against the odds and prove it wrong. That is how your Elo improves, Joe. When it gives you the 3 Leroys and you still pull it out, you don't just get to pat yourself on the back; Elo strokes its beard and says "Huh. No ****. Good job boys, here's some more Elo for you. You now get to play with/against a better caliber of people."

Elo shines for people like you Joe. Yeah, sometimes it stacks the odds in your favor - you're the challenge for someone else to overcome. Sometimes it stacks them against you so it can see if you beat the odds. It's not about making things balanced though so much as maintaining a consistent level of challenge you're forced to overcome.

#86 Jorgandr

    Member

  • 93 posts

Posted 26 September 2014 - 03:39 AM

EgoSlayer, on 25 September 2014 - 09:23 PM, said:

Dunning-Kruger is what Bilbo is talking about:
http://en.wikipedia....93Kruger_effect

And you're wrong that your one mech out of 12 doesn't make or break the game. In the simplest terms, you are 8.33% of the fighting force. Likewise, each person on the opposing team is 8.33% of the fighting force. The team with the most players who meet or exceed that percentage is the team that wins. And that's how Elo works in team games: if one player consistently contributes 15% while the rest of the team and the opponents contribute 8% each, that person will win more. Not just in one match, but over dozens or hundreds of matches, where the law of large numbers applies.

So couple Dunning-Kruger with the long run of matches required for normalization, and a lot of people think Elo doesn't work. But it does; the math has been proven for decades. And it's been used for decades for rating things like FIFA, NFL, NCAA football and basketball teams, etc.


I am well aware of that. FIFA, NFL, NCAA, etc are all set teams that do not change from game to game. Players may be swapped now and then, but by and large the team as a whole stays the same. The variables in play are pretty much static. Works great in this situation.

You CANNOT compare this to pick-up games. You cannot use the same method you use for scoring static teams, to score individual players in pick-up games. If they want even matchups (ESPECIALLY FOR NEWER PLAYERS), then they MUST use a realistic matching system for PUGs. The number of matches before it starts making any sense at all for an INDIVIDUAL player (because that is all that matters here. There is no such thing as a team in a pug game), especially with a random assortment of players, is in the thousands, not 12. This is far too much for the solo queue with new players coming in and out all the time. (New players are the life-blood of this game. Spurn them and the game dies.)

I know I am personally 1400 matches in and still most of what I see is stomp after stomp (12-1/12-0 games) with all gametypes turned on. I keep track, this is not remembering only the outliers. My current average is 4 roflstomps to every 1 stomp (12-2/12-4) to every 1 close game (12-5/12-6). Truly close games of 12-9 and up are rare gems that I must cherish and remember, because lord knows when I'll see one again. My win/loss ratio tells me I'm on the stompy side more often than the stomped, so this is not "I lose too much because matchmaker". This is "This game is not fun if every game is stomp or be stomped". Stomping the other side is not fun. I'd rather play vs AI.

tldr: I'm 1400 matches in, and matchmaker still can't figure out what to do with me. I'd be willing to guess that new players don't last this long. The only reason I did is that I'm a die-hard BattleTech fan, but even I find myself playing less and less. In other words, Elo does squat for pug games. (Trying to sell digital mechs for the same price as an ENTIRE GAME while adding no new game modes and only one new map a year doesn't help either, but that is neither here nor there.)

CAPS because everyone seems to be spouting the same "It works because static team" nonsense.

Edited by Jorgandr, 26 September 2014 - 04:20 AM.


#87 EgoSlayer

    Member

  • Wrath
  • 1,909 posts
  • Location[REDACTED]

Posted 26 September 2014 - 05:06 AM

Jorgandr, on 26 September 2014 - 03:39 AM, said:


I am well aware of that. FIFA, NFL, NCAA, etc are all set teams that do not change from game to game. Players may be swapped now and then, but by and large the team as a whole stays the same. The variables in play are pretty much static. Works great in this situation.

You CANNOT compare this to pick-up games. You cannot use the same method you use for scoring static teams, to score individual players in pick-up games. If they want even matchups (ESPECIALLY FOR NEWER PLAYERS), then they MUST use a realistic matching system for PUGs. The number of matches before it starts making any sense at all for an INDIVIDUAL player (because that is all that matters here. There is no such thing as a team in a pug game), especially with a random assortment of players, is in the thousands, not 12. This is far too much for the solo queue with new players coming in and out all the time. (New players are the life-blood of this game. Spurn them and the game dies.)


You don't understand the difference here. The pro sports teams are rated with a team Elo. The players don't have individual Elo. In MWO, the players each have their own Elo, which is used to calculate the team's Elo. And pro sports teams *DO* change lineups, for example through injuries during the game. Don't you think the starting QB going down or losing the starting RB affects the team's Elo? Same principle: a "star" MWO player will lift the team up. It just takes a large data set of matches for the effects to be statistically relevant.
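A minimal sketch of that idea, assuming the team's Elo is a plain average of its members' ratings. The thread only says individual Elo "is used to calculate" the team value, so the averaging rule, the function names, and the ratings below are all illustrative assumptions.

```python
from statistics import mean

def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for side A against side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def team_rating(player_ratings: list[int]) -> float:
    """Assumed aggregation rule: the team's rating is the mean of its players' ratings."""
    return mean(player_ratings)

def team_win_chance(team_a: list[int], team_b: list[int]) -> float:
    """Expected score of team_a against team_b under the averaging assumption."""
    return expected_score(team_rating(team_a), team_rating(team_b))

# Hypothetical rosters: one strong player among eleven average ones.
with_star = [1900] + [1300] * 11
all_average = [1300] * 12
print(team_rating(with_star), team_win_chance(with_star, all_average))
```

With these made-up numbers the single 1900 player lifts the team average from 1300 to 1350, which is worth roughly a 57% expected score against the all-1300 team. That is a modest per-match edge, which is why it only becomes visible over many matches.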

Jorgandr, on 26 September 2014 - 03:39 AM, said:

I know I am personally 1400 matches in and still most of what I see is stomp after stomp (12-1/12-0 games) with all gametypes turned on. I keep track, this is not remembering only the outliers. My current average is 4 roflstomps to every 1 stomp (12-2/12-4) to every 1 close game (12-5/12-6). Truly close games of 12-9 and up are rare gems that I must cherish and remember, because lord knows when I'll see one again. My win/loss ratio tells me I'm on the stompy side more often than the stomped, so this is not "I lose too much because matchmaker". This is "This game is not fun if every game is stomp or be stomped". Stomping the other side is not fun. I'd rather play vs AI.

tldr: I'm 1400 matches in, and matchmaker still can't figure out what to do with me. I'd be willing to guess that new players don't last this long. The only reason I did is that I'm a die-hard BattleTech fan, but even I find myself playing less and less. In other words, Elo does squat for pug games. (Trying to sell digital mechs for the same price as an ENTIRE GAME while adding no new game modes and only one new map a year doesn't help either, but that is neither here nor there.)

CAPS because everyone seems to be spouting the same "It works because static team" nonsense.


And here you are just completely wrong. Stomps have *NOTHING* to do with Elo and *EVERYTHING* to do with limited units and no respawn. CAPS BECAUSE ALL THE PEOPLE WHO THINK STOMPS ARE A BAD MATCHMAKER DON'T UNDERSTAND THE REAL CAUSE (combat loss grouping). Once a team is down by a significant percentage of its force relative to the opponent, it is going to lose in a landslide unless one of two things happens: 1) the team with the advantage is unable to press that advantage and allows itself to be picked off one at a time, or 2) the team at a disadvantage is able to strategically create the first situation, or create a situation where it rapidly evens the odds, e.g. a well-executed flanking maneuver that lets it take out several enemies from behind.
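The snowball effect described above resembles Lanchester's square law from combat modelling. The post doesn't name that model, so treat this as a borrowed illustration rather than anything MWO-specific. The sketch assumes every mech is identical and that all survivors on both sides fire continuously until one side is wiped out.

```python
from math import sqrt

def survivors_square_law(larger: int, smaller: int) -> float:
    """Lanchester's square law for equal-strength units: when the smaller
    force is wiped out, the larger force has sqrt(larger^2 - smaller^2) left."""
    if smaller > larger:
        raise ValueError("pass the larger force first")
    return sqrt(larger**2 - smaller**2)

# How an early numbers deficit translates into the final score.
for enemy in (12, 11, 10, 9):
    left = survivors_square_law(12, enemy)
    print(f"12 vs {enemy}: about {left:.1f} mechs left on the larger side")
```

Under these assumptions, a team that drops to 9 against 12 gets wiped while killing only about 4, roughly a 12-4 final score, even though every individual mech is equally capable.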

But it's pointless to continue this discussion, because you have already made up your mind and are ignoring facts that disagree with your position. Like the idea that everyone has the same Elo. Elo ratings follow a bell curve, and MWO has one just like every other Elo system:
Posted Image
http://mwomercs.com/...19#entry2265319

Elo is working, just not how you want/expect it to work.

But hey, once CW is up and running, Elo goes out the window as an MM criterion, so you'll get your wish.

Edited by EgoSlayer, 26 September 2014 - 05:07 AM.


#88 Jorgandr

    Member

  • 93 posts

Posted 26 September 2014 - 10:17 AM

EgoSlayer, on 26 September 2014 - 05:06 AM, said:

...

Elo is working, just not how you want/expect it to work.

...


All those pretty graphs and numbers sure are nice... But that means diddly if we have zero knowledge as to how their matchmaker system places people into matches. They say "it tries to even the scores for each team." Fine. How does it do this? Does it place one Ubermensch on one side and fill it the rest of the way with newbies and call it even? In a situation like that, the scores may be even, and it will certainly look amazing on paper, but the match itself will feel incredibly lopsided and, in the end, be very un-fun for everyone.
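The "even on paper, lopsided in play" scenario is easy to show with numbers. This is a minimal sketch with invented ratings; it assumes the matchmaker compares team averages only, which the thread does not confirm.

```python
from statistics import mean, pstdev

def team_summary(ratings: list[int]) -> tuple[float, float]:
    """Return (average rating, population standard deviation) for a team."""
    return mean(ratings), pstdev(ratings)

# Hypothetical teams with identical averages but very different make-up.
one_ace_plus_newbies = [2400] + [1200] * 11
all_similar = [1300] * 12

for name, team in (("ace + newbies", one_ace_plus_newbies), ("all similar", all_similar)):
    avg, spread = team_summary(team)
    print(f"{name}: average {avg}, spread {spread:.0f}")
```

Both teams average 1300, so a matcher that looks only at the mean would call this even, while the spread shows how different the two rosters are. A matcher that also constrained the spread, or counts of things like ECM and trial mechs per side, would be one way to address the complaint.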

And how many new players do you think are going to slog through that huge number of matches before MM can put them in matches with similarly skilled players? That was the major part of my point.

Their system is working? Then why is it putting 4 trial mechs on one side and none on the other? Because their system does not take important things like that into account. It just assigns them a number in the mid-range (which essentially ensures that you will see them no matter how high/low your score is) and calls it quits. There is no way a person who just finished installing the game is playing at that level. A smart system would at least attempt to place them evenly on both sides. Would be better to just give anyone still in the cadet stage an Elo of 0 and focus purely on placing an even number.

Why do you still see games with 4 ECM on one side and none on the other? Again, they have nothing in place to track such variables. Things like this should not be 100% luck of the draw.

I mean come on, they have a system that can even out drop-weight, but not either of the above?

My main issue, which I have perhaps not explained clearly enough, is that their Elo system is the ONLY thing they are using (as far as they have told us) to even out matches. This is not nearly adequate for a solo queue. Most stomps ARE due to bad matchmaking. Whether it is due to bad Elo or not is irrelevant. The fact is their system does not account for several very important and easily quantifiable variables.

Yes the total randomness of your matches will taper off after a while, perhaps an extremely long while, but not many new players are going to force themselves to play until that happens.

Edited by Jorgandr, 27 September 2014 - 05:45 AM.





