Elo Based On Win/loss (Or Anything Based On Win/loss) Is Silly


167 replies to this topic

#61 Diego Angelus

    Member

  • The Warden
  • 471 posts

Posted 09 November 2013 - 06:23 PM

Elo will never work in pugs, especially when teams are this big. Elo can kind of work in Dota or LoL because you have a bigger impact on a match, but here not so much.

#62 Wispsy

    Member

  • Talon
  • 2,007 posts

Posted 09 November 2013 - 06:24 PM

Victor Morson, on 09 November 2013 - 06:18 PM, said:

This still seems to discount the "If you are good enough, you entirely dictate if your team wins" statements that started this quite a bit.

I think everyone here can agree ELO doesn't work at all right now, and again, this is the reason I continue to directly attribute. Win/loss in a team heavy game is simply not a stat worth tracking, outside of team-only modes.


I still have bad games in which I die, or get outplayed, or get unlucky, or cannot manage to squeeze out enough awesome to carry the team I have. I am only human. However, I have a lot fewer of those than most people due to the effort I put in. I am not saying you are going to win every game, but if you do not believe you can win every game, you are going to lose a lot more when you just give up on your team instead of really trying to use it.

#63 Adiuvo

    Member

  • The 1 Percent
  • 2,078 posts

Posted 09 November 2013 - 06:25 PM

Victor Morson, on 09 November 2013 - 06:18 PM, said:

I think everyone here can agree ELO doesn't work at all right now, and again, this is the reason I continue to directly attribute. Win/loss in a team heavy game is simply not a stat worth tracking, outside of team-only modes.

Assuming that the proper population is on, the fact that I drop with the same people over and over and over again indicates that it does something right...

#64 Navy Sixes

    Member

  • Bad Company
  • 1,018 posts
  • Location: Heading west

Posted 09 November 2013 - 06:30 PM

In regard to rating myself based on win/loss:

In terms of dmg, I usually rank in the top half of my company (solo PUG exclusively), but only just (6th, 5th, or 4th). Aside from a few awesome outliers (and a few stinkers), this is consistent whether I win or lose. When I perform really well, there are usually 3-5 teammates who perform still better. Conversely, when I lose and have a bad game with low damage, there are still usually 6-8 other players who performed even worse.

All of this has led me to believe that, in my own case, ELO is working more or less correctly. My stellar performance contributes to the better players as well as the players who didn't do as well, and we win. My stinky performance drags the rest of the team down, and we lose. Each player on the team contributes to a win and each other's statistical performance at the same time.

There are, of course, times when I play really well and the enemy still wins. But there are also times when I play a terribad round and the rest of the team gets the win without me. These are not the norm, however.

I think my win/loss rate is probably a solid reflection of my performance in game, as solid as any other metric, anyway.

#65 Victor Morson

    Member

  • Elite Founder
  • 6,370 posts
  • Location: Ander's Moon

Posted 09 November 2013 - 06:38 PM

Adiuvo, on 09 November 2013 - 06:23 PM, said:

You're referencing a lot of things as based on luck when they're in reality skill based, and able to make an impact on matches.

How do you explain people who have high solo drop win rates? Is it just magical luck? If it was literally everybody would have a 1.0W/L.


While the average would be 1:1 W/L, that's not how odds actually work. There will be a number of people who have much higher or lower than that. And yes, I even know a few pilots who can turn the tide of a battle more frequently than most.

But as I've already said, even if you had a pilot who could do that with REMOTE consistency, you'd just be stacking the deck the other way around instead.

That said the majority with higher than 1:1 W/L are coming in as a team. I honestly find it really amusing people want to believe that one person can consistently make or break a game. Keep in mind these points:

1: MechWarrior is a team game and the tighter the team interaction, the more deadly they are
2: When dropping solo you have literally no control over your team and they are being brought from really random ELO levels
3: You stand a chance of running into two or more groups of people who do have some control over their team
4: You have no control over weight matching so in at least a quarter or more of your games you'll end up with a team with horrible weight disparities
5: You have no control over the terrain or team you are fighting against either
6: After all this is said and done, you account for 1 of 12 players, or about 8.3% of your team. This leaves about 91.7% of your team's ability entirely out of your control.

Bonus point related to 4: Half your team brings freaking Locusts. The other team brings Highlanders.

Sure if you're a crazy good sniper that kills things fast and early you'll help your team, again, and might even influence more wins than a typical pilot but ultimately you're going to hit a ton of losses entirely out of your control.

Now, say we dispense with all the Rambo fantasies and why they don't actually matter in this context, what do you think is going to happen to the typical pilot?

Scenario:
John Doe - new pilot - 50 drops. No idea what he's doing. His win/loss ratio is close to 1:1 thanks to his various teams, so surely he must be doing alright in his flamer-MG Dragon right? Clearly if he has a 1:1 against expert pilots, he's fit to match with expert pilots. ELO, great success!
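
A rough way to put a number on the John Doe scenario is to treat each drop as an independent win/loss trial and look at how wide the uncertainty on a 50-drop win rate is. This is only a sketch under that assumption (real drops are not independent), and the function name and the normal approximation are illustrative choices, not anything published by PGI.

```python
import math

def win_rate_interval(wins, games, z=1.96):
    """Approximate 95% interval for a true win rate (normal approximation)."""
    p = wins / games
    half_width = z * math.sqrt(p * (1 - p) / games)
    # Clamp to the valid probability range.
    return max(0.0, p - half_width), min(1.0, p + half_width)

low, high = win_rate_interval(25, 50)
print(f"50 drops at 1:1 -> true win rate roughly {low:.0%} to {high:.0%}")
```

Under these assumptions, a 1:1 record over 50 drops is compatible with a true win rate anywhere from the mid 30s to the mid 60s in percent, so it says very little about the pilot either way.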

Edited by Victor Morson, 09 November 2013 - 06:43 PM.


#66 Too Much Love

    Member

  • Ace Of Spades
  • 787 posts

Posted 09 November 2013 - 07:11 PM

Victor Morson, on 08 November 2013 - 03:16 PM, said:

Figuring ELO based on win/loss like that person had any chance to make or break the team (this is even more noticeable since 12 mans; in an 8 man you were 12.5% of the team so could make a reasonable dent) is just going to result in pure mud for numbers.

Since I found this out I've started to realize why ELO has degraded so badly and is so worthless right now, with elite/veteran pilots getting thrown in with newbies even if there are enough newbies on to sort them into their own games and vice versa.

So my point is.. just stop doing this. Start balancing it on damage done per drop. Points captured. Recon targets scouted. Savior kills & assists.. these things actually could help gauge a pilot. Basing it around if the team dies is just wrong.

Victor, I was just about to start exactly the same topic, but you got there first. I absolutely agree with your arguments.

In fact, it's obvious that the current ELO system simply can't provide normal matchmaking. You can see it not only in practice, but even from reading the devs' explanation of the ELO system in the developers' corner.

It's not hard to understand the main flaw of the ELO system: it's a vicious circle. First, the matchmaker chooses a team for you based on your presumed ELO. Then it calculates your ELO based on that team's performance. Then it once again chooses a team for you based on this ELO, and the story repeats. In other words, first it decides the outcome of the match, then it judges your skill level by that outcome. Isn't that ridiculous?

ELO works right only for individual games, like chess. It can work for teams, but only if the composition of the teams remains the same. When the team members are picked by chance every time, ELO is useless.
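
For reference, the loop described above can be written down with the textbook Elo formulas. This is a sketch of standard Elo only: how MWO actually aggregates ratings per team, and what K it uses, is not stated in this thread, so the plain team average and K = 32 below are assumptions.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo: estimated probability that side A beats side B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def updated_rating(rating, expected, actual, k=32):
    """Move a rating toward the result; actual is 1 for a win, 0 for a loss."""
    return rating + k * (actual - expected)

# Assumption: a team's strength is the plain average of its members' ratings.
my_team = [1500, 1400, 1600, 1500]
enemy_team = [1550, 1450, 1500, 1500]
team_avg = sum(my_team) / len(my_team)
enemy_avg = sum(enemy_team) / len(enemy_team)

expected = expected_score(team_avg, enemy_avg)
# Every member gets the same adjustment, whatever they did individually.
after_win = [updated_rating(r, expected, 1) for r in my_team]
print(expected, after_win)
```

In this sketch the matchmaker has built two teams with equal averages, so the expected score is 0.5 and each pilot's rating moves by the same amount on a win or loss, which is the circularity being objected to here.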

In the first years of the Soviet Union there was an experiment in schools: they abolished individual grading and rated each pupil's education level by averaging it with the performance of the whole school. Needless to say, the experiment failed and was abandoned very soon.

The ELO system in MWO is even more useless: at least pupils in the Soviet schools weren't swapped every 10 minutes.

We don't get jobs based on our neighbors' average salary, so why do we get an ELO calculated from the performance of 11 strangers in the previous match, picked by the ELO system?

#67 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 09 November 2013 - 07:13 PM

Victor Morson, on 09 November 2013 - 06:18 PM, said:


Except you just disproved your own point.

If a pug is statistically as likely to appear on Wispy's four man murdersquad, as against it, that means they are going to get a 1:1 w/l ratio of every game they play with Wispy if they lucked out and got a perfect 50/50 split going on and off his team.

Rendering ELO from wins moot. Because you just noted in this scenario it's pure luck if you end up on the team with the winning group.



Or completely incompetently programmed.



This still seems to discount the "If you are good enough, you entirely dictate if your team wins" statements that started this quite a bit.

I think everyone here can agree ELO doesn't work at all right now, and again, this is the reason I continue to directly attribute. Win/loss in a team heavy game is simply not a stat worth tracking, outside of team-only modes.


In short, no.

My reference to the solo player being as likely to drop with Wispy and the 'murdersquad' being as likely to drop against them points statistically to the fact that who you drop with is irrelevant statistically because you have the same odds as every other pug of doing so. That some people have higher or lower win/loss rates reflects their ability to drive a win or a loss, given that statistically everything else evens out. Suppose you drop with and against Wispy's team 1,000 times, 500 with and 500 against. Suppose that if you were not there and every other player was completely balanced, Wispy's murdersquad would win 650 of them, but you've got jumpsniping down to a science or are a virtuoso with the 2D2 streakhawk. Your performance would skew those results, giving a different number of wins/losses by dint of your skill, or lack of it. In the above instance you might push the murdersquad's wins down to 500, or even 400. You might literally help your whole team improve. Maybe you don't get a single kill but become incredibly adept at disrupting their tactics, giving your team better odds of winning. Does this make sense? This is the whole fundamental principle of odds and statistics. This is exactly WHY there are bookies and gambling and statisticians.
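
The thought experiment above is easy to simulate. The sketch below is a toy model, not MWO's matchmaker: every other pilot's skill is drawn at random each match, the team with the higher total skill is more likely (not certain) to win, and the only constant across matches is the tracked pilot. The skill scale, the logistic win model, and all names are assumptions made for illustration.

```python
import math
import random

def solo_win_rate(my_skill, games=20000, team_size=12, seed=1):
    """Win rate of one fixed-skill pilot dropped into random teams."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    wins = 0
    for _ in range(games):
        # 11 random teammates and 12 random opponents, skill ~ N(0, 1).
        mates = sum(rng.gauss(0.0, 1.0) for _ in range(team_size - 1))
        enemies = sum(rng.gauss(0.0, 1.0) for _ in range(team_size))
        diff = my_skill + mates - enemies
        # Logistic model: a bigger skill gap means better odds, never a certainty.
        p_win = 1.0 / (1.0 + math.exp(-diff))
        if rng.random() < p_win:
            wins += 1
    return wins / games

print("average pilot:", solo_win_rate(0.0))
print("strong pilot: ", solo_win_rate(2.0))
```

In this toy model an average pilot lands near 50% and a clearly above-average one lands well above it. The size of the gap depends entirely on the assumed skill spread; a noisier model would shrink it, which is the crux of the disagreement in this thread.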

The truest indicator of how good a player is at helping their team win, no matter how they do it, is how often they help their team win.

Once again Victor, I don't know how more simply I can put this:

The only consistent factor in the matches YOU play is YOU. Everything else is equal. Every other aspect of the game (your team, the other team, the map you get, the internet connection) is statistically random. A coin toss. You have the same odds of getting a 'good team' or a 'bad team' as anyone else does. Your skill at promoting a win or a loss influences, to some greater or lesser degree, the results of that coin toss. While irrelevant over 10 tosses, it is absolutely relevant after 100 tosses and easily identifiable after 500 tosses.
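
How many tosses it takes for an edge to become visible can be estimated with a standard back-of-the-envelope formula. The sketch assumes independent matches and uses the worst-case variance of a 50/50 coin; the function name and the 95% threshold are illustrative choices.

```python
import math

def games_to_detect(edge, z=1.96):
    """Games needed before a win rate of 0.5 + edge stands clear of 0.5.

    Uses n >= (z * 0.5 / edge) ** 2, the point where the z-sigma margin
    of error on the measured win rate shrinks to the edge itself.
    """
    return math.ceil((z * 0.5 / edge) ** 2)

for edge in (0.10, 0.05):
    print(f"{edge:.0%} edge: about {games_to_detect(edge)} games")
```

Under these assumptions a 10-point edge shows up in roughly a hundred games, while a 5-point edge needs a few hundred, and smaller edges need far more.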

Statistically if it was all random we'd all have 1.0 win/loss. We don't. That's because our skill with the mech/chassis influences those results. That is the best, most reliable indicator of relative skill between players. As I've gone over repeatedly in other posts all the other factors (damage, score, KDR, assists) can be gamed. Win/loss really can't. Best you can do is play 4man caprush which will simply result in you moving to an Elo band where that's all you can do since if you try to fight you'll get slaughtered. It's self-balancing, consistent and measurable.

I can understand the desire to have a more perceptible and relatable metric be how you gauge your relative success. Damage, score, KDR, whatever. These things don't always drive wins though. Winning drives wins. No matter how you do it. You can't 'fudge' a win.

Suppose you took Joe Namath in his prime and put him in a pool of 220 completely random football players who played 100 games of football. Each game splitting up and randomly assembling again into new teams. As one of the greatest QBs in history Broadway Joe would lead whatever group of 11 people he was with to victory more often than any other randomly selected QB. In 10 games luck might make that hard to see or make it seem like a bigger margin than it is. Over 100, even 500 games though you will see far more clearly how likely he is to lead his teams to a win compared to every other random QB in the pack.

Football is 11 people on each team. Without question and in no doubt you can still identify which players on each side are more or less skilled than others and you can without question and in no doubt see where their skill helped their teams win more games than other teams.

Elo works and works fine. We need more players and we need more games played to better settle people in their weight classes. We need to split premade from solo Elo to prevent skewing for 4mans performance when someone pugs.

It works though Victor. The concept is sound, the math is correct and accurate, it's used intelligently and reliably in other games in the same genre. World Football (soccer in the US) uses an Elo system. You want to believe that doing a lot of damage, getting kills, having a good match *deserves* to win. Statistically on average it does - in individual anecdotal examples though it doesn't, not for you or anyone. That's part of the metric though. It happens to everyone. It's statistically irrelevant.

What isn't statistically irrelevant is how often you help bring your team to victory. Everything else is completely equal and thus, statistically, irrelevant.

#68 Victor Morson

    Member

  • Elite Founder
  • 6,370 posts
  • Location: Ander's Moon

Posted 09 November 2013 - 07:20 PM

MischiefSC, on 09 November 2013 - 07:13 PM, said:

In short, no.

My reference to the solo player being as likely to drop with Wispy and the 'murdersquad' being as likely to drop against them points statistically to the fact that who you drop with is irrelevant statistically because you have the same odds as every other pug of doing so.


Thank you for entirely proving my point.

The typical pug has even odds of dropping with the murdersquad or against it. There you go, you just basically admitted that the system might as well be thrown dice if your ELO goes up or down, separate from the even MORE random factors like a lack of proper weight matching.

Edited by Victor Morson, 09 November 2013 - 07:21 PM.


#69 Victor Morson

    Member

  • Elite Founder
  • 6,370 posts
  • Location: Ander's Moon

Posted 09 November 2013 - 07:24 PM

MischiefSC, on 09 November 2013 - 07:13 PM, said:

It works though Victor. The concept is sound, the math is correct and accurate, it's used intelligently and reliably in other games in the same genre. World Football (soccer in the US) uses an Elo system. You want to believe that doing a lot of damage, getting kills, having a good match *deserves* to win. Statistically on average it does - in individual anecdotal examples though it doesn't, not for you or anyone. That's part of the metric though. It happens to everyone. It's statistically irrelevant.


Also, the concept is sound. The problem is not the concept of ELO. It's the way it interacts with MechWarrior. And no, the math here is not correct, accurate and least of all intelligent on dealing with it.

For example - ELO in StarCraft? Those are mostly 1v1 or up to 3v3 fights where you have TIGHT control over every single aspect. As a result, it's going to do a pretty good job of gauging your actions and your win loss ratio is everything - how you perform literally does not matter compared to if you win or lose in a game like this.

Likewise, other team games it's used in, it tracks individual performance - not team performance.

Notice that nowhere in this thread do I say "abolish ELO." I say "Make ELO track things within a player's control, and don't punish them for the dozens of random factors completely out of their control." That's a huge difference.

EDIT: Also if you are talking about team-wide ELO with World Football I'd like to remind you these are professional teams that play together all the time. Our closest link is 12 mans - and I'd like to remind you that I've said several times now that 12 mans totally should have a win/loss ELO. Amusingly, 12 mans currently have NO ELO, instead.

In fact let's expand on that example: Are you saying that ELO in a professional sports setting would still be valid if every single match each team was randomly assembled from different skill and physical capabilities, including a ten year old kid, 65 year old man, 4 people who play soccer "on the weekends sometimes", one guy who never played soccer in his life, and a few people who played on regular teams. Would it still be "on the shoulders" of the 3-4 serious players left in the match? Oh yeah, the other team happened to pull in a group of 4 from the best Soccer team in Europe. Would this be fair or good for ANYONE?

Edited by Victor Morson, 09 November 2013 - 07:35 PM.


#70 Too Much Love

    Member

  • Ace Of Spades
  • 787 posts

Posted 09 November 2013 - 07:35 PM

MischiefSC, on 09 November 2013 - 07:13 PM, said:

World Football (soccer in the US) uses an Elo system. You want to believe that doing a lot of damage, getting kills, having a good match *deserves* to win. Statistically on average it does - in individual anecdotal examples though it doesn't, not for you or anyone. That's part of the metric though. It happens to everyone. It's statistically irrelevant.

Can I quote myself?

drunkblackstar, on 09 November 2013 - 07:11 PM, said:

ELO works right only for individual games, like chess. It can work for teams, but only if the composition of the teams remains the same. When the team members are picked by chance every time, ELO is useless.

World Football League uses ELO because the teams are not PUGs! Can you imagine the team members of, let's say, Real or Manchester United being picked by chance every game? For example, Ronaldo being swapped for some schoolboy who has the same win/loss ratio as Ronaldo, but earned in his backyard? :)

It's funny how ppl try to compare PUGs to professional teams. The skill level of professional teams is stable, even more stable than the skill level of individuals. PUGs are random ppl matched on a simple average. Can you feel the difference?

Edited by drunkblackstar, 09 November 2013 - 07:37 PM.


#71 Deathlike

    Member

  • Littlest Helper
  • 29,240 posts
  • Location: #NOToTaterBalance #BadBalanceOverlordIsBad

Posted 09 November 2013 - 07:37 PM

Victor Morson, on 09 November 2013 - 07:20 PM, said:

Thank you for entirely proving my point.

The typical pug has even odds of dropping with the murdersquad or against it. There you go, you just basically admitted that the system might as well be thrown dice if your ELO goes up or down, separate from the even MORE random factors like a lack of proper weight matching.


There's some randomness alright, but I think you're being a little myopic on this.

Here are some concrete examples on why ELO does work (but not so much the MM):

There are some skill and knowledge based assumptions with ELO involved.

Here's a skill example:

There was a poll thread on where you prefer to shoot an enemy. There were multiple options... and I don't recall all of them, but one of them was CT.

In combat, if you see a mech that is vulnerable and has a fair chance of having an XL engine (the Jager comes to mind), a higher ELO player will gamble and take the solid risk of trying to side core the mech. There's a chance that the mech does have a Standard engine and thus will make you pay for it. It is a calculated risk after all, but a lower ELO player will very likely attempt to core the mech instead of sidecoring it.

Same concept goes with legging... some people don't armor their legs well, begging to be legged. Someone with higher ELO who notices it on their paperdoll will start going after that mech's legs, despite the potentially high firepower on the mech (like an Atlas for example). Lower ELO players will keep trying to core it for some stupid reason... making no attempt to potentially trigger an ammo explosion on it (not all mechs fall into this category, but many builds do).

The little things you do for winning matter... even if it doesn't directly translate into wins and losses. A high ELO player wishes to finish off a mech however dirty that solution may be so he can go onto the next target... whereas a lower ELO player doesn't care, and just goes CT and complains when they are legged. Those are differences that are noticeable in ELO brackets (even though we can't see them).

Here's the other side of the coin... those that complain about capping.

High ELO players tend to ask, "hey, are we sure we can't be capped?" They go and make sure that is not the case. I swear there's a very obvious issue with capping and I'm sure many people don't like the mechanic and would rather play w/o it. That's fine. The difference is that people who WANT TO WIN do everything in their power to make sure every possible outcome is in their favor. If it involves stopping the cappers and buying time so that reinforcements can come, then so be it. You probably won't live, and you probably won't get rewarded well (thanks PGI!). However, the little things can be enough to turn the tide. Sometimes a team has overcommitted and capping is the optimal solution. If the option to win is there, it's not an egregious thing to take it outright.

Obviously, you are not guaranteed to win just because you do cap defend or cap the enemy. The question becomes, do you make the right/correct decision more often than not to make the team win? That is what ELO tries to measure. It's not perfect and will never be perfect... but the details on what you do or don't do can and will be a factor in matches where you still matter.

Edited by Deathlike, 09 November 2013 - 07:42 PM.


#72 Too Much Love

    Member

  • Ace Of Spades
  • 787 posts

Posted 09 November 2013 - 07:51 PM

For those who believe that ELO works great: plz, go read the forum. The latest topics (I took them only from the first page of Game Balance):

1) DID I MENTION MATCHMAKING IS STILL FREQUENTLY TERRIBLE?

2) LOSING STREAK

3) I LOSE NEARLY EVERYGAME.. (FIX MATCHKMAKER PGI)

4) HOW DOES MATCHMAKER ACTUALLY WORK?

It seems that everything is OK, isn't it?

#73 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 09 November 2013 - 07:56 PM

Victor Morson, on 09 November 2013 - 07:20 PM, said:


Thank you for entirely proving my point.

The typical pug has even odds of dropping with the murdersquad or against it. There you go, you just basically admitted that the system might as well be thrown dice if your ELO goes up or down, separate from the even MORE random factors like a lack of proper weight matching.


So you didn't read any of the rest of it? Where I point out exactly what that means statistically? Instead of cherry-picking a sentence and ignoring the paragraph connected to it?

Victor Morson, on 09 November 2013 - 07:24 PM, said:


Also, the concept is sound. The problem is not the concept of ELO. It's the way it interacts with MechWarrior. And no, the math here is not correct, accurate and least of all intelligent on dealing with it.

For example - ELO in StarCraft? Those are mostly 1v1 or up to 3v3 fights where you have TIGHT control over every single aspect. As a result, it's going to do a pretty good job of gauging your actions and your win loss ratio is everything - how you perform literally does not matter compared to if you win or lose in a game like this.

Likewise, other team games it's used in, it tracks individual performance - not team performance.

Notice that nowhere in this thread do I say "abolish ELO." I say "Make ELO track things within a player's control, and don't punish them for the dozens of random factors completely out of their control." That's a huge difference.

EDIT: Also if you are talking about team-wide ELO with World Football I'd like to remind you these are professional teams that play together all the time. Our closest link is 12 mans - and I'd like to remind you that I've said several times now that 12 mans totally should have a win/loss ELO. Amusingly, 12 mans currently have NO ELO, instead.

In fact let's expand on that example: Are you saying that ELO in a professional sports setting would still be valid if every single match each team was randomly assembled from different skill and physical capabilities, including a ten year old kid, 65 year old man, 4 people who play soccer "on the weekends sometimes", one guy who never played soccer in his life, and a few people who played on regular teams. Would it still be "on the shoulders" of the 3-4 serious players left in the match? Oh yeah, the other team happened to pull in a group of 4 from the best Soccer team in Europe. Would this be fair or good for ANYONE?


I am absolutely saying that Elo in a professional sports setting would be absolutely 100% accurate in the conditions you listed. In fact it would be easier to identify as the impact of professional sports players on teams otherwise comprised of children, old people and neophytes would be more pronounced.

Elo does track what the player controls. In fact that's what it tracks most precisely and without precondition -

Elo tracks how likely and even how much a given player helps his team win. More than any other factor that's exactly what a player controls when they play the game.

drunkblackstar, on 09 November 2013 - 07:35 PM, said:

Can I quote myself?

World Football League uses ELO because the teams are not PUGs! Can you imagine the team members of, let's say, Real or Manchester United being picked by chance every game? For example, Ronaldo being swapped for some schoolboy who has the same win/loss ratio as Ronaldo, but earned in his backyard? :)

It's funny how ppl try to compare PUGs to professional teams. The skill level of professional teams is stable, even more stable than the skill level of individuals. PUGs are random ppl matched on a simple average. Can you feel the difference?


So let me ask you this - didn't you just prove my point? If you took the whole Manchester United team and scattered them among every grade-school team across the *planet*, wouldn't you be able to statistically look at the games those teams played and see that when Ronaldo played with some London middle school they won more often than when he didn't? In fact, if you had him play 100 games with that school's football team and compared those 100 games against their previous 100, or the 100 before that, couldn't you tell how much his being there helped them? If you then compared their 100 games against a rival and compared how often they won with Ronaldo on the team versus without, wouldn't that be pretty clear?

What you guys are absolutely missing is that Elo is *more* accurate when pugging, not *less*. That's why playing on a premade skews your results - coordination has a potentially exponential benefit to a player's performance, which is why 4mans win more often than pugs on average. It magnifies each individual player's ability based on how that 4man interacts. Are they on comms? Do they sync builds? Are they competitive? It gives that 4man a large set of variables that the pug players don't have. Thus they're playing with a slightly different set of variables.

When pugging, all your variables are pretty much the same as every other pug's variables. Even the premades: you're as likely to drop with as against premades, so their presence on your team is as likely as their presence on the other team.

Hence why I do think premade and pug Elo need to be separated. Elo for pugging, though, is completely accurate. It may need ~500 matches in the same weight class to really shake out, but it's about as accurate as anything can be for matching in this sort of environment. Any other metric is subject to manipulation.

So, to reiterate.

If you took the best football (soccer for us Yanks) players in the world and scattered them into random teams - in fact, if you then made those random teams out of people picked at random from society who didn't even play football - they absolutely would increase the odds of their teams winning by dint of their skill. Conversely, a team with an 85 year old cripple would play at a disadvantage and be less likely to win. If you scrambled those teams every single game, the teams with skilled players would be more likely to win and the teams with disabled or incapable people would be more likely to lose. You could, in fact, look at 500 matches Ronaldo played and see that his teams (random every match) won more often than those of the 85 year old cripple, whose teams would have won statistically less than average.

That's why Elo works for this. It's actually even more complicated, but this simple point, literally the fundamental mathematical basis for statistics and statistical sampling, the whole idea that STATISTICAL MATHEMATICS EXIST AND WORK, isn't accepted by some people, and that needs to be settled before you can get into things like k-factor.

drunkblackstar, on 09 November 2013 - 07:51 PM, said:

For those who believe that ELO works great: plz, go read the forum. The latest topics (I took them only from the first page of Game Balance):

1) DID I MENTION MATCHMAKING IS STILL FREQUENTLY TERRIBLE?

2) LOSING STREAK

3) I LOSE NEARLY EVERYGAME.. (FIX MATCHKMAKER PGI)

4) HOW DOES MATCHMAKER ACTUALLY WORK?

It seems that everything is OK, isn't it?


So... some people feel that they should win more than they do. Are you really trying to bring anecdotal confirmation bias opinion pieces in as evidence for saying why math doesn't actually work?

#74 Victor Morson

    Member

  • Elite Founder
  • 6,370 posts
  • Location: Ander's Moon

Posted 09 November 2013 - 09:20 PM

MischiefSC, on 09 November 2013 - 07:56 PM, said:

STATISTICAL MATHEMATICS EXIST AND WORK,


Not how you think they do.

Again I remind you that the idea of ELO continues to be sound. But in solo pug environments, it should track individual performance and not team performance. If team performance is considered at all, it should make up the extreme minority of the equation used. Premade environments are a different issue.

Anyway, your wall of text simply keeps going on about how it will average out in the end. But what you're saying is the frequency of hitting terrible/great teams, or terrible/great tonnage mismatches in your favor are roughly going to boil down to 50:50 in the end, meaning it proves literally nothing.

Tracking a player's match score = great idea. But the actual win or loss is just plain dumb and no walls of text will be able to get around the fact that you are statistically measuring a random team and not the players on it, in the end.
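
One way to picture the proposal above is to replace the plain win/loss result fed into a rating update with a blend dominated by individual performance. Everything here is hypothetical: the function name, the 10% team weight, and the use of a match-score percentile are illustrative assumptions, not a description of any existing system.

```python
def blended_result(won, score_percentile, team_weight=0.1):
    """Result in [0, 1] to feed a rating update instead of plain win/loss.

    won: True if the pilot's team won the match.
    score_percentile: pilot's match-score rank among all players, 0.0 to 1.0.
    team_weight: share of the result that comes from the team outcome.
    """
    if not 0.0 <= score_percentile <= 1.0:
        raise ValueError("score_percentile must be between 0 and 1")
    team_part = 1.0 if won else 0.0
    return team_weight * team_part + (1.0 - team_weight) * score_percentile

# A top scorer on a losing team comes out ahead of a passenger on a winning one.
carried = blended_result(won=True, score_percentile=0.2)
robbed = blended_result(won=False, score_percentile=0.9)
print(carried, robbed)
```

The objection raised elsewhere in this thread still applies to a scheme like this: any score-based input can be farmed in ways that do not help the team win.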

It is true. 50% of the time I have 4 locusts and a trial spider or some {Scrap} and 50% of the time I have a premade of 4 Highlanders. Frankly my performance in either scenario is going to have minimal impact and the muddy statistics march onward.

Of course I am far less worried about my own ELO standing than the fact that this system effectively lets people who SHOULDN'T be ranked highly get stuck in the flow of success, which is bad for them for a whole bunch of reasons, including making new players suffer bad stomps and more experienced ones get thrown off by the inexperience. But in the end these people have the same ELO as hardened vets over time because there is literally no control over it.

Final result? ELO might as well not exist at all as it stands right now because every single match includes just as many notable vets as people who aren't even out of their cadet bonus. I don't understand how you can argue how ELO, as it is done here, works just because the concept works.

Edited by Victor Morson, 09 November 2013 - 09:27 PM.


#75 Deathlike

    Member

  • Littlest Helper
  • 29,240 posts
  • Location#NOToTaterBalance #BadBalanceOverlordIsBad

Posted 09 November 2013 - 10:12 PM

View PostVictor Morson, on 09 November 2013 - 09:20 PM, said:


Not how you think they do.

Again I remind you that the idea of ELO continues to be sound. But in solo pug environments, it should track individual performance and not team performance. If team performance is considered at all, it should make up the extreme minority of the equation used. Premade environments are a different issue.

Anyway, your wall of text simply keeps going on about how it will average out in the end. But what you're saying is the frequency of hitting terrible/great teams, or terrible/great tonnage mismatches in your favor are roughly going to boil down to 50:50 in the end, meaning it proves literally nothing.

Tracking a player's match score = great idea. But the actual win or loss is just plain dumb and no walls of text will be able to get around the fact that you are statistically measuring a random team and not the players on it, in the end.

It is true. 50% of the time I have 4 locusts and a trial spider or some {Scrap} and 50% of the time I have a premade of 4 Highlanders. Frankly my performance in either scenario is going to have minimal impact and the muddy statistics march onward.

Of course I am far less worried about my own ELO standing than the fact that this system effectively lets people who SHOULDN'T be ranked highly get stuck in the flow of success, which is bad for them for a whole bunch of reasons, including making new players suffer bad stomps and more experienced ones get thrown off by the inexperience. But in the end these people have the same ELO as hardened vets over time because there is literally no control over it.

Final result? ELO might as well not exist at all as it stands right now because every single match includes just as many notable vets as people who aren't even out of their cadet bonus. I don't understand how you can argue how ELO, as it is done here, works just because the concept works.


Victor, you're mixing the problems with the MM with ELO. ELO is not the problem... everyone has been saying the MM has the problems and is the primary culprit.

One of the fatal problems with the MM is how PGI has designed it: it treats the ELO of all mechs in one weight class as the same. A Jenner, the ultimate light in the current state of the game, is considered to be in the same ELO bracket as a Locust, which is the worst/most inferior light mech in this game. Without having to explain why, that is one of many flawed assessments that the MM is trying to make.

Also, if you happen to have a high ELO, you are considered "very difficult to match". Once you hit the 2-minute mark of waiting (the MM waits 3 minutes max in total before failing), you are automatically shoved into any match because of "loosened constraints" such as tonnage. The team you are placed on can end up with the lowest tonnage ever while your opponent has the highest tonnage ever... and vice versa. Part of the problem stems from the "# of people in the queue": not having enough of them skews the MM's "tryhard" nature, and you get shoved into matches where you have little or no effect, in a roflstomp that can go either way.
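The timing described above can be written down as a tiny sketch. Only the 2-minute and 3-minute thresholds come from the post; the function name, the phase labels and the exact boundary handling are hypothetical, and the real matchmaker's rules are not known here.

```python
LOOSEN_AFTER_S = 120   # "2 minute mark" from the post
GIVE_UP_AFTER_S = 180  # "3 minutes max total before failure" from the post


def search_phase(wait_seconds):
    """Return a hypothetical label for what the matchmaker is doing.

    'strict'   - still trying to respect ELO and tonnage constraints
    'loosened' - constraints such as tonnage relaxed, any match will do
    'failed'   - search abandoned
    """
    if wait_seconds >= GIVE_UP_AFTER_S:
        return "failed"
    if wait_seconds >= LOOSEN_AFTER_S:
        return "loosened"
    return "strict"
```

Under this reading, anyone who is hard to match spends the last minute of the search in the "loosened" phase, which is where the lopsided tonnage drops would come from.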

ELO isn't exactly built to combat this problem. However, IF any of the stuff you're supposed to do to succeed is still built-into your playstyle, then success can still happen despite whatever handicaps you. There is no sure guarantee of winning... but there are sure guaranteed ways of losing when you do everything wrong. It's the nature of the beast.

The assumption in the original argument is that "MM is perfectly working" or "doing its best", but let's be honest here... we're handing it a completely borked/stacked deck that it doesn't truly know what to do with some of the time. ELO cannot correct the MM... the MM has to be a lot more intelligent instead of being "choosy" in all the wrong details. ELO itself cannot fix people who don't know how to "stand in the base" to cap it or "shoot the UAV" down so that the lurms can't blast you or your team. ELO is supposed to say "hey, I know how to consistently win, despite the odds". Even with the severely flawed state of the MM, it does its job...

Edited by Deathlike, 09 November 2013 - 10:14 PM.


#76 Victor Morson

    Member

  • Elite Founder
  • 6,370 posts
  • LocationAnder's Moon

Posted 09 November 2013 - 10:30 PM

If Matchmaker is to blame, Matchmaker at this point has rendered everyone's ELO score moot through months of bad matching anyway, though.

Edited by Victor Morson, 09 November 2013 - 10:30 PM.


#77 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 09 November 2013 - 10:33 PM

View PostVictor Morson, on 09 November 2013 - 09:20 PM, said:


Not how you think they do.

Again I remind you that the idea of ELO continues to be sound. But in solo pug environments, it should track individual performance and not team performance. If team performance is considered at all, it should make up the extreme minority of the equation used. Premade environments are a different issue.

Anyway, your wall of text simply keeps going on about how it will average out in the end. But what you're saying is the frequency of hitting terrible/great teams, or terrible/great tonnage mismatches in your favor are roughly going to boil down to 50:50 in the end, meaning it proves literally nothing.

Tracking a player's match score = great idea. But the actual win or loss is just plain dumb and no walls of text will be able to get around the fact that you are statistically measuring a random team and not the players on it, in the end.

It is true. 50% of the time I have 4 locusts and a trial spider or some {Scrap} and 50% of the time I have a premade of 4 Highlanders. Frankly my performance in either scenario is going to have minimal impact and the muddy statistics march onward.

Of course I am far less worried about my own ELO standing than the fact that this system effectively lets people who SHOULDN'T be ranked highly get stuck in the flow of success, which is bad for them for a whole bunch of reasons, including making new players suffer bad stomps and more experienced ones get thrown off by the inexperience. But in the end these people have the same ELO as hardened vets over time because there is literally no control over it.

Final result? ELO might as well not exist at all as it stands right now because every single match includes just as many notable vets as people who aren't even out of their cadet bonus. I don't understand how you can argue how ELO, as it is done here, works just because the concept works.


I've tried to express to you how you're incorrect in your assumptions about how Elo works in pugs, and I get that either you don't understand or don't want to understand the results of that. So I'm going to show you why you're incorrect. This isn't about what I *feel* about it. This isn't an opinion. This is a specific field of mathematical science and the associated fields of psychology and sociology.

First, these are the behaviors people express that make them think that their anecdotal perceptions are correct in the face of statistical fact.

Apophenia - comes in a lot of forms but essentially it's a byproduct of our own confirmation bias, skewed by negativity bias, that makes us believe that in spite of any evidence to the contrary, statistical or otherwise, our own perception of 'what must be right' is correct and anything that disagrees with it is wrong. It is exactly why marketing research exists - people's opinions are incredibly unreliable in determining actual facts of behavior. I absolutely believe that you believe that Elo *has to be wrong* because it doesn't reflect your memories of how often you win vs how often you lose and how that relates to your over-all record of wins and losses. The problem is that your perception of that is, because of how the human brain works, unreliable. That's why statistics are so important.

Statistics - this is why Elo is more viable for pugs than for premades. All pugs are equal statistically - a random sampling, slightly skewed for weight of mech in terms of balancing on a team. Individual premade teams can skew these results on a match-by-match basis, but the probability of that is the same for all pugs, so it's statistically neutral in aggregate. That means that over 500 games, any variation due to premades will wash between two different pugs - it will be the same for both. That is why win/loss is important, more so than any other statistic. Same thing with variability in weight matching - yes, there are disparities in weight matching. They are nominal, however, and again apply equally to every other player, and as such they wash when related to your actual win/loss performance.

Probability Theory - This is a big one. This is exactly why your Elo in random teams is relevant. More so than in organized teams. Your Elo in 12mans isn't actually as accurate as your Elo in pugging because in 12mans you're an integrated aspect of a larger equation. Now, suppose I took 100 matches in 12mans with the same team against another exact same team. It would certainly give me some specific data about you - the most precise of which would be how much you improve against that other team over those 100 matches. I could tell you how well you do in specific mechs against that team. Its validity, however, would all be contingent on you being in that same team and playing against that exact same other team.

Where probability theory becomes useful in relating to pug matches is that it shows us that, in the overall aggregate, because you're dropping in the same 'random soup' that every other pug is dropping in, it more effectively isolates your specific behavior from that of who you're dropping with. That's why mixing premade and pug Elo creates issues - premade results are skewed by a specific alternate set of data points that the other people in the match don't have, tied to how well you coordinate with the people in your premade. It's beneficial, but how much and in what direction, as well as how the Elo band you drop in is skewed by their presence, adds a layer of alternate data that pads the margin of error.
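One way to poke at the 'random soup' argument is a toy simulation: give every player a hidden skill, throw them into random 12v12 lobbies, update ratings from win/loss only, and then see how well the ratings line up with the hidden skill. Everything in the sketch below is an assumption made for illustration (random lobbies with no matchmaking, team strength as the average of its players, the textbook Elo curve, K = 32, 240 players); it is not how MWO computes anything.

```python
import random

TEAM_SIZE = 12  # assumed 12v12 drops
K = 32          # assumed k-factor


def expected_score(avg_a, avg_b):
    """Textbook Elo expectation applied to team averages."""
    return 1.0 / (1.0 + 10 ** ((avg_b - avg_a) / 400.0))


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_players=240, n_matches=10000, seed=0):
    """Return the correlation between hidden skill and win/loss-only rating."""
    rng = random.Random(seed)
    skill = [rng.gauss(1500, 200) for _ in range(n_players)]  # hidden truth
    rating = [1500.0] * n_players                             # what Elo sees

    for _ in range(n_matches):
        lobby = rng.sample(range(n_players), 2 * TEAM_SIZE)
        team_a, team_b = lobby[:TEAM_SIZE], lobby[TEAM_SIZE:]

        # The match result is decided by the hidden skills only.
        p_a = expected_score(sum(skill[i] for i in team_a) / TEAM_SIZE,
                             sum(skill[i] for i in team_b) / TEAM_SIZE)
        a_won = 1.0 if rng.random() < p_a else 0.0

        # The rating update only ever sees current ratings and the win/loss.
        e_a = expected_score(sum(rating[i] for i in team_a) / TEAM_SIZE,
                             sum(rating[i] for i in team_b) / TEAM_SIZE)
        for i in team_a:
            rating[i] += K * (a_won - e_a)
        for i in team_b:
            rating[i] += K * ((1.0 - a_won) - (1.0 - e_a))

    return pearson(skill, rating)
```

Calling simulate() gives one correlation for one set of assumptions. How high it gets depends heavily on team size, K and matches per player, so treat any single number as a property of the toy model and not of the live game.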

So, to sum up. This isn't a matter of debating opinion. It's not about perception or how either of us feels about it. There isn't a debate to be had over Elo working in pugs. Absolutely it works, if you play enough matches it'll accurately generate an appropriate score for you within a margin of error associated with how many matches you've played and a small adjustment for how much of that is premade time. The impact of premade play vs pug play is going to vary from player to player but will almost universally be nominal over a sufficient number of total combined matches.

Still, splitting pug and premade Elo will narrow the margin of error for calculating people's Elo. It's still going to be pretty accurate, however.
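The "margin of error associated with how many matches you've played" can be put in rough numbers with the usual normal approximation for a win rate. This is a back-of-the-envelope sketch, assuming independent matches and a 95% interval (z = 1.96); the function names are made up for illustration.

```python
import math


def win_rate_margin(games, win_rate=0.5, z=1.96):
    """Approximate +/- margin on an observed win rate after `games` matches."""
    return z * math.sqrt(win_rate * (1.0 - win_rate) / games)


def games_needed(edge, z=1.96):
    """Matches before a true win rate of 0.5 + edge stands out from 0.5."""
    return math.ceil((z * 0.5 / edge) ** 2)


margin_500 = win_rate_margin(500)   # about +/- 4.4 percentage points
needed_5pct = games_needed(0.05)    # 385 matches for a 5-point edge
```

So over 500 matches a win rate is only pinned down to within a few percentage points either way, and the smaller a player's real edge, the more matches it takes before win/loss alone can show it.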

#78 Deathlike

    Member

  • Littlest Helper
  • 29,240 posts
  • Location#NOToTaterBalance #BadBalanceOverlordIsBad

Posted 09 November 2013 - 10:54 PM

View PostVictor Morson, on 09 November 2013 - 10:30 PM, said:

If Matchmaker is to blame, Matchmaker at this point has rendered everyone's ELO score moot through months of bad matching anyway, though.


No. I would say that ELO may have been slightly undervalued or overvalued depending on what your primary gameplay experience is... premades or solo. I tend to believe solo tends to stunt your ELO (at some point, you'll reach a point where you can't really improve it), and teamplay tends to overinflate it (especially with premades that work together often).

Given how this game works, actual good teamplay through premades tends to benefit, but on the other hand once you go solo, you are left in this void where the lack of coordination with teammates that you are familiar with puts you at a disadvantage... unless you are well versed in communication through in game chat.

Edited by Deathlike, 09 November 2013 - 10:55 PM.


#79 WVAnonymous

    Member

  • Wrath
  • 1,691 posts
  • LocationEvery world has a South Bay. That's where I am.

Posted 09 November 2013 - 11:44 PM

View PostWispsy, on 08 November 2013 - 05:16 PM, said:


So...I must be really lucky to go 70 wins in a row without my performance affecting the outcome. Then to do it again...the gods just love me? :)


Yes, the gods love you. I have that in real life, but not here. Better lucky than good.

#80 Victor Morson

    Member

  • Elite Founder
  • 6,370 posts
  • LocationAnder's Moon

Posted 09 November 2013 - 11:44 PM

View PostMischiefSC, on 09 November 2013 - 10:33 PM, said:

This is a specific field of mathematical science and the associated fields of psychology and sociology.


Which both you and the people who are programming ELO have outright butchered through a fundamental misunderstanding of WHICH stats you are using to figure out your averages.

Calculating individual PLAYER worth based on TEAM outcomes is a flaw in the very basis of everything you are saying, which renders all the rest of it entirely moot. You cannot gather clean data about individual player performance from a completely random team, with every single other random factor (such as tonnages) further muddying the water.

You aren't getting good input. All the math and links about statistical analysis in the entire world can't change that fact.

Ultimately win/loss comes down more to how lucky you were in pulling an average solid team, or whether you brought your own teammates into the match. Period. Even if things even out over time to 1:1 via all the random elements, as you are suggesting, this just means that players in every game will have an artificially high win and loss rate based on their random factors. End result?

Newbie McNewbieton will likely win 50% of his games due to random factors. If he is really bad and causes some losses by himself somehow, that will only influence things a TINY amount. He'll still be lumped in with other players that had a similar team experience.

Again, bad input data, bad output data. You can't statistically analyze things clearly when there is this much out of ANYONE'S control at play.
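How "TINY" one player's influence is can be estimated under a simple model. The sketch below assumes a team's strength is the plain average of its players' ratings and that the textbook Elo curve turns a strength gap into a win chance; both are assumptions for illustration, not anything confirmed about the game, and the function name is made up.

```python
def solo_win_chance(skill_edge, team_size=12):
    """Win chance for a team that is average except for one player.

    skill_edge is how many rating points that one player sits above (or,
    if negative, below) everyone else in the lobby. Averaged over the team,
    the gap shrinks to skill_edge / team_size.
    """
    team_gap = skill_edge / team_size
    return 1.0 / (1.0 + 10 ** (-team_gap / 400.0))


strong_in_12v12 = solo_win_chance(200)               # about 0.524
strong_in_1v1 = solo_win_chance(200, team_size=1)    # about 0.760
weak_in_12v12 = solo_win_chance(-200)                # about 0.476
```

In this model a player 200 points better than the lobby wins roughly 52% of 12v12 drops instead of the 76% the same gap would give one-on-one. The signal is small per match but not zero, which is the crux of the disagreement: whether that small, consistent edge is enough for win/loss to sort players over the number of matches people actually play.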

PS: They often do grab statistics from skewed sources like this in real life..... when they want to push a specific viewpoint. They will purposely pull "muddy data" that they can skew to show a different picture. Statistical analysis is one of the very, very easiest things to get wrong... or manipulate. And the way ELO is done here is very muddy indeed.


View PostWVAnonymous, on 09 November 2013 - 11:44 PM, said:

Yes, the gods love you. I have that in real life, but not here. Better lucky than good.


The vast majority of the stat he listed (without any firm evidence of the 60-70 figure really) was done in 4-mans, a whole other ballgame.

Edited by Victor Morson, 09 November 2013 - 11:50 PM.





