
Why I Can No Longer Stand Scouting, And It Makes Me Sad

Metagame

139 replies to this topic

#81 Xavori

    Member

  • The God of Death
  • 792 posts

Posted 27 November 2017 - 07:24 AM

tker 669, on 27 November 2017 - 06:25 AM, said:


You are absolutely not providing anything that proves your point.

You also can't articulate anything that even sounds remotely reasonable as to why the stats are meaningless. You have only provided an opinion which seems hinged around your theory that you are much better at the game than the stats portray. I have seen zero evidence that your position is anything other than your ego, unable to accept the numbers you have.

Argue and keep telling yourself that you're really good if only there was some magic stats showing how much sensor range you have added to your team....

Like it or not, match score, damage output, win/loss are absolutely important stats because killing mechs is the best way to consistently do well.


This is totally not about me.

It's about the fact that PSR is garbage. The matchmaker is garbage. Therefore, the leaderboard stats are garbage.

If you've seen zero evidence, it's because you're intentionally ignoring everything. MWO is a team based game. When evaluating individuals in a team based game, you cannot use W/L as a metric because an individual is not the entire team. In every other competitive event, this is a given. That a few players in MWO keep trying to hammer W/L as some kind of nail doesn't change reality. Math is not magic. Stats based on garbage data are garbage stats.

What's more, if you actually understood stats, you'd actually understand why I keep rolling my eyes every time Misch tries to pretend his math has meaning. You cannot magically create a number range, call it "player skill", and then magically assign some meaning to it. Random numbers are random.

There are good players. There are bad players. You cannot identify them from the leaderboard. You cannot identify them from PSR. Therefore, the matchmaker creates randomly skilled teams, which continues the whole random nature of wins and losses.

All you can pull from W/L record is that a particular player plays on good teams or bad teams. They might be the reason a team is good. They might be the reason a team is bad. BUT YOU CANNOT TELL THAT FROM THE STATS WE HAVE AVAILABLE.

On top of that, I have a strong suspicion we can't really come up with stats that are meaningful in a game played for fun rather than for meaningful competition.

For example, the last day of the Steiner loyalty event my unit switched to loyalist Steiner. It was lucky timing as we'd already been planning to switch to something IS after being Clan for a few months. This meant we could double-dip in the event, so of course that's what I did.

The event is pretty shallow. It's all about damage. And since I had less than a day to get 15 matches with a 250 match score, I went damage farming. I bought a Bushwacker P1, loaded it with streaks, and ~20 pug matches later was done. Yay second ECM Atlas for my goofy Atlas drop deck.

My stats on the BSH-P1 - 31 matches - 18 wins - 13 losses - 1.38 W/L - 27 kills - 18 deaths - 1.5 K/D and ~400 damage per match. Those would be good. I won more than I lost and killed more than died. And 400 damage per match in scouting is rock solid. Again, we're talking about a mech I've only owned for a day.

Those stats are very misleading tho.

See, the last 1/3 or so of my matches were dropping with unit mates instead of the pug drops I did while everyone else was sleepin'. And since I was done with the event, I was shooting at mechs until they had lost most of their armor and then switching targets. I was leading all the charges to get into range. I was doing my best not to kill enemies unless I was the only one left alive. In other words, I completely sabotaged my own stats.

I'm not unique in doing things like that. I'm not special. There are lots of MWO pilots who are very good who don't give a $$#!!! about leaderboard stats, and our stats reflect that.

But here's where Misch could be right in his talk about lots of matches...

If we did a deep dive into the data. If we had access to more than just the garbage leaderboard stats. We could in fact create a PSR that created better matches. It wouldn't be win/loss because again, that's just a measure of how often a player is on a good team. And if we had a real PSR, then maybe we could get a better matchmaker. And ultimately, we could get better matches.

p.s. And in such a unicorn, rainbow-infested world, everyone in solo drops would have a W/L of ~1.

#82 mistlynx4life

    Member

  • 351 posts

Posted 27 November 2017 - 09:22 AM

Sorry OP.

In the context of not knowing anything about a match except that Player A is in it, the only reliable piece of data that can be used to determine if Player A is likely to achieve a team win is W/L ratio. K/D and such sound good, but you don't know that kills will definitely win the match, because a teammate might run around and cap all the points or something, faster, perhaps, than Player A can score a kill or even ever see the enemy. A W/L ratio greater than 1.0 says "When this player plays in a match, his team usually wins."

How he/they win/s is irrelevant. That's not within the context of that ratio to represent nor is it terribly helpful if you don't know anything else about the match beforehand. I can be the best NARC Raven pilot in the world, bring 2t of ammo and lots of speed. In a (QuickPlay) Skirmish where my teammates don't have LRMs/ATMs/Streaks, I've brought essentially wasted tonnage. My ability to make the build work is not the same as the likelihood that I will contribute to my team's victory. A NARC Raven without Missile Friends has sort of hampered his team's ability because his counterpart on the enemy's side probably didn't do the same thing. But maybe his counterpart is a Stealth Spider. NARCs could help negate that, right? So maybe it's not wasted tonnage - but you don't know until you start the match (and see the Spider) so you can't say that's your strategy and so can't make a judgement on the odds ahead of time based on anything (other than W/L), regardless of ability. That's just an example. If you can only pick one stat to best represent a player's ability to achieve a team win - whatever winning might mean in any given situation - you use W/L.

But of course I hasten to add that a lot of things about MWO are, indeed, gross and not helpful for much, lol.

#psrsayswhat

#83 Asym

    Member

  • Nova Captain
  • 2,186 posts

Posted 27 November 2017 - 09:29 AM

arcana75, on 27 November 2017 - 06:32 AM, said:

So with all the data available, what's the best gauge of a person's .... shall we say collective wisdom or cumulative skill, in MWO?

That players are still playing at > 100 matches a month ! Even those numbers are dwindling and I find myself only playing for events mostly......there is nothing else, is there? Any planets to fight over? Heck, as stated above entire teams switching sides and in so doing that, completely screwing up the entire event's statistics.... Alternate account players doing the same thing.... I'm not sure there is a Jade Falcon clan anymore as an entity? If so, name who is in charge of the Clan Falcon?

Gosh, it'd take a bazillion dollars just to map out and figure out what the "valid numbers" are in MWO; and that assumes you'd be able to figure out who was whom in alternative accounts and God knows what those 5 or 6 accounts really hold........

#84 Xiphias

    Member

  • Littlest Helper
  • 867 posts

Posted 27 November 2017 - 10:13 AM

arcana75, on 27 November 2017 - 06:32 AM, said:

So with all the data available, what's the best gauge of a person's .... shall we say collective wisdom or cumulative skill, in MWO?

Really depends on how you define skill. If we assume only solo play (teams tend to skew things a bit), then WLR is the best indicator of a player's ability to win matches. If you consider winning the most important thing, then WLR is the most important skill. One problem with this is that modes like scouting can be won by simply gathering intel without fighting. You will end up with a better WLR, but arguably have less "skill" than a player that focuses on kills.

After WLR it gets harder to establish what "skill" is. What determines a good pilot? Is it mechanical skills? Ability to work on a team? A player can be good at 1v1s, but be bad at positioning and therefore not good in 12v12s.

The two best indicators that we have available to us besides WLR are probably kills per match (my personal pick) and match score. Sure, both can be farmed to an extent, but high kills per match usually shows a solid contribution since most matches are won by kills.

Really the best gauge is to look at a person's aggregate stats. If a person has a high match score but low KDR, there's a good chance they are doing inefficient damage (LRMs, SSRMs, inaccurate lasers, etc.). If a person has a high KDR but low match score or WLR, they probably are hiding to stay alive and are farming KDR instead of contributing more. If a person has a high WLR and maybe KDR but low match score, there is a good chance they are dropping in a group a lot.

While individual stats can be farmed, you can't farm all your stats at the same time without being a decent player. Running away to preserve KDR will probably drop your match score. Look at a person's stats and see if they are consistently high across the board. From a stats standpoint that's probably the best you're going to get as a measure of "skill".
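The rules of thumb above can be written down as a small checker. This is only a sketch: the function name `read_stat_profile` and the percentile cutoffs (70 for "high", 40 for "low") are assumptions made up for illustration, not values from the leaderboard or from Tarogato's spreadsheet.

```python
def read_stat_profile(match_score_pct, kdr_pct, wlr_pct, high=70, low=40):
    """Apply the rules of thumb to leaderboard percentiles (0-100).

    The `high` and `low` cutoffs are hypothetical; tune them to taste.
    """
    notes = []
    # High match score but low KDR: lots of damage, few kills.
    if match_score_pct >= high and kdr_pct <= low:
        notes.append("likely inefficient damage")
    # High KDR but low match score or WLR: surviving without contributing.
    if kdr_pct >= high and (match_score_pct <= low or wlr_pct <= low):
        notes.append("possibly hiding to protect KDR")
    # High WLR but low match score: wins may come from the group.
    if wlr_pct >= high and match_score_pct <= low:
        notes.append("possibly dropping in a group a lot")
    # Everything high at once is hard to farm.
    if min(match_score_pct, kdr_pct, wlr_pct) >= high:
        notes.append("consistently high across the board")
    return notes


if __name__ == "__main__":
    print(read_stat_profile(match_score_pct=90, kdr_pct=20, wlr_pct=50))
    print(read_stat_profile(match_score_pct=80, kdr_pct=85, wlr_pct=90))
```

A profile can trigger more than one note, which matches the point that the stats are most useful when read together rather than one at a time.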

Tarogato has a good spreadsheet that breaks down where a player stands percentage wise based on their leaderboard stats. Just compare each of a player's stats to where they sit percentage wise and that gives a good approximation of their "skill". Keep in mind that it's most useful when comparing players within the same PSR rating since playing at a higher PSR will generally mean harder competition.

For example, looking at your stats I would guess that you recently got into T3 either this season or the end of the last season. Looking at match score you go from being a new player in season 14 ~90%, to a beginner player in season 15 and 16 ~25%. Your stats drop a bit in season 17 (closer to 50%) which I would attribute to moving up in tier and playing against harder players.

#85 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 27 November 2017 - 03:37 PM

Xavori, on 27 November 2017 - 04:18 AM, said:


No. Wins has no meaning. I'm always on great teams? A WINNER IS ME!

That's meaningless.

And no, I didn't deduce all other stats have meaning. I've kinda been pretty consistent in calling every leaderboard stat garbage. They don't give you any idea of whether or not a player is a good mech pilot or not. And I'm definitely including W/L in that.



No. My W/L after 100 matches is an utterly meaningless thing. It has zero relation to my skill as a pilot. Nada. Zilch. None.

You are absolutely the guy with a hammer who thinks the world is nothing but nails. I'm the guy with a zamboni who thinks the entire world is cartoon characters covered in chocolate sauce doing the macarena while singing out tech manuals as lyrics to old show tunes.

In chess, you can look at ELO and have an idea about the probability of the two players winning or losing. That's because chess only has individuals, and they're both playing with exactly the same pieces under the same set of conditions. The game itself isolates everything that is not the individual playing.

In MWO, you can't do that. You cannot say at the start of a match if a player is going to win or lose with any realistic probability. You have to start isolating other variables before you even begin to get hints. Which mech is said player running? Is said player on a premade team? Is the player on a solid internet connection? What time of day is the person playing at? What are the skill of his 11 teammates. Which mechs are his teammates running? And so on. And even then, your probabilities are going to still be in the area of 50/50. So, meaningless when it comes to pilot skill.

I do believe there are good pilots and bad pilots. What I don't believe is that you can look at anything on the current leaderboard and tell me who they are. All you can tell me is which pilots play on good teams more often than not.


So I want to be clear on this.

You believe that the only reason some players have a better win/loss than others is because they play in group queue.

That's it.

Otherwise, everyone's win/loss is random.

If that is the case then please show me the people who have a win/loss that swings from season to season from 10 to 0.1, randomly.

In fact please show me 10 players with over 100 matches in each season who, over any 4 seasons, swing from 10.0 to 0.1 win/loss.

In fact please show me any data from the leaderboard showing any player's individual metrics with over 100 matches swinging by more than a handful of percentage points every single season. Not one person with usually stable stats who has one single atypical month - I mean wildly swinging stats.
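For a sense of scale, it helps to ask how far a 100-match season would swing if every match really were a coin flip. The sketch below uses the normal approximation to the binomial; the function names are made up for illustration and nothing here comes from PGI's data.

```python
import math


def coin_flip_win_rate_range(matches, z=1.96):
    """Approximate 95% range of a season win rate if every match were 50/50."""
    std_err = math.sqrt(0.5 * 0.5 / matches)
    return 0.5 - z * std_err, 0.5 + z * std_err


def to_wl_ratio(win_rate):
    """Convert a win rate (wins / matches) into a W/L ratio (wins / losses)."""
    return win_rate / (1.0 - win_rate)


if __name__ == "__main__":
    low, high = coin_flip_win_rate_range(100)
    print(f"win rate range: {low:.3f} to {high:.3f}")
    print(f"W/L range: {to_wl_ratio(low):.2f} to {to_wl_ratio(high):.2f}")
```

Under that assumption a 100-match season lands between roughly 0.67 and 1.49 W/L about 95% of the time. So pure randomness would not produce swings from 10 to 0.1; the telling sign is a player who sits outside that band, or on the same side of 1.0, season after season.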

Having looked at the leaderboard a lot I already know this isn't the case.

You're wrong. I get that you don't want to be wrong but your opinion is irrelevant.

This isn't a matter of someone with a hammer seeing everything as a nail. This is a matter of a math question about statistical analysis, which is to say exactly why your win/loss is accurate for pug queue, being solved by statistical analysis.

You've made it very clear you don't understand the math involved or statistics or anything like it. Not hard to identify because, again, the how and the why of this is absolutely basic statistics.

So for now please provide me with proof of the random changes in people's win/loss season to season.

Asym, on 27 November 2017 - 05:59 AM, said:

Way off topic and another jab as if the leaderboard has any value... Xavori has it right, and many on this forum throw "leaderboard grenades" to justify their beliefs: for good or ill. I'll not judge you.

Factually, my stats are a result of the environment first and foremost and then, skill. If I drop in FP on a pick-up team and get farmed versus a Scouting match with peers.... I've had 3 kill-900 damage scouting matches and <1000 dmg FP matches... I had ZERO control of the environment and skill made little if any difference.... Some players do control their environments by only playing where winning is a given (Ah, those teams that farm noobs we encounter each and every day...)

But, let's see, according to the leaderboard (you can hear the choir music in the background), I've played 300+ games to your 100+ games...... Hmmmm? You need to get hopping to catch up with me !!! Geeze Marie Lad, only 100+ games...........(sigh...what is the world coming to?)

Scouting is still a mess OP and I lament having to talk about statistics vis-à-vis positive enhancements we'd like to see....


Please provide data from the leaderboard that shows a random swing in the stats of everyone who doesn't constantly play in group queue from season to season.

Examples. Should be easy; only a tiny fraction of the population plays in group queue and even then not all the time.

Please show us all the actual proof of your theory.

Given that if that were true it would invalidate statistical analysis and literally mean math, basic, fundamental math, as we know it doesn't actually work I'm going to guess that this is going to result in 2 things happening:

1. You'll either not look, because you know deep down you're wrong but really don't want to be, or you'll go look, see that actually people stick within a range or if they change it's in the same direction along a curve and nobody anywhere is randomly bouncing up and down, showing that you're absolutely and completely wrong.

2. You'll ignore all the evidence showing you're wrong, refuse to actually use the magical Googles to pick up even some rudimentary basics on statistics and math and just stick to being wrong and come back here with repeated statements of opinion as though it's fact.

At no point am I or anyone else making logical arguments over why W/L works like it does stating opinions. We're just pointing out how the math works. That's it. You guys not liking what the reality of that means is irrelevant to the absolute mathematical reality of it.

You are 8.33% of your team in every game you play. Your impact on your team's ability to win is reflected in how often you win. Every single other player in QP is also 8.33% of their team. They are playing in the same environment with the same pugs as you. Your win/loss is a reflection of how well you help your team win or lose. Be that because you sometimes intentionally bring bad mechs or you're bad in certain situations or you have other good habits or whatever. Every single thing you do affects your team's ability to win every match. Every single match you play is only against the other players in the same pool. Win/loss is zero sum. If you add up the matches of every single player, total wins minus total losses, the result is 0. That means that your relative impact on your team is inherently relative to that of every other player. So someone who plays better than you will help their team win more often than you and as such their win/loss will be better than yours.
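A toy simulation is one way to poke at this claim. The sketch below is not MWO's matchmaker: it assumes every other pilot's skill is drawn from the same bell curve, that team strength is the sum of its pilots, and that match-level luck is one extra noise term. The function name `simulate_win_rate`, the skill scale and the `noise` value are all assumptions for illustration.

```python
import random


def simulate_win_rate(player_skill, matches, seed=0, team_size=12, noise=3.0):
    """Estimate one solo player's win rate when everyone else is random."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(matches):
        # The tracked player plus 11 random teammates.
        own = player_skill + sum(rng.gauss(0, 1) for _ in range(team_size - 1))
        # 12 random opponents drawn from the same pool.
        enemy = sum(rng.gauss(0, 1) for _ in range(team_size))
        # Spawns, maps, disconnects and so on are lumped into one luck term.
        if own - enemy + rng.gauss(0, noise) > 0:
            wins += 1
    return wins / matches


if __name__ == "__main__":
    for skill in (0.0, 2.0):
        for matches in (100, 5000):
            rate = simulate_win_rate(skill, matches)
            print(f"skill={skill} matches={matches} win rate={rate:.3f}")
```

In this model an average pilot hovers around 50% and a strong pilot settles above it once the match count is large, while any single 100-match sample can still land a fair distance from the long-run value. How big the edge is depends entirely on the assumed noise, which is exactly the part nobody in the thread has data for.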

Good luck you two. I look forward to seeing the proof you guys have that math is a lie and everything is just totally random and 100% unpredictable and nobody has any impact or influence on anything. I promise to use your incredible revelation to write a paper destroying the cornerstone mathematical principles of almost every STEM field and become famous as the guy who proved that it is, actually, impossible to solve for a single constant in a formula (12 v 12) with a binary result and in so doing changed the future of mathematics forever.

Well, not so much changed the future of it but proved that math is, actually, impossible - that if a variable is even slightly random (and the results in a 12 v 12 are absolutely not even close to random - they fall in an incredibly narrow range of results from a math perspective) then you can never, ever, in any way, identify the relative range of that single consistent variable.

I realize this has turned a little rambling and maybe a bit mocking but at this point it's hard not to. The ability to solve for what a single variable is among multiple other less known variables is exactly why statistics and analysis, statistical method, regression analysis, all these tons of fields exist. It's the basis of the math behind fields from chemistry and particle physics to all the math and analytics of services like Google and Facebook.

If your win/loss is lower than you want, you need to get better. Because it's pretty accurate.

arcana75, on 27 November 2017 - 06:32 AM, said:

So with all the data available, what's the best gauge of a person's .... shall we say collective wisdom or cumulative skill, in MWO?


There's not a metric for that.

Win/loss you can solve for because it's just measuring how well you do at helping your team win. That's it. There's too many variables in that to say that it represents your wisdom or 'skill'. It's a strong indicator, sure. So would a very high match score and high KDR be. Kills per match would probably be better than KDR.

However there's not an indicator for player skill or wisdom.

Win/loss is just that. How good you are at helping drive wins.

#86 mistlynx4life

    Member

  • 351 posts

Posted 27 November 2017 - 03:58 PM

arcana75, on 27 November 2017 - 06:32 AM, said:

So with all the data available, what's the best gauge of a person's .... shall we say collective wisdom or cumulative skill, in MWO?

Honestly? Without attaching a single specific value to what 'skill' actually represents, I think it's dropcalling in QuickPlay. That's how I measure a player's skill or whatever. Knowing maps, knowing tactics, knowing different 'mechs well enough to call targets and assign priorities and such - and in a way that teammates respect and listen to you instead of just rage-quitting or Lone Wolfing - that's how I'd say someone has reached the pinnacle of MWO skill. Maybe that's just me. When someone steps up to the mic and leads a team to victory, or even a good defeat, it always earns more respect from me than what they're driving or how many kills they have.

#87 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 27 November 2017 - 04:00 PM

Found a useful link, because I'm just flat out unwilling to explain basic math on a forum.

This is an info page on Microsoft's TrueSkill matchmaking system. It includes how they manage rankings and score for people in multiplayer games with teams on each side and such. The example set goes up to 8 v 8 but it's pretty clear. Understand they drill down on your individual ranking tight enough to ladder-rank the players, whereas we need nothing even remotely that precise here.

It's still based 100% off your win/loss.

Because anyone, at all, who understands what math is and how it works is going to understand why. That's not intended as a slight but as a motivator for people who erroneously think that they are powerless in the face of events in which they are a direct participant to realize that no, actually, in the end everything you do moves the needle - even just a bit - and over time that movement is measurable, quantifiable, relevant and impactful.

Edited by MischiefSC, 27 November 2017 - 04:01 PM.


#88 Xavori

    Member

  • The God of Death
  • 792 posts

Posted 27 November 2017 - 11:42 PM

MischiefSC, on 27 November 2017 - 03:37 PM, said:


So I want to be clear on this.

You believe that the only reason some players have a better win/loss than others is because they play in group queue.


No.

I believe that people with a better win/loss record play on better teams more often than people who don't. Queue never even entered into it.

Win/Loss in a team game reflects the team, not the individual. You keep trying to say otherwise, and I keep saying no. I thought I was pretty clear on that.

Trying to read anything more into W/L than that is voodoo. You cannot claim that over time a good player will skew the results in their favor because you simply don't have enough information to make any kinds of claims about how many repetitions it'd take to see a trend. You've said a few times 100 games. I'm certain that number was an anally extracted conclusion.

You don't know how many players there are who solo queue in MWO. You don't know what the range of player skill is. You don't know how you'd even assign a value to player skill. You don't know how many good players there are vs casual/average players vs bad players. You don't know what impact each possible action a pilot can have on win/loss. You are just throwing out random numbers. And random numbers are random no matter how many repetitions you have.

Oh, and Microsoft's TrueSkill is NOT A GOOD EXAMPLE for MWO. MWO never gets away from team matches. MWO doesn't do respawns which means that games snowball readily and that the amount of luck that goes into win/loss is greatly increased. MWO doesn't have anywhere near a large enough playerbase that you'll get random samples. And so on and so on.

Not to mention when you say "It's still based 100% off your win/loss" NO IT'S NOT!

TrueSkill rates individual competition differently than team competition. It has to. It also considers draws, not just W/L. It considers quality of matches. It's not just W/L because just W/L wouldn't work for team based games. Now, it is ultimately heading towards trying to create a probability that a particular player will win or lose a match, but it doesn't ignore the environment that produces wins and losses to come up with that probability.

p.s. This is a much better explanation of TrueSkill than your link:
http://www.moserware...20TrueSkill.pdf

#89 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 28 November 2017 - 12:39 AM

Xavori, on 27 November 2017 - 11:42 PM, said:


No.

I believe that people with a better win/loss record play on better teams more often than people who don't. Queue never even entered into it.

Win/Loss in a team game reflects the team, not the individual. You keep trying to say otherwise, and I keep saying no. I thought I was pretty clear on that.

Trying to read anything more into W/L than that is voodoo. You cannot claim that over time a good player will skew the results in their favor because you simply don't have enough information to make any kinds of claims about how many repetitions it'd take to see a trend. You've said a few times 100 games. I'm certain that number was an anally extracted conclusion.

You don't know how many players there are who solo queue in MWO. You don't know what the range of player skill is. You don't know how you'd even assign a value to player skill. You don't know how many good players there are vs casual/average players vs bad players. You don't know what impact each possible action a pilot can have on win/loss. You are just throwing out random numbers. And random numbers are random no matter how many repetitions you have.

Oh, and Microsoft's TrueSkill is NOT A GOOD EXAMPLE for MWO. MWO never gets away from team matches. MWO doesn't do respawns which means that games snowball readily and that the amount of luck that goes into win/loss is greatly increased. MWO doesn't have anywhere near a large enough playerbase that you'll get random samples. And so on and so on.

Not to mention when you say "It's still based 100% off your win/loss" NO IT'S NOT!

TrueSkill rates individual competition differently than team competition. It has to. It also considers draws, not just W/L. It considers quality of matches. It's not just W/L because just W/L wouldn't work for team based games. Now, it is ultimately heading towards trying to create a probability that a particular player will win or lose a match, but it doesn't ignore the environment that produces wins and losses to come up with that probability.

p.s. This is a much better explanation of TrueSkill than your link:
http://www.moserware...20TrueSkill.pdf


So, I want to be clear here. You believe that everyone who plays in pug queue and still has a good win/loss has a good win/loss because they... play on a better random team?

You're contradicting yourself. Admittedly it's not possible to do otherwise but you've really hit yourself on the crux of the issue. Are you saying that everyone plays a majority of their games in group queue while grouped up?

If your skill doesn't impact win/loss for your team and your win/loss is random, then everyone's win/loss would be random for QP. Since only a fraction of the game's population plays while grouped in group queue, that means their win/loss would be random and as such show a random distribution from season to season.

It doesn't. Anywhere. Which you keep ignoring.

Did you read the link you posted on TrueSkill? You just posted a link to a very detailed explanation of why you're wrong.

TrueSkill is based 100% on win/loss. How do you think it builds its data for predicting which team will win? Off their win/loss record. That's exactly what I've been saying. You use someone's win/loss to generate a rank, and then you adjust their rank up every time they win and down every time they lose, by a value that depends on the rank of who they played. You do this for them as a team or in a solo game.
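The "adjust by a value depending on the rank of who they beat" idea is easiest to see in the classic Elo update, sketched below. This is plain Elo, not TrueSkill: TrueSkill also tracks an uncertainty per player and handles teams and draws differently. The K-factor of 32 and the 400-point scale are common chess conventions, used here as assumptions.

```python
def expected_score(rating, opponent_rating):
    """Elo's predicted chance of winning against the given opponent rating."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update_rating(rating, opponent_rating, won, k=32.0):
    """Move the rating toward the actual result, scaled by how surprising it was."""
    actual = 1.0 if won else 0.0
    return rating + k * (actual - expected_score(rating, opponent_rating))


if __name__ == "__main__":
    print(update_rating(1500, 1500, won=True))   # even match
    print(update_rating(1500, 1700, won=True))   # upset win, bigger gain
    print(update_rating(1500, 1300, won=True))   # expected win, smaller gain
```

The only input per match is the result, but the size of the adjustment depends on who the opponent was. That is arguably why both sides of this argument can point at the same system and claim it supports them.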

What TrueSkill does that seems to be confusing you is it tracks two factors for each player - their estimated rank and then its confidence in its own estimation of their rank. It uses a very conservative value for establishing rank and then always starts with the assumption it's under-valuing the player.

Go read it. It goes into the player skill curve, and it explains why it uses a Gaussian distribution instead of the logistic one Elo usually uses, while pointing out that the function is identical. The reason they use the Gaussian one is to stretch out values to rank players on a ladder instead of just fitting them within set deviations.

It even gives a good entry level explanation of Bayesian probability and inference. Essentially that's how you update and refine the value you assign to a player based on their continuing performance. In reference to your own win/loss in pug queue it's relevant to why your win/loss gets more accurate as you get more samples. Read pages 14 to 16 in the link you posted. It might really help you understand where you're failing in this. To help you here's some translations from math nerd to English for what you'll be reading:

Posterior = the result. In the context of win/loss, your new w/l score after the match you just won or lost.
Prior = your previous results. The win/loss you had before.
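A minimal worked example of prior and posterior, assuming a much simpler model than TrueSkill uses: a Beta distribution over a player's win probability, updated by counting wins and losses. TrueSkill itself uses Gaussian skill estimates, so treat this as a stand-in that only shows the update mechanics. The function names are hypothetical.

```python
def update_beta(prior_a, prior_b, new_wins, new_losses):
    """Posterior Beta parameters after observing more match results."""
    return prior_a + new_wins, prior_b + new_losses


def beta_mean_and_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) belief about win probability."""
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, variance ** 0.5


if __name__ == "__main__":
    belief = (1, 1)  # flat prior: no idea how good the player is
    for wins, losses in ((18, 13), (162, 117)):
        belief = update_beta(*belief, wins, losses)
        mean, sd = beta_mean_and_sd(*belief)
        print(f"after +{wins}W/+{losses}L: mean={mean:.3f} sd={sd:.3f}")
```

Each posterior becomes the prior for the next batch of matches. The estimated win probability barely moves between the two steps, but the uncertainty around it shrinks as the match count grows, which is the "more samples, more accurate" point in Bayesian terms.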

TrueSkill goes on to then use your new win/loss and who you won/lost against to estimate your probability of winning against other players. Essentially it's more than just tracking your win/loss, it's using those results to rank you in a ladder against other players and then establish a value for you (like an Elo score) to then judge your ability to win against people of another set score or rank. That's not to try and build a balanced match - it's to see what it needs to adjust your score by (if at all) if you win matches it predicts you would lose or lose matches it predicts you would win.

TrueSkill is a matchmaker - its purpose is to try and rank players in as accurate a ladder ranking as possible across various platforms of games. What it's talking about is using your win/loss to then identify where you rank relative to every other player in skill, and then every time you win or lose make sure it keeps adjusting everyone's value to get as accurate as possible. The more matches you play, the more accurate it is.

It works from 1 v 1 up to 100 v 100 or whatever. Number of players on each team is irrelevant, save in that the more players on each team the more matches required to build a high confidence. So it takes 93 matches of 8 v 8 for them to be almost certain of where they ladder rank you. Since all MWO needs to do is fit people into distributions we don't need nearly as many matches. We don't need to get exact, just pretty close.

Also you provided none of the examples from the leaderboard of people's win/loss swinging wildly one season to the next, since it's random.

Still waiting for that.

For now, read what you linked. It's got some great details explaining exactly why you're wrong.

#90 Xavori

    Member

  • The God of Death
  • 792 posts

Posted 28 November 2017 - 01:17 AM

Misch,

I think I've figured out your problem. You're a statistics engineer, not a statistician. You know how to use the software. You know what the results mean after you get them, but you have no idea how any of it was made in the first place.

When you say I contradict myself...where? How?

A player with a good win/loss record has played on good teams more often than bad teams. Now, he might very well be the key piece that skews the team into the good category, BUT YOU CANNOT JUST ASSUME THAT. And you cannot tell me how many games a player has to play before you will know that, because there are too many other things you don't know.

That's what I'm saying. But more than that, there is no current way to apply meaning to any leaderboard stat because the matchmaker isn't making quality matches. PSR is not player skill rating. It's an XP bar that goes down sometimes a small amount versus going up large amounts. It wouldn't even work if it was accurately moving because 5 skill tiers isn't precise enough to create quality matches.

So we get random matches. And random matches produce random results. And random numbers are random.

As for TrueSkill...

TrueSkill is not just win/loss. If you cannot follow the math, maybe take some more math classes from your local community college.

It cares about draws. It cares about team vs individual. It cares about quality of matches (i.e. a comp team vs a pug stomp has virtually no impact on TrueSkill change as a result of that match). It cares about unbalanced matches (one team has more players than the other). And so on. It's not just win/loss. If it was just win/loss, it'd look a lot like chess ELO, even though ELO is also not just win/loss.

But again, I don't think you understand the math, which is why our discussion isn't getting anywhere.

You cannot use 100 matches to make a claim about anything in MWO unless you first know the player base, possible skill ranges of players, proportions of good players to bad, and so on and then just happen to get lucky in that works out to be a statistically relevant sample size (very doubtful). Stats aren't that easy. The fact that you pulled that number out of thin air should have clued me in a lot sooner that you don't actually understand the underlying principles even if you do understand how to apply the answers your software gives back to you (ie. the difference between a mathematician and an engineer).

So to sum up:
If you disagree that quality of match matters, you're wrong.
If you think PGI's matchmaker is producing quality matches that give meaningful data back, you're wrong.
If you think 100 is a statistically relevant sample size with no data to back up that claim, you're wrong.
If you think ELO is just win/loss, you're wrong.
If you think TrueSkill is just win/loss, you're very, very wrong.

#91 Xiphias

    Member

  • PipPipPipPipPipPipPip
  • Littlest Helper
  • Littlest Helper
  • 867 posts

Posted 28 November 2017 - 07:12 AM

View PostXavori, on 27 November 2017 - 04:18 AM, said:

I'm the guy with a zamboni who thinks the entire world is cartoon characters covered in chocolate sauce doing the macarena while singing out tech manuals as lyrics to old show tunes.

This explains a lot.

#92 Asym

    Member

  • PipPipPipPipPipPipPipPipPip
  • Nova Captain
  • 2,186 posts

Posted 28 November 2017 - 08:17 AM

View PostXavori, on 28 November 2017 - 01:17 AM, said:

Misch,

I think I've figured out your problem. You're a statistics engineer, not a statistician. You know how to use the software. You know what the results mean after you get them, but you have no idea how any of it was made in the first place.

When you say I contradict myself...where? How?

A player with a good win/loss record has played on good teams more often than bad teams. Now, he might very well be the key piece that skews the team into the good category, BUT YOU CANNOT JUST ASSUME THAT. And you cannot tell me how many games a player has to play before you will know that because you don't know too many other things.

That's what I'm saying. But more than that, there is no current way to apply meaning to any leaderboard stat because the matchmaker isn't making quality matches. PSR is not player skill rating. It's an XP bar that goes down sometimes a small amount versus going up large amounts. It wouldn't even work if it was accurately moving because 5 skill tiers isn't precise enough to create quality matches.

So we get random matches. And random matches produce random results. And random numbers are random.

As for TrueSkill...

TrueSkill is not just win/loss. If you cannot follow the math, maybe take some more math classes from your local community college.

It cares about draws. It cares about team vs individual. It cares about quality of matches (i.e. a comp team vs a pug stomp has virtually no impact on TrueSkill change as a result of that match). It cares about unbalanced matches (one team has more players than the other). And so on. It's not just win/loss. If it was just win/loss, it'd look a lot like chess ELO, even though ELO is also not just win/loss.

But again, I don't think you understand the math, which is why our discussion isn't getting anywhere.

You cannot use 100 matches to make a claim about anything in MWO unless you first know the player base, possible skill ranges of players, proportions of good players to bad, and so on and then just happen to get lucky in that works out to be a statistically relevant sample size (very doubtful). Stats aren't that easy. The fact that you pulled that number out of thin air should have clued me in a lot sooner that you don't actually understand the underlying principles even if you do understand how to apply the answers your software gives back to you (ie. the difference between a mathematician and an engineer).

So to sum up:
If you disagree that quality of match matters, you're wrong.
If you think PGI's matchmaker is producing quality matches that give meaningful data back, you're wrong.
If you think 100 is a statistically relevant sample size with no data to back up that claim, you're wrong.
If you think ELO is just win/loss, you're wrong.
If you think TrueSkill is just win/loss, you're very, very wrong.

Target, cease fire !!

I had a response earlier but deleted it. I even had examples but, what's the point.... Everything isn't a simple binary decision. Just take the concept of "mass" (like principles of war Mass) or the word "regulated" and ask yourself:

If one team has 45 tons of Armor and 20% more combined firepower over the opposite team, on a map that favors the attackers, with the defending team composed of 30% less "skill" levels, do you or can you expect an outcome? Yes. Now, if the lighter, weaker and less skilled team actually works together and stomps the heavily favored team, in your system that is "just a win"..... Seriously? What a sterile and skewed metric.

No, the weaker team should be rewarded for an accomplishment and each player's cut should be based on the "mass" that player brought to the fight and the "damage" that player contributed to. That way, a light pilot with no kills but with, say, 1,100 AMS missiles destroyed seriously participated and the ECM/NARC/TAG pilot did as well... On the other hand, the losing team Assault pilots get slapped harder because they had the upper hand; and, they had the "mass, or as some say the 'regulated combat power'" and failed to use it to their expected and predicted advantage..... Their entire team is judged by what they brought to the fight and what they did or did not do. A Tier 1 pilot on the losing team driving a meta Assault that died with 100 damage would get blasted statistically for the loss while the little Kit Fox pilot on the same losing team that pulled off two kills and 500 damage would still get a loss but be heavily rewarded for the accomplishment even in the loss......

The Wins and Losses have graduated values. Like in real life. Math has little to do with this other than calculating who contributed what against what was "mathematically" expected (budget to actual like EVMS)....... By the way, this is a centuries-old concept that is in use today; whether you know it or not (and, unless you have access to modern militaries, you'd never see it at work...). A Well Regulated Militia....in the US 2nd Amendment means something you should really study because it's a historic reference to the system I just described......it was how combat was and is conducted to this day....all warfare is regulated and what is a win or a loss is only decided by history even if the face value says one thing or another....

Food for thought.

#93 Xavori

    Member

  • PipPipPipPipPipPipPip
  • The God of Death
  • The God of Death
  • 792 posts

Posted 28 November 2017 - 09:05 AM

View PostAsym, on 28 November 2017 - 08:17 AM, said:

The Wins and Losses have graduated values. Like in real life. Math has little to do with this other than calculating who contributed what against what was "mathematically" expected (budget to actual like EVMS)....... By the way, this is a centuries-old concept that is in use today; whether you know it or not (and, unless you have access to modern militaries, you'd never see it at work...). A Well Regulated Militia....in the US 2nd Amendment means something you should really study because it's a historic reference to the system I just described......it was how combat was and is conducted to this day....all warfare is regulated and what is a win or a loss is only decided by history even if the face value says one thing or another....

Food for thought.


Your first sentence is really important. It's actually built into TrueSkill. It's also kinda in chess's ELO, but with caveats (ie. the skill level between the two players is assumed to never be beyond the standard deviation). Chess ELO also assumes normal distribution of player skill level which is why I shot it down the first time Misch brought it up because you cannot make that assumption about MWO pilot skill.

A good team that stomps a bad team doesn't see much, if any, change to their rating in TrueSkill. But in MWO leaderboards? Ya, a win is a win. Yay team? TrueSkill also provides matchmaking data that can usually prevent horribly skill mismatched teams, but MWO's matchmaker? ROFL.

The other problem Misch has with stats, and why I keep trying to hammer home random numbers are random is that you cannot, absolutely cannot, do any kind of statistical analysis on unknown ranges, and you most certainly can't just pull the number 100 out of your posterior and say that's enough.

For example, if you flip a coin 100 times, ya, you're likely going to get even numbers of heads and tails, or close enough that you have an idea what the probabilities are. Heck, you can take the numbers 1-10, pick one at random 100 times, and analyze your results and get some meaningful data. But let's say you're picking numbers between 1-10,000. Taking 100 of those at random is just random numbers. And random numbers are random.
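For anyone who wants to try the "100 draws" comparison rather than argue about it, here is a minimal sketch. It assumes uniform random integers and a fixed seed; the function name `sample_summary` and the two summary measures are made up for illustration and are not anything PGI or TrueSkill uses. It reports what share of the possible values 100 draws actually touched, and how far the sample mean landed from the true mean relative to the size of the range.

```python
import random
from statistics import mean


def sample_summary(upper, draws=100, seed=1):
    """Draw `draws` integers uniformly from 1..upper and summarise them.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    values = [rng.randint(1, upper) for _ in range(draws)]
    true_mean = (1 + upper) / 2
    return {
        # Share of the possible values that showed up at least once.
        "coverage": len(set(values)) / upper,
        # Distance of the sample mean from the true mean, as a share of the range.
        "mean_error": abs(mean(values) - true_mean) / upper,
    }


for upper in (10, 10_000):
    print(upper, sample_summary(upper))
```

With 100 draws the coverage for 1-10,000 can never exceed 1%, so nothing can be said about individual values, while the relative error of the mean is typically of the same order for both ranges. Which of those matters depends on the question being asked of the sample.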

And MWO isn't even doing that much for us. If the matchmaker worked, most players would have a W/L ratio near 1 because it would be putting equivalent skilled players in equal numbers on both teams. But since it doesn't do that, we have random matches that produce random results.

Heck, PGI could implement TrueSkill and we'd be closer to meaningful (it still has problems in that a good player in a bad mech is hard to identify for matchmaking) matches. And at least then, you could find some number of matches (many more than 100 given the sheer number of variables that go into winning a match) that started to produce a meaningful indication of player skill versus other players.

But if wishes were fishes....

#94 ANOM O MECH

    Member

  • PipPipPipPipPipPipPip
  • Survivor
  • 993 posts

Posted 28 November 2017 - 10:12 AM

View PostXavori, on 27 November 2017 - 07:24 AM, said:


This is totally not about me.

It's about the fact that PSR is garbage. The matchmaker is garbage. Therefore, the leaderboard stats are garbage.

If you've seen zero evidence, it's because you're intentionally ignoring everything. MWO is a team based game. When evaluating individuals in a team based game, you cannot use W/L as a metric because an individual is not the entire team. In every other competitive event, this is a given. That a few players in MWO keep trying to hammer W/L as some kind of nail doesn't change reality. Math is not magic. Stats based on garbage data are garbage stats.

What's more, if you actually understood stats, you'd actually understand why I keep rolling my eyes every time Misch tries to pretend his math has meaning. You cannot magically create a number range, call it "player skill" and then magically assign some meaning to it. Random numbers are random.

There are good players. There are bad players. You cannot identify them from the leaderboard. You cannot identify them from PSR. Therefore, the matchmaker creates random skilled teams which continues the whole random nature of wins and losses.

All you can pull from W/L record is that a particular player plays on good teams or bad teams. They might be the reason a team is good. They might be the reason a team is bad. BUT YOU CANNOT TELL THAT FROM THE STATS WE HAVE AVAILABLE.

On top of that, I have a strong suspicion we can't really come up with stats that are meaningful in a game played for fun rather than for meaningful competition.

For example, the last day of the Steiner loyalty event my unit switched to loyalist Steiner. It was lucky timing as we'd already been planning to switch to something IS after being Clan for a few months. This meant we could double-dip in the event, so of course that's what I did.

The event is pretty shallow. It's all about damage. And since I had less than a day to get 15 250 match score matches in, I went damage farming. I bought a Bushwacker P1, loaded it with streaks, and ~20 pug matches later was done. Yay second ECM Atlas for my goofy Atlas drop deck.

My stats on the BSH-P1 - 31 matches - 18 wins - 13 losses - 1.38 W/L - 27 kills - 18 deaths - 1.5 K/D and ~400 damage per match. Those would be good. I won more than I lost and killed more than died. And 400 damage per match in scouting is rock solid. Again, we're talking about a mech I've only owned for a day.

Those stats are very misleading tho.

See, the last 1/3 or so of my matches were dropping with unit mates instead of the pug drops I did while everyone else was sleepin'. And since I was done with the event, I was shooting at mechs until they had lost most of their armor and then switching targets. I was leading all the charges to get into range. I was doing my best not to kill enemies unless I was the only one left alive. In other words, I completely sabotaged my own stats.

I'm not unique in doing things like that. I'm not special. There are lots of MWO pilots who are very good who don't give a $$#!!! about leaderboard stats, and our stats reflect that.

But here's where Misch could be right in his talk about lots of matches...

If we did a deep dive into the data. If we had access to more than just the garbage leaderboard stats. We could in fact create a PSR that created better matches. It wouldn't be win/loss because again, that's just a measure of how often a player is on a good team. And if we had a real PSR, then maybe we could get a better matchmaker. And ultimately, we could get better matches.

p.s. And in such a unicorn, rainbow-infested world, everyone in solo drops would have a W/L of ~1.


Good players find a way to win more games than they lose. New, inexperienced, very casual and bad players however will lose more. Part of this is because they will at times be a big part of the loss. Many things a bad player can do or not do such as being one of the last mechs on the field and more often than not, they miss the clutch shot. These are statements that are observational and experience based, so are my opinion. Would be hard to say however that these opinions are completely out to lunch. Not rocket science.

Now what you fail or just outright refuse to grasp is that a good player's ability to pull off the win, or have a significant impact on a match, will be shown in their wins and losses. You can argue senselessly against the above statement all you like. Your position is just not logical or reasonable.

#95 MischiefSC

    Member

  • PipPipPipPipPipPipPipPipPipPipPipPip
  • The Benefactor
  • The Benefactor
  • 16,697 posts

Posted 28 November 2017 - 10:50 AM

I'm not an engineer. In fact my job is 99.99999% social. Almost all the analytics I do anymore are handled by software. However anyone who's completed high school math and algebra should be able to understand this topic.

I keep noticing how nobody is putting up examples from the leaderboard of random swings in win/loss from season to season.

TrueSkill only cares about win/loss. Win/loss includes ties, because ties are a loss for both sides. The only way it cares about who is on your side and the other side is in calculating how much to adjust your score based on your wins and losses. TrueSkill, Elo, (ELO is Electric Light Orchestra. Elo is a last name, the name of the guy who created the Elo system. Arpad Elo. If you had ever even looked up what Elo is you would know that as well) they all are based on win/loss.

They are matchmakers. They are designed to take win/loss and create a ranking system to match players. They all use win/loss as their base and then give you a rank based on it. Then they adjust your rank based on who you lost to and who was on your team and the other team. Their only use of anything but win/loss is in determining how much to adjust your score by.
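The "result triggers the change, opponent strength sets the size" idea can be sketched with the textbook Elo update. This is plain Elo, not TrueSkill (which tracks a mean and an uncertainty per player and is not reproduced here), and the K-factor of 32 and the example ratings are assumptions picked for illustration. The score follows the standard chess convention of 1 for a win, 0.5 for a draw and 0 for a loss.

```python
def expected_score(rating, opponent):
    """Textbook Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def elo_update(rating, opponent, score, k=32.0):
    """Return the new rating after one game.

    score: 1.0 win, 0.5 draw, 0.0 loss (standard chess convention).
    k: step size; 32 is an assumed value, federations use different ones.
    """
    return rating + k * (score - expected_score(rating, opponent))


# A 1500-rated player wins against opponents of different strength.
for opponent in (1100, 1500, 1900):
    gain = elo_update(1500, opponent, 1.0) - 1500
    print(f"win vs {opponent}: {gain:+.1f}")
```

The only input about the game itself is the result; the opponent's rating decides how far the number moves, so a win over a much weaker opponent barely changes anything.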

Since you guys literally put up the link to TrueSkill but then clearly didn't read it, going to make this a lot more simple:

So you have two people. One guy, Bob, texts while he drives, pays little attention to what's going on around him. He runs red lights and drives a beaten up old car with no seat belt or safety features.

Another guy, Fred, is a very safe driver. Pays attention, drives carefully, drives a new car with all the best safety features.

They both have a 20 mile commute through the same traffic 5 days a week.

Is one of them more likely to get in an accident than the other? In an accident is one more likely to be seriously injured?

Why?

They can only control themselves and the vehicle they drive. They don't control the weather or other drivers. Matches in MWO are a tiny, tiny fraction as complicated as the variables in that equation. You're only in a 12 v 12 environment, work commutes have a variable number of variables -

yet you can directly impact the odds of you getting in an accident and avoiding injury by how well you drive and what you drive.

Does that make more sense? I know you guys want to pretend otherwise but this isn't hard.

Also, again. Still waiting for statistics from leaderboard showing random shifts in performance month to month.

Edited to add:

Please link me where I've said, anywhere or in any other post I've made, that the current matchmaker delivers quality matches. I have not - specifically because it's not based on win/loss, it's just an XP bar.

You guys have literally posted a link that directly discusses everything I've said and have absolutely, clearly not read it.

Please show me where in the TrueSkill system it uses anything but win/loss to derive your score. It modifies your score based on who was on your team and who you played against - which is what Elo does as well. However the only thing that triggers a change in your score is if you win or lost and the only metric it draws upon to create your score is win/loss.

Edited by MischiefSC, 28 November 2017 - 10:57 AM.


#96 Xavori

    Member

  • PipPipPipPipPipPipPip
  • The God of Death
  • The God of Death
  • 792 posts

Posted 28 November 2017 - 10:52 AM

View Posttker 669, on 28 November 2017 - 10:12 AM, said:


Good players find a way to win more games than they lose. New, inexperienced, very casual and bad players however will lose more. Part of this is because they will at times be a big part of the loss. Many things a bad player can do or not do such as being one of the last mechs on the field and more often than not, they miss the clutch shot. These are statements that are observational and experience based, so are my opinion. Would be hard to say however that these opinions are completely out to lunch. Not rocket science.

Now what you fail or just outright refuse to grasp is that a good player's ability to pull off the win, or have a significant impact on a match, will be shown in their wins and losses. You can argue senselessly against the above statement all you like. Your position is just not logical or reasonable.


My position is absolutely rational.

Everything I've said in this thread is based on reality and understanding of numbers.

Your first sentence would be true...except a good player can be totally offset by a number of bad players. This is a team game. This is a highly interdependent team game. The greatest MWO pilot cannot carry a team of bad players. He has to have at least some help. And a great pilot with below average teammates will more often than not lose to a team of just slightly above average opponents.

And this is the single biggest point I keep making. It's the foundation for everything else I've said on the subject. The quality of players in a match is random. Because it's a highly interdependent team game, when you take a bunch of random players, you get random results. And random numbers are random. You cannot make them un-random by taking them a bunch of times. You cannot make them un-random by applying some special math formulas. Random numbers are random.

Now, let's pretend that the matchmaker actually put equivalent skilled teams together. My opinion on the leaderboard stats would change completely. Kills and damage would still be garbage, but at least win/loss would potentially have some meaning over time, like hundreds of matches over time.

Of course, win/loss wouldn't be the statistic that you'd be looking at to see if a player is good because with a good matchmaker, everyone's win/loss ratio will approach 1. Instead, you'd be looking at a rating that was based on the quality of opponents that a particular player wins against over a great number of matches until eventually that player hits equivalent skilled players and starts losing and winning at roughly the same rate. This is the basis for every other individual rating system in competitive game play.

So right now, the leaderboard stats are garbage. Because matches are random. And random matches produce random results. And random numbers are random.

#97 MischiefSC

    Member

  • PipPipPipPipPipPipPipPipPipPipPipPip
  • The Benefactor
  • The Benefactor
  • 16,697 posts

Posted 28 November 2017 - 11:00 AM

View PostXavori, on 28 November 2017 - 10:52 AM, said:


My position is absolutely rational.

Everything I've said in this thread is based on reality and understanding of numbers.

Your first sentence would be true...except a good player can be totally offset by a number of bad players. This is a team game. This is a highly interdependent team game. The greatest MWO pilot cannot carry a team of bad players. He has to have at least some help. And a great pilot with below average teammates will more often than not lose to a team of just slightly above average opponents.

And this is the single biggest point I keep making. It's the foundation for everything else I've said on the subject. The quality of players in a match is random. Because it's a highly interdependent team game, when you take a bunch of random players, you get random results. And random numbers are random. You cannot make them un-random by taking them a bunch of times. You cannot make them un-random by applying some special math formulas. Random numbers are random.

Now, let's pretend that the matchmaker actually put equivalent skilled teams together. My opinion on the leaderboard stats would change completely. Kills and damage would still be garbage, but at least win/loss would potentially have some meaning over time, like hundreds of matches over time.

Of course, win/loss wouldn't be the statistic that you'd be looking at to see if a player is good because with a good matchmaker, everyone's win/loss ratio will approach 1. Instead, you'd be looking at a rating that was based on the quality of opponents that a particular player wins against over a great number of matches until eventually that player hits equivalent skilled players and starts losing and winning at roughly the same rate. This is the basis for every other individual rating system in competitive game play.

So right now, the leaderboard stats are garbage. Because matches are random. And random matches produce random results. And random numbers are random.


Please provide leaderboard examples of random shifts in win/loss in players. Random would involve swings of more than a couple of percent. Like 10.0 to 0.01 one season to the next.

Almost everyone plays almost all their matches in the leaderboard in QP, so this should be easy. It should be visible on almost every player on all 40k listed names.

So please give examples of clear random distribution.
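As a rough yardstick for what "a swing" looks like, here is a minimal sketch of season-to-season noise. It assumes every match is an independent coin flip with a fixed win probability `p`, which is a simplification (real matches are not independent draws), and the function names are made up for illustration.

```python
import math
import random


def win_fraction_sd(p, matches):
    """Standard deviation of the season win fraction under the coin-flip assumption."""
    return math.sqrt(p * (1.0 - p) / matches)


def wl_ratio(win_fraction):
    """Convert a win fraction into a W/L ratio (infinite if there were no losses)."""
    if win_fraction >= 1.0:
        return math.inf
    return win_fraction / (1.0 - win_fraction)


def simulate_seasons(p, matches, seasons, seed=1):
    """Simulated win fraction for each season of a player with fixed win chance p."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for _ in range(matches)) / matches
        for _ in range(seasons)
    ]


print("sd of win fraction:", win_fraction_sd(0.5, 100))
for fraction in simulate_seasons(0.5, 100, seasons=6):
    print(f"win fraction {fraction:.2f} -> W/L {wl_ratio(fraction):.2f}")
```

Under that assumption a 100-match season has a standard deviation of about 0.05 in win fraction, so two standard deviations either side of an even record is roughly a W/L between 0.67 and 1.5. Swings much larger than that between seasons would need some other explanation.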

#98 Xavori

    Member

  • PipPipPipPipPipPipPip
  • The God of Death
  • The God of Death
  • 792 posts

Posted 28 November 2017 - 11:03 AM

View PostMischiefSC, on 28 November 2017 - 10:50 AM, said:

TrueSkill only cares about win/loss. Win/loss includes ties, because ties are a loss for both sides. The only way it cares about who is on your side and the other side is in calculating how much to adjust your score based on your wins and losses. TrueSkill, Elo, (ELO is Electric Light Orchestra. Elo is a last name, the name of the guy who created the Elo system. Arpad Elo. If you had ever even looked up what Elo is you would know that as well) they all are based on win/loss.


The first sentence is wrong.

TrueSkill uses wins and losses as a starting point only. It absolutely builds in quality of matches. Your rating goes up more when you beat skilled opponents than when you beat poor ones. It absolutely builds in unbalanced teams. It considers ties not as wins or losses, but as a factor in determining how much a win is worth in that particular game. If a game produces a lot of ties, but you manage to win, that's worth more to your rating. And there is a lot more built into it as well. TrueSkill then produces a rating that is used both for leaderboards and matchmaking because TrueSkill kinda relies on quality matches to produce its ratings in the first place.

You so obviously don't understand TrueSkill I'm not sure why you brought it up.

You also keep referring to looking at MWO's leaderboard W/L ratio as something magical. Here's the thing tho, even if we had a good matchmaker that produced quality matches, W/L ratio would not have anything to do with the skill of a pilot. W/L ratio would be ~1, for everyone, at every skill.

As for ELO...duh. Not only am I familiar with who made it, I can actually follow the math on how he did it. I'm also familiar with the tweaks to his original work that pretty much every chess federation has made in their never ending quest to get equal rated players to a 50% chance to win against a player with the exact same rating. Oh, and that last bit is also why in chess you don't look at W/L, but instead at the ELO rating. W/L ratio is a meaningless statistic in a format where the goal is to get players facing off against players of the same skill.

If MWO had such a matchmaker, we prolly wouldn't be having this discussion because I would never have said the leaderboard stats are garbage. The only stat people would care about is a real PSR (or TrueSkill or ELO) because none of the rest of it would matter outside of looking to see who is an efficient killer maybe (high kills - low damage) or who seems to be able to take down mechs by themselves (solo kills) most often. But nobody would look at W/L ratio. Because it'd be around 1.

#99 Jay Leon Hart

    Member

  • PipPipPipPipPipPipPipPipPip
  • The Spear
  • The Spear
  • 4,669 posts

Posted 28 November 2017 - 11:04 AM

View PostXavori, on 28 November 2017 - 10:52 AM, said:

And this is the single biggest point I keep making. It's the foundation for everything else I've said on the subject. The quality of players in a match is random. Because it's a highly interdependent team game, when you take a bunch of random players, you get random results. And random numbers are random. You cannot make them un-random by taking them a bunch of times. You cannot make them un-random by applying some special math formulas. Random numbers are random.

So right now, the leaderboard stats are garbage. Because matches are random. And random matches produce random results. And random numbers are random.

You keep saying this, but even the slightest dive into leaderboard stats proves that, actually, they aren't random. There are definite trends.

Prove they are random, or stop with the obvious lie. Your choice.

#100 Xavori

    Member

  • PipPipPipPipPipPipPip
  • The God of Death
  • The God of Death
  • 792 posts

Posted 28 November 2017 - 11:08 AM

View PostMischiefSC, on 28 November 2017 - 11:00 AM, said:


Please provide leaderboard examples of random shifts in win/loss in players. Random would involve swings of more than a couple of percent. Like 10.0 to 0.01 one season to the next.

Almost everyone plays almost all their matches in the leaderboard in QP, so this should be easy. It should be visible on almost every player on all 40k listed names.

So please give examples of clear random distribution.


When I keep telling you the leaderboard is garbage, why do you keep trying to insist that I use the leaderboard as part of my argument? I'm not going to because the leaderboard is garbage. The only thing you can pull from looking at the leaderboard is which players care about which stats and which players either get lucky teams more often or play with good teammates more often.

View PostJay Leon Hart, on 28 November 2017 - 11:04 AM, said:

You keep saying this, but even the slightest dive into leaderboard stats proves that, actually, they aren't random. There are definite trends.

Prove they are random, or stop with the obvious lie. Your choice.


I don't have to prove the leaderboard is random. You already know the matchmaker is random. Nobody argues that PSR is not a meaningless rating, and since the matchmaker barely uses that anyway, you get random matches. I mean, you're not seriously suggesting I'm wrong about this?

So if the matches are random, how then do you think you get not random results?




