
Psr Update And Hold On Patch.


713 replies to this topic

#601 D U N E

    Member

  • PipPipPipPipPip
  • Bad Company
  • 131 posts

Posted 14 June 2020 - 01:51 PM

View PostZerex, on 14 June 2020 - 01:24 PM, said:


If I am understanding you right, you have my point totally wrong. I am saying you can't use average MS, WLR or KDR because, as you have just pointed out, they give false positives and negatives:

Top performance on a loss

Bottom performance on a win

Dying in a win

Surviving a loss

Scoring a low MS in a low tonnage game

Scoring a high MS in a high tonnage game

All these things are false data points. A robust system has to either overcome them through a formula, or score players in a way where these aren't false data points.


I have not read the full argument; I was countering a specific point. Though if this is the new point:
Yes, a system is going to have many false positives - this is not a 1v1 game, and we are plagued by being a part of a team that may or may not be competent. Let's look at players with the highest MS but a losing (below 1) W/L ratio. These players CAN be good players; they are just not playing to the current meta. HOWEVER, if the nascar meta were to die and games became more intelligent, with longer bouts of trading, pushes, etc., these players would be in a very different environment, which could become the meta.

So with this said, what are your opinions on the Jay Z system?

That system currently solves the issue of "low MS in a low tonnage game" and vice versa, while weighting W/L rather heavily. It also seems to handle the case where you were on a subpar team but still performed admirably. If you have not had a proper look at it, I advise you to try it out and adjust the values to see what various games may be like.

I agree, currently you can't use avg MS, WLR or KDR to distinguish a player fully. That said, MS is the defining value of whether a player is good - at least, that is what the system looks at. So for a fix we need to look at MS and how it works with W/L. KDR is an unimportant value overall, I would argue. To fix MS we need to look at the ways MS is generated; TBF, the most important aspects are likely damage, kills and whether you won or lost. KDR is useless, but the three stages of kills - Kill, Kill Most Damage Done, and Solo Kill (which also awards the former two) - are an important area to look at and adjust.

That said, going into what should be awarded and how is currently out of my depth, since I have no idea of the current values PGI are using. I know AMS is currently awarding too much MS, which contributes to a higher MS. I was able to get an MS of 800 from 200 damage due to my AMS.

#602 Gagis

    Member

  • PipPipPipPipPipPipPipPip
  • FP Veteran - Beta 1
  • 1,731 posts

Posted 14 June 2020 - 01:53 PM

View PostZerex, on 14 June 2020 - 01:24 PM, said:


If I am understanding you right, you have my point totally wrong. I am saying you can't use average MS, WLR or KDR because, as you have just pointed out, they give false positives and negatives:

Top performance on a loss

Bottom performance on a win

Dying in a win

Surviving a loss

Scoring a low MS in a low tonnage game

Scoring a high MS in a high tonnage game

All these things are false data points. A robust system has to either overcome them through a formula, or score players in a way where these aren't false data points.

The trick here is that we are working for the goal of having an accurate matchmaker. A matchmaker relies on statistics to predict how likely you or your team are to win against a certain set of opponents. In statistics, all these outlying match results even out over time, because all players are equally likely to encounter them, and half of those equally lucky or unlucky players are on the opposing team.

There are games that are 90% skill and 10% luck. There are games that are 90% luck and 10% skill. Statistics, such as recording your wins and losses, work equally well for both and equally well for MWO, which lies somewhere in between. Luck is not magic. Luck is equal for all and adds up to 0 over time. This is why it makes no difference for matchmaking if half of your games come down to luck and half come down to your actual skill. The WLR is just as accurate at predicting your chances either way. It would work for matchmaking just fine even if MWO was 90% luck and 10% skill.

Random is random.

I still added match score to my proposal, but that's more for the sake of making players happy with their experience than for matchmaking. I expect my proposal would be less accurate at predicting your success than a pure WLR model. That's fine as long as the difference is modest: if, for example, you'd hypothetically need 25 matches behind you for the matchmaker to know where to put you with a WLR system, maybe you'd need 50 matches with my system instead. That's still a small enough number that I'd be happy with it.

If it turns out the numbers are 50 and 500 instead, I'm going to be rather embarrassed.
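As a rough back-of-envelope for the "how many matches" question above, here is a minimal sketch. It assumes every match is an independent coin flip with a fixed true win probability, which is a simplification (real matches share teammates, and the matchmaker reacts to results). The function names are illustrative only and do not come from any proposal in this thread.

```python
import math


def win_rate_standard_error(true_win_prob: float, matches: int) -> float:
    """Standard error of an observed win rate after `matches` independent games."""
    if matches <= 0:
        raise ValueError("matches must be positive")
    return math.sqrt(true_win_prob * (1.0 - true_win_prob) / matches)


def matches_needed(true_win_prob: float, target_error: float) -> int:
    """Smallest number of games whose standard error is at or below target_error."""
    if target_error <= 0:
        raise ValueError("target_error must be positive")
    variance = true_win_prob * (1.0 - true_win_prob)
    # The small epsilon guards against floating-point round-off pushing
    # an exact answer (e.g. 100.0) just above the integer.
    return math.ceil(variance / target_error ** 2 - 1e-9)
```

Under these assumptions the error shrinks with the square root of the match count, so halving the uncertainty takes roughly four times as many games.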

#603 David Sumner

    Member

  • PipPipPipPipPipPip
  • FP Veteran - Beta 1
  • 470 posts
  • LocationAuckland, New Zealand

Posted 14 June 2020 - 01:53 PM

View PostBrauer, on 12 June 2020 - 07:57 AM, said:


Repair and rearm will only punish players who are bad at the game by increasing the grind for them or making them perpetually cbill poor. It doesn't do anything to change the fact that the best way to win is still to take your opponents off the board.

Also, no idea why you are proposing mismatched team sizes with 4-8 people having to take on 12 for three missions in a row. You realize this is a PVP game right and that you need people to sign up to play on both teams in a match?



Have you ever looked at the breakdown? Typically heavies and assaults make up probably 60-80 percent of all mechs dropping at any given time. Lights often make up 10-15% of all mechs dropping.


I wasn't suggesting mismatched teams. That would only make sense with asymmetrical objectives at best.
I was indicating differences in behaviour induced by environment. In particular why "everything is skirmish".

As for weight classes?
Right now I'm looking at:
A: 33
H: 24
M: 22
L: 23

And that's about typical for what I see.
Maybe it's time of day dependent?

#604 D U N E

    Member

  • PipPipPipPipPip
  • Bad Company
  • 131 posts

Posted 14 June 2020 - 01:56 PM

View PostZerex, on 14 June 2020 - 01:50 PM, said:



Ok to use the example you have given, season 47, mediums arranged by W/L

Posted Image

These two players' W/L ratios are very close and both very high, so Nightbird's system will rate them at the same PSR.

Reading between the lines of the stats, Gurney Hallack is playing far outside their skill zone. My guess is they move towards the enemy by themselves or just charge the enemy line every time they see it. Or it could be an assault being left behind by their team, or they have very little heat management and blow themselves up when the fighting starts. It's hard to tell how they play.

One thing is clear: Gurney Hallack doesn't have the same impact on the team's outcome as Ghogiel does, but Nightbird's system will still rate them at the same PSR and treat them as equally skilled players.


What page is Nightbird's system on? I have mainly been vouching for the system proposed by Jay Z. I will say, though, it is a surprise that an outlier such as that exists. My only guess is that they are still in tier 5 and are being boosted by playing GQ with friends, since on Jarl's their average W/L before the change is much lower than this.

#605 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • 7,518 posts

Posted 14 June 2020 - 01:57 PM

View PostZerex, on 14 June 2020 - 01:50 PM, said:



Ok to use the example you have given, season 47, mediums arranged by W/L

Posted Image

These two players' W/L ratios are very close and both very high, so Nightbird's system will rate them at the same PSR.

Reading between the lines of the stats, Gurney Hallack is playing far outside their skill zone. My guess is they move towards the enemy by themselves or just charge the enemy line every time they see it. Or it could be an assault being left behind by their team, or they have very little heat management and blow themselves up when the fighting starts. It's hard to tell how they play.

One thing is clear: Gurney Hallack doesn't have the same impact on the team's outcome as Ghogiel does, but Nightbird's system will still rate them at the same PSR and treat them as equally skilled players.


WLR isn't 100% accurate, it is 37% accurate. avg MS is 11% accurate. PGI's Tier/Jay Z's system is 5% accurate.

Edited by Nightbird, 14 June 2020 - 01:58 PM.


#606 Zerex

    Member

  • PipPipPipPipPipPip
  • The Tip of the Spear
  • 298 posts
  • LocationUK

Posted 14 June 2020 - 01:59 PM

View PostD U N E, on 14 June 2020 - 01:51 PM, said:


So with this said, what are your opinions on the Jay Z system?



As I have said on Reddit, Jay Z's system will work. It will have flaws, with false positives being a problem; if those false positives and other flaws account for 1% of the input data, then that is amazing.

If it's closer to 20%, a fifth of the input data is corrupt or misleading.

#607 D U N E

    Member

  • PipPipPipPipPip
  • Bad Company
  • 131 posts

Posted 14 June 2020 - 02:05 PM

Fair enough then. I thought you were arguing that the system should not be in place, so I must have taken your point out of context.

#608 Zerex

    Member

  • PipPipPipPipPipPip
  • The Tip of the Spear
  • 298 posts
  • LocationUK

Posted 14 June 2020 - 02:05 PM

View PostD U N E, on 14 June 2020 - 01:56 PM, said:

What page is Nightbird's system on?


https://mwomercs.com...mmunity-psr-mm/

#609 D U N E

    Member

  • PipPipPipPipPip
  • Bad Company
  • 131 posts

Posted 14 June 2020 - 02:11 PM

Oh yeah, that won't work. It's like communism: in a perfect world it would be a perfect system, but the truth is this is a flawed environment and therefore needs a system that limits its own flaws.

#610 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • 7,518 posts

Posted 14 June 2020 - 02:15 PM

View PostD U N E, on 14 June 2020 - 02:11 PM, said:

Oh yeah, that won't work. It's like communism: in a perfect world it would be a perfect system, but the truth is this is a flawed environment and therefore needs a system that limits its own flaws.


LOL It's the same system that decides if the medicine you take works or not, so next time remember to trash the communist medicine :)

#611 East Indy

    Member

  • PipPipPipPipPipPipPipPip
  • The Hammer
  • 1,240 posts
  • LocationPacifica Training School, waiting for BakPhar shares to rise

Posted 14 June 2020 - 02:16 PM

View PostNightbird, on 14 June 2020 - 11:51 AM, said:


What's more common, players with hundreds of games and 10 WLR not being close in skill, or players with hundreds of games and the same avgMS and not being close in skill?

https://leaderboard....&d=DESC&page=10

One page with a 5 point difference in avgMS and WLR from 1.2 to 6. avgMS sucks.

6.00 WLR was from somebody who played 14 games. Small sample sizes are what stink.

Win-loss variances just as easily suggest 1) stats have never differentiated between solo and group queue, 2) PSR has provided inaccurate matchmaking data for years, and 3) all but the very best players in solo queue are at the mercy of a coin toss because teams are so poorly balanced. There are too many factors for your conclusion not to possibly have it backwards.

That said, in that slice you have ~30% outliers — most of them attached to very well-known names — and the rest center pretty well on 1.5 WLR. And here's the thing: we also have observation to supplement data theories. Do you play on a different account? If not, it's been awhile. Before each match these days, I look people up on Jarl's List. I also see a lot of these players over and over again through a session. Players with matchscores averaging 370ish over the course of an average of 3,000 games are not going to have bad games very often. They're just not going to. Contrariwise, players around 200-275 will be inconsistent and in bunches result in paralyzed teams; while anything lower corresponds to frequently poor performance. WLR theories rely on an invisible hand. Matchscore, with a few hundred matches recorded, doesn't lie.

Edited by East Indy, 14 June 2020 - 02:17 PM.


#612 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • 7,518 posts

Posted 14 June 2020 - 02:23 PM

View PostEast Indy, on 14 June 2020 - 02:16 PM, said:

6.00 WLR was from somebody who played 14 games. Small sample sizes are what stink.

Win-loss variances just as easily suggest 1) stats have never differentiated between solo and group queue, 2) PSR has provided inaccurate matchmaking data for years, and 3) all but the very best players in solo queue are at the mercy of a coin toss because teams are so poorly balanced. There are too many factors for your conclusion not to possibly have it backwards.

That said, in that slice you have ~30% outliers — most of them attached to very well-known names — and the rest center pretty well on 1.5 WLR. And here's the thing: we also have observation to supplement data theories. Do you play on a different account? If not, it's been awhile. Before each match these days, I look people up on Jarl's List. I also see a lot of these players over and over again through a session. Players with matchscores averaging 370ish over the course of an average of 3,000 games are not going to have bad games very often. They're just not going to. Contrariwise, players around 200-275 will be inconsistent and in bunches result in paralyzed teams; while anything lower corresponds to frequently poor performance. WLR theories rely on an invisible hand. Matchscore, with a few hundred matches recorded, doesn't lie.


I was replying to someone who uses examples with 12 matches. Take out the one outlier and you're still looking at a WLR range of 1.2 to 3 for people with the same avg MS.

Again, avg MS does predict skill; it does so with 11% accuracy (an R-squared of 0.11), while WLR does so with 37%. If you don't believe me, use Jarl's data: graph past-season avg MS and past-season WLR against future-season WLR, then run an R-squared analysis.

Feelings lie, a calculator doesn't.
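For anyone who wants to repeat that check, a minimal sketch of the R-squared calculation for a single predictor follows. It assumes you have already exported two equal-length lists per player, for example past-season WLR and future-season WLR; the function name is hypothetical, and no data from Jarl's List is included here. For a straight-line fit with one predictor, R-squared equals the squared Pearson correlation, which is what this computes.

```python
def r_squared(predictor, outcome):
    """R-squared of a simple linear fit of `outcome` on `predictor`.

    With one predictor this equals the squared Pearson correlation.
    """
    n = len(predictor)
    if n != len(outcome) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    mean_x = sum(predictor) / n
    mean_y = sum(outcome) / n

    # Sums of squared deviations and of cross-deviations.
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    syy = sum((y - mean_y) ** 2 for y in outcome)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome))

    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined R-squared")
    return (sxy * sxy) / (sxx * syy)
```

Running it once with past avg MS as the predictor and once with past WLR, both against future WLR, gives the two numbers being compared in this thread.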

Edited by Nightbird, 14 June 2020 - 02:24 PM.


#613 D U N E

    Member

  • PipPipPipPipPip
  • Bad Company
  • 131 posts

Posted 14 June 2020 - 02:38 PM

View PostNightbird, on 14 June 2020 - 02:15 PM, said:

LOL It's the same system that decides if the medicine you take works or not, so next time remember to trash the communist medicine :)


But with medicine it is easy to see if it works: it either works or it does not.
A player being good and sorted into the correct tier is different. There are levels of being good, and with vastly different skill levels competing against each other, it's much harder to see if it will work. From my quick glance, it is just W/L.

So for argument's sake, we have T1-5 players. T1 players have, in this case, only been fighting T1 players, and have a perfect 1-1 W/L ratio.
T5 players have only been fighting T5 players and have a 1-1 W/L ratio.

So a T1 is worth the same as a T5 in W/L.

That means an entire team of T5 can face an entire team of T1. So all the T5 are now at 1-2 W/L. The system sorts this out, and in the next battle there are 6 T1 and 6 T5 on each team. Some T1 and some T5 go down, and some go up.

This means a T1 player can have a 2-1, 1-1 or 1-2 ratio, and a T5 can have a 1-1, 1-2 or 1-3 W/L ratio. Now an argument could be made that "T1 players at a 2-1 ratio are better, the players at 1-1 are decent and those at 1-2 are bad", but that ignores how the T5 players now at 1-3 are extremely bad. So the system continues for a while and eventually it gets this sorted out, raising or lowering you until you are in the correct tier.

Except it doesn't, because there will not always be the same proportion of good and bad players in a match. Some matches will have only one T1 player and the rest T5, or 11 T1 with one T5, and so on. This means that a player's W/L or PSR is completely luck dependent. What if there is a group on the other side? Well, now you go down, but that's OK: since you are now in a lower skill bracket you can go back up. But it also means you are now classed as a less skilled player, so you become an anomaly. Anomalies like this slowly break the system, allowing some worse players to be classed as better and better players to be classed as worse, because the system counted 6 T1/6 T5 on team A and 6 T1/6 T5 on team B when in actuality there were 8 T1/4 T5 on team A.

Because skilled players don't play 24/7, the skill level fluctuates all the time. Solely using W/L ratio wouldn't work. It's not like we are saying "good or bad"; we are looking at variations of good.

Once again, going back to medication: there is a set parameter whose success may fluctuate depending on the person. They refine the medication until it works for the most people, then choose whichever has the most success for the most people.
The parameter for a medication working or not changes gradually, and its competition also only changes gradually.

A matchmaker's aim should be to get the W/L of players to 1/1, not for players to stomp the competition (which is what medication does: its aim is to stomp its competition).

#614 OneTeamPlayer

    Member

  • PipPipPipPipPipPip
  • Ace Of Spades
  • 399 posts

Posted 14 June 2020 - 02:41 PM

I have a guaranteed solution that is 100% guaranteed to simplify the debate on how to correct for groups "carrying" a player and make PSR much simpler to implement.


Remove groups from the solo queue and put them back in their own 8v8 queue.

I am more than happy to authorize anyone to immediately begin using this idea, with no credit needed back to me.

This whole conversation is needlessly complicated because something that wasn't completely broken (solo queue) was needlessly fixed to cater to a small slice of the community (at max 15%) to the detriment of both solo and group queue to the point where we're having the conversation we're having now and most people are not pleased with the system at all.

Separated queues did not need to be "repaired". Please unfix them and magically watch this self-inflicted problem dissipate significantly.

Were there stomps occasionally in old solo? Yeah, was it enough that it was a topic on everyone's tongue? No, not even close.

Is it 100% easier to balance a team around single players rather than around one team of 4 with 8 random individuals? Most decidedly so.


Merged queue was a mistake, it's causing nothing but problems for next to no benefit, let it go and start from the original separated queues that mostly worked (specifically after 8v8 group queue which players loved).

#615 yrrot

    Member

  • PipPipPipPipPipPip
  • Ace Of Spades
  • 222 posts

Posted 14 June 2020 - 02:52 PM

*Current* MS, specifically, is only 11% accurate. Without removing Win/Loss bonus/penalty from current MS stats, they aren't really independent variables. A great player on the losing team would be predicted to be more likely to lose the next match if going directly by W/L. Since the match maker doesn't build teams to have even PSR and has to balance weight classes, you can't assume an even distribution of W/L at equal skill. W/L is only a better predictor right now because the system is broken.

Rather than trying to make W/L approach 1 as you would for a typical matchmaker, you want to make more balanced games. You can hit W/L = 1 by being on opposite sides of stomps, neither of which is fun. But you can have fun losing a close match.

If you wanted to make MS more reliable, you could look at the avg MS (without the win/loss bonuses) for winning a match and for losing a match. Someone who performs better than the respective average is more likely to do so in future matches. The closer people are in those metrics, the more likely you'd see a balanced match.
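One possible reading of that idea, as a minimal sketch: compare each of a player's bonus-free match scores against the population average for the same result (win or loss), then average the differences. The function name, the input shape, and the two baseline averages are all hypothetical; the real averages would have to come from match data that is not available in this thread.

```python
def relative_performance(results, avg_ms_on_win, avg_ms_on_loss):
    """Mean difference between a player's score and the result-specific average.

    `results` is an iterable of (match_score_without_bonus, won) pairs.
    A positive return value means the player tends to beat the average
    for their result; a negative value means they tend to fall short.
    """
    differences = []
    for score, won in results:
        # Pick the baseline that matches the outcome of this match.
        baseline = avg_ms_on_win if won else avg_ms_on_loss
        differences.append(score - baseline)

    if not differences:
        raise ValueError("need at least one match result")
    return sum(differences) / len(differences)
```

A matchmaker built on this would try to pair players with similar values, rather than similar raw MS or raw W/L.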

#616 Zerex

    Member

  • PipPipPipPipPipPip
  • The Tip of the Spear
  • 298 posts
  • LocationUK

Posted 14 June 2020 - 02:53 PM

View PostD U N E, on 14 June 2020 - 02:05 PM, said:

Fair enough then. I thought you were arguing that the system should not be in place, so I must have taken your point out of context.


So I halfheartedly put forward a system 9 days ago on Reddit. It aims to measure each player in a match on their personal performance in that match, meaning averages don't skew the data pool.

As each match differs from the last or the next in game mode, tonnage and map, you need a system that zeroes out all these factors.

Each player's skill level is measured game after game after game by their ranking among the 24 players, based on MS.

MS needs some reworking to cut down on AMS and LRM MS farming, and other bits need tweaking as well, like maybe increasing the win MS to keep winning a positive part of the match and the system.

The only real feedback I got on this system is that it punishes light pilots and favors assaults. I disagree, but I am open to the idea of using Jarl's List class multipliers to "balance" MS across the weight classes before the players are ranked 1-24.

As an example, all players would start at 1000 PSR, with a capped minimum of 0 PSR and a maximum of 2000 PSR.

To form the bell curve for the player base tiers, the population would be broken up into the 5 tiers, and each tier would have a floating PSR cutoff based on its percentage size.

For example, with 20,000 players:

Tier 1 = top 15% = 3,000 players

Tier 2 = 20% = 4,000 players

Tier 3 = 30% = 6,000 players

Tier 4 = 20% = 4,000 players

Tier 5 = bottom 15% = 3,000 players
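The floating tier split above can be sketched roughly as follows. This is only an illustration using the example percentages (15/20/30/20/15); the function name and the dictionary input are assumptions, and ties in PSR are broken by input order, which a real system would need to handle more deliberately.

```python
# Example shares from the post, as (tier, percent of population).
TIER_PERCENTS = [(1, 15), (2, 20), (3, 30), (4, 20), (5, 15)]


def assign_tiers(psr_by_player):
    """Map each player to a tier by ranking everyone on PSR, highest first."""
    ranked = sorted(psr_by_player, key=psr_by_player.get, reverse=True)
    total = len(ranked)

    tiers = {}
    start = 0
    cumulative_percent = 0
    for tier, percent in TIER_PERCENTS:
        cumulative_percent += percent
        # Integer arithmetic avoids float rounding; the last tier takes
        # whatever remains so every player is assigned.
        end = total if tier == TIER_PERCENTS[-1][0] else cumulative_percent * total // 100
        for player in ranked[start:end]:
            tiers[player] = tier
        start = end
    return tiers
```

Because the cutoffs are recomputed from the ranked population, the PSR value that separates two tiers moves as the player base moves, which is the "floating marker" behaviour described above.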

PSR gain based on match data

Rank 1 = +12 PSR

Rank 2 = +10 PSR

Rank 3 = +8 PSR

Rank 4 = +6 PSR

Rank 5 = +4 PSR

Rank 6 = +2 PSR

Rank 7-18 = 0 PSR

Rank 19 = -2 PSR

Rank 20 = -4 PSR

Rank 21 = -6 PSR

Rank 22 = -8 PSR

Rank 23 = -10 PSR

Rank 24 = -12 PSR

NOTE: all numbers are just examples; this is an outline of the system, not a working set of variables.
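A minimal sketch of the rank table above, using the example values only (start 1000, floor 0, cap 2000, plus or minus 2 PSR per rank step outside the dead zone). The function and constant names are made up for illustration, and the numbers are not tuned.

```python
MIN_PSR = 0
MAX_PSR = 2000
START_PSR = 1000


def psr_delta(rank: int) -> int:
    """PSR change for finishing at `rank` (1 = best, 24 = worst) in a match."""
    if not 1 <= rank <= 24:
        raise ValueError("rank must be between 1 and 24")
    if rank <= 6:
        return 2 * (7 - rank)      # rank 1 gives +12, rank 6 gives +2
    if rank >= 19:
        return -2 * (rank - 18)    # rank 19 gives -2, rank 24 gives -12
    return 0                       # ranks 7-18 are the dead zone


def apply_match(psr: int, rank: int) -> int:
    """Return the new PSR after one match, clamped to the allowed range."""
    return max(MIN_PSR, min(MAX_PSR, psr + psr_delta(rank)))
```

The gains and losses across all 24 ranks sum to zero, so with these example values a match does not add or remove PSR from the population overall, apart from clamping at the floor and cap.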

The idea is that if you perform well in a match you move up, and if you perform poorly you move down. While winning makes up a part of the MS as an incentive to win, it isn't a direct modifier to the PSR gain/loss.

This system should work fine across all game modes (if match scoring is worked out fairly), and in matches where tonnage was low or the match finished early due to objective win conditions.

The reason I have a dead zone in the middle is that you can't have everyone pulling in both directions away from the baseline PSR, or else you get the problems seen in Xiphias's graph. You also want players to plateau at their skill level/PSR and stay there unless their skill and performance increase or decrease.

TL;DR

Perform well = +PSR

Perform poorly = -PSR

Perform as predicted by PSR/MM = no PSR change

Edited by Zerex, 14 June 2020 - 02:56 PM.


#617 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • 7,518 posts

Posted 14 June 2020 - 03:09 PM

View Postyrrot, on 14 June 2020 - 02:52 PM, said:

*Current* MS, specifically, is only 11% accurate. Without removing Win/Loss bonus/penalty from current MS stats, they aren't really independent variables. A great player on the losing team would be predicted to be more likely to lose the next match if going directly by W/L. Since the match maker doesn't build teams to have even PSR and has to balance weight classes, you can't assume an even distribution of W/L at equal skill. W/L is only a better predictor right now because the system is broken.

Rather than trying to make W/L approach 1 as you would for a typical matchmaker, you want to make more balanced games. You can hit W/L = 1 by being on opposite sides of stomps, neither of which is fun. But you can have fun losing a close match.

If you wanted to make MS more reliable, you could look at the avg MS (without the win/loss bonuses) for winning a match and for losing a match. Someone who performs better than the respective average is more likely to do so in future matches. The closer people are in those metrics, the more likely you'd see a balanced match.


Average MS is 11% accurate from the R-Squared if you're quoting from me, and this number you can calculate directly from Jarl's data.

The *current MS* which feeds into the PSR is actually less than 5% accurate, same with Jay Z's system. I've been trying hard to explain this so you'll find different attempts posted all over this thread.

Here's a new way:

A lot of players have reached the max of Tier 1 (PSR score = 5000). Some have WLR = 3 and avgMS = 400, some have WLR = 1.1 and avgMS = 300. The matchmaker sees all Tier 1 players as equal (it is blind), and therefore it makes bad matches.

With Jay Z's system, a lot of players will still reach the max of Tier 1, and a lot of players will be sent to the bottom of Tier 5. Within those clumps, the MM also will not be able to tell them apart. No avgMS, no WLR.

Edited by Nightbird, 14 June 2020 - 03:10 PM.


#618 yrrot

    Member

  • PipPipPipPipPipPip
  • Ace Of Spades
  • 222 posts

Posted 14 June 2020 - 03:46 PM

To clarify, I mean the current average MS used to come up with 11% accuracy from R-squared is flawed because the data going in is bad. You'd have to remove the win bias and look at winning MS average and losing MS averages if you wanted marginally better data.

Edited by yrrot, 14 June 2020 - 03:46 PM.


#619 Zerex

    Member

  • PipPipPipPipPipPip
  • The Tip of the Spear
  • 298 posts
  • LocationUK

Posted 14 June 2020 - 03:59 PM

View PostNightbird, on 14 June 2020 - 03:09 PM, said:


Average MS is 11% accurate from the R-Squared if you're quoting from me, and this number you can calculate directly from Jarl's data.

The *current MS* which feeds into the PSR is actually less than 5% accurate, same with Jay Z's system. I've been trying hard to explain this so you'll find different attempts posted all over this thread.



OK, I just had a look at R-squared. I have no idea where or how to even start working out my system using that. Is there any chance you could work it out?

#620 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • 7,518 posts

Posted 14 June 2020 - 04:03 PM

View Postyrrot, on 14 June 2020 - 03:46 PM, said:

To clarify, I mean the current average MS used to come up with 11% accuracy from R-squared is flawed because the data going in is bad. You'd have to remove the win bias and look at winning MS average and losing MS averages if you wanted marginally better data.


You can easily remove win bias from the data. You have the wins and losses. Take (avgMS * (W + L) - winMS*W - lossMS*L)/(W+L).
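A minimal sketch of that formula, assuming winMS and lossMS are the flat match score awarded per win and per loss. Those bonus values are not stated in this thread, so they are plain parameters here, and the function name is made up for illustration.

```python
def avg_ms_without_result_bonus(avg_ms, wins, losses, win_ms, loss_ms):
    """Average match score with the per-win and per-loss bonuses stripped out.

    Implements (avgMS * (W + L) - winMS * W - lossMS * L) / (W + L).
    """
    games = wins + losses
    if games <= 0:
        raise ValueError("need at least one recorded match")
    total_score = avg_ms * games                      # undo the averaging
    total_bonus = win_ms * wins + loss_ms * losses    # score owed purely to results
    return (total_score - total_bonus) / games
```

Algebraically this is the same as subtracting the average bonus per game from avgMS, so two players with the same avgMS but different W/L end up with different bonus-free averages.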

View PostZerex, on 14 June 2020 - 03:59 PM, said:

OK, I just had a look at R-squared. I have no idea where or how to even start working out my system using that. Is there any chance you could work it out?


I mean I calculated it to be .11 and .37 for avgMS and WLR respectively. If you want to check my math you have to do it on your end?

Edited by Nightbird, 14 June 2020 - 04:03 PM.





