
Showdown: Nightbird Vs Jayz Vs Cluster Fox - A Psr Comparison

Balance Gameplay General

87 replies to this topic

#1 Cluster Fox

    Member

  • PipPipPipPipPip
  • 104 posts
  • LocationStuck on a rock in Grim Plexus

Posted 08 August 2020 - 10:54 AM

So, taking up Nightbird's challenge, I made a simulator to test my PSR fixes and to validate NB's and JayZ's.

I built the simulator using the same logic as NightBird, to try and emulate all four different systems and to provide common ground for comparison. The simulator was adjusted and checked for accuracy against real player data. The objective was to pit the models against one another.



Conclusion/Findings (TL;DR)
  • Stability is required.
  • JayZ's values are inadequate. Stability will not fix it.
  • There is no free lunch (aka perfect solution)
  • Both NB's system and CFox tweaks are good solutions.
  • Cluster Fox with X,Y,C tuning, stability and seeding gives a good PSR representation of skill. This builds on the system already in place.
  • Nightbird's system is another viable option but has some shortcomings as well. With no moving average, PSR would need to be reset every season, otherwise PSR will not evolve with skill. A moving average is a compromise. Seeding is worth exploring.



What are Cluster Fox's PSR fixes?
It's what we got now, but tuned up and stabilized.
  • Step 1. Change the X,Y,C values in the current system (jayZ).
  • Step 2. Add a simple formula using the player's PSR to stabilize it.
  • Step 3. Remove maximum PSR cap.
  • Step 4. Start new players at PSR=500.
  • No PSR reset.
No need to fetch player database for WLR - only PSR and match results are used.

Details are in my original post, Tuning up JayZ and stabilizing it, and in Google Docs.

My propositions
This simulation shows that they have to be done together.
Posted Image

Posted Image



The systems simulated (for now)
  • Cluster Fox - My X,Y,C values in JayZ's system, with PSR stability
  • Cluster Fox + Seed - Same as above, but seeding Afactors and Pfactors
  • NightBird's (no MA) - His original WLR suggestion for the MM
  • NightBird's (100 and 300 MA) - Same with a moving average - I'm using NB's own formula for the Moving average
  • JayZ - The system currently implemented, with X,Y,C values initially suggested.
  • JayZ + Stability - The X,Y,C initially suggested, with Cluster Fox's PSR stability

    NOTE: JayZ explicitly mentioned the X,Y,C should be tuned as more data becomes available. This is more data
References at the end of this post.

In progress : X = 22, Y = 0, C = 1 with stability and seeding.

Building the simulator
Spoiler




Results
Let's skip to the interesting stuff.

Comparison 1 : W-L distribution
If the W-L distribution is narrow, the MM created more balanced matches. Players tend to get a WLR closer to 1.0 and both teams have a more equal chance of winning a game.
Posted Image
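
A minimal sketch of how "narrow" could be measured, assuming the spread is taken as the standard deviation of per-player WLR. The function names and the sample records are made up for illustration; they are not simulator output.

```python
from statistics import pstdev


def wlr(wins, losses):
    """Win/loss ratio. Zero losses are treated as one loss to avoid dividing by zero."""
    return wins / max(losses, 1)


def wlr_spread(records):
    """Population standard deviation of per-player WLR. Smaller means narrower."""
    return pstdev([wlr(w, l) for w, l in records])


# Hypothetical (wins, losses) records for two matchmakers.
balanced = [(52, 48), (49, 51), (50, 50), (47, 53)]
lopsided = [(80, 20), (30, 70), (65, 35), (25, 75)]

print(f"balanced spread: {wlr_spread(balanced):.3f}")
print(f"lopsided spread: {wlr_spread(lopsided):.3f}")
```

The matchmaker with the smaller spread kept players closer to 1.0 WLR, which is the property the graph compares.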

After 200k games:
NightBird's system (no MA), gives the best distribution of players in their W-L brackets.
Cluster Fox + Seeding is the second best system for W-L distribution.
NB (300 MA) is slightly worse than C Fox (Seed) - omitted for clarity.
Cluster Fox (no seed) and NightBird's system (100 MA) are comparable.
JayZ is by far the worst W-L distribution.
JayZ + Stability gives about the same results - omitted for clarity.

What does it mean?
In simple terms : The more you factor W-L in PSR, the closer the matches you get.
Also, once a moving average is included, PSR accuracy depends on noise and response time.



Comparison 2 : Team %W-L chance distribution per games
Ideally you want teams with 50%-50% win chance. An ideal MM would be able to create this every single game. This is not possible, so as close to 50-50 as possible is best.

All 4 base systems:
Posted Image

Again, NB (no MA) and C Fox (Seed) are best and this time, comparable. NB (no MA) is slightly better.

Effect of NightBird's Moving Average:
Posted Image
Including a moving average in NB's system has a negative effect. It makes everyone tend heavily towards 1.0 WLR at the start and, as such, suffers from either noise or slow PSR response time. NB (100 MA) suffers from W-L noise. NB (300 MA) suffers mostly from slow response time (it would become more accurate with more games played).
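
The response-time side of that trade-off can be sketched as below. This is not NB's actual moving-average formula, which isn't reproduced in this post; it assumes the simplest form, a WLR computed over the last N games only. The class name and the win/loss sequence are hypothetical.

```python
from collections import deque


class MovingWLR:
    """WLR over the last `window` games only (assumed form of a moving average)."""

    def __init__(self, window):
        self.results = deque(maxlen=window)  # True = win, False = loss

    def record(self, won):
        self.results.append(won)

    def value(self):
        wins = sum(self.results)
        losses = len(self.results) - wins
        return wins / max(losses, 1)  # avoid dividing by zero


def play(tracker):
    # 300 games at an even 1.0 WLR, then the player improves to 2 wins per loss.
    for game in range(300):
        tracker.record(game % 2 == 0)
    for game in range(100):
        tracker.record(game % 3 != 2)
    return tracker.value()


short_ma = play(MovingWLR(100))
long_ma = play(MovingWLR(300))
print(f"100 MA: {short_ma:.2f}   300 MA: {long_ma:.2f}")
```

After the 100 improved games, the 100-game window reflects only the new form, while the 300-game window still carries 200 older games and lags behind. The flip side, not shown here, is that the shorter window swings more on streaks.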

Effect of Stability and seeding:
Posted Image

This one is interesting. Stabilizing JayZ has very little effect. The reason is that 200k games are not enough to get the full divergence effect towards T1 and T5. This is visible later on in the PSR graphs.

Also visible is that seeding the C Fox system drastically improves accuracy, as this noise / response time trade-off over 50 games pays off.

What does it mean?
Again, W-L is important, and seeding is also important with any kind of moving average to balance response time and noise.



Comparison 3 : PSR to skill. AFTER 200k games
Here, Skill is a measure of the natural trendline for player skill progression. To see how skill is estimated, please check out my other post. It's a mix of AvgMS and W-L.

A perfect PSR would display player distribution on a single PSR to Skill line and on a single PSR to W-L line. This is not possible, so getting as close as possible is best.
Posted Image
JayZ's system as initially suggested does not reflect skill, nor W-L.
AvgMS is too heavily weighted. Player movement is visible as they move away from 2500 PSR at different rates.

NIGHTBIRD'S OPTION

Posted Image
Nightbird's option is calculated differently, so the ideal W-L is a different line, but it is a perfect match. However, player skill is not as accurately represented.
The MM I simulated is imperfect and could not prevent a 33/1 WLR; this player's PSR got as high as 82000!
Posted Image
Adding moving averages to NB's system is a compromise. Skill is better represented but W-L suffers from noise and response time.

ADDING CLUSTER FOX STABILITY TO JAY Z's SYSTEM
Posted Image
Adding CFox stability has a positive effect on player movement, effectively keeping some from T1 and T5. However, the effect on W-L is minimal because of the X,Y,C values.

ADDING CLUSTER FOX X,Y,C TUNING TO THE ABOVE
Posted Image
Changing the X,Y,C has a tremendous impact on the PSR reflection of both skill and W-L. Now the biggest effect is noise and slow player movement.

FINAL CLUSTER FOX PROPOSITION
ADDING CLUSTER FOX SEEDING
Posted Image
Here, the effect of seeding is most visible on W-L. The initial fast movement followed by stability with the seeding has the best combined effect.

Errata : PSR vs Skill graphs are showing an X scale of 200 to 2000. The ideal line should intersect at (0,0) instead of (200,0). This is not significant enough to warrant changing all the graphs.


Conclusion
  • JayZ's system, used as is, is inadequate. Stability will not fix it.
  • There is no free lunch (aka perfect solution)
  • Both NB's system and CFox tweaks are good solutions.
  • Cluster Fox with X,Y,C tuning, stability and seeding gives a good PSR representation of skill. This builds on the system already in place.
  • Nightbird's system is another viable option but has some shortcomings as well. With no moving average, PSR would need to be reset every season, otherwise PSR will not evolve with skill. A moving average is a compromise. Seeding is worth exploring.



So, What PSR will I get under the different Systems?
Check it out here : Cluster Fox's PSR prediction



References
https://mwomercs.com/forums/topic/277351-tuning-up-jay-zs-psr-system-and-stabilizing-it/

https://mwomercs.com...cy-with-graphs/

Edited by Cluster Fox, 24 October 2020 - 01:10 PM.


#2 Gagis

    Member

  • PipPipPipPipPipPipPipPip
  • FP Veteran - Beta 1
  • 1,731 posts

Posted 08 August 2020 - 11:04 AM

I am impressed by this.

#3 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • 7,518 posts

Posted 08 August 2020 - 11:18 AM

I'd only like to add that WLR is a very bad system. I only proposed it because it is the easiest to understand. Better systems require harder words that even fewer people will be able to understand.

A case of let pros do what pros do instead of sourcing novices.

Thanks for the hard work in doing all this Cluster Fox.

#4 Cluster Fox

    Member

  • PipPipPipPipPip
  • 104 posts
  • LocationStuck on a rock in Grim Plexus

Posted 08 August 2020 - 11:26 AM

View PostNightbird, on 08 August 2020 - 11:18 AM, said:

I'd only like to add that WLR is a very bad system. I only proposed it because it is the easiest to understand. Better systems require harder words that even fewer people will be able to understand.

A case of let pros do what pros do instead of sourcing novices.

Thanks for the hard work in doing all this Cluster Fox.

Simplicity is definitely a strength of your WLR system. One variable! The reason I'm simulating X = 22, Y = 0, C = 1 right now is that it's basically your WLR system expressed in (W-L) / games. Big advantage of being bound between -1 and 1.
I'm curious to see what comes out.
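
To show why the (W-L) / games form is bounded while plain WLR is not, here is a small sketch. The function names are made up for illustration, and how the value feeds into the X,Y,C formula is not shown here.

```python
def wlr(wins, losses):
    """Plain win/loss ratio. Unbounded above. Zero losses are treated as one loss."""
    return wins / max(losses, 1)


def net_win_rate(wins, losses):
    """(W - L) / games. Always between -1 (all losses) and +1 (all wins)."""
    games = wins + losses
    return 0.0 if games == 0 else (wins - losses) / games


# The 33/1 outlier from the simulation, plus an even record and two extremes.
for wins, losses in [(33, 1), (50, 50), (0, 10), (10, 0)]:
    print(wins, losses, round(wlr(wins, losses), 2), round(net_win_rate(wins, losses), 3))
```

For a player with at least one loss, the two are linked by (W-L) / games = (WLR - 1) / (WLR + 1), so a 33/1 WLR maps to about 0.94 instead of 33.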

#5 Xiphias

    Member

  • PipPipPipPipPipPipPip
  • Littlest Helper
  • 862 posts

Posted 08 August 2020 - 01:43 PM

Great to see how these different approaches stack up using a more robust simulation. Thanks for putting in the hard work.

#6 Bowelhacker

    Member

  • PipPipPipPipPipPipPip
  • Hero of Marik
  • 922 posts
  • LocationKooken's Pleasure Pit

Posted 08 August 2020 - 03:25 PM

Is there a TL:DR version?

#7 John Bronco

    Member

  • PipPipPipPipPipPipPip
  • The Fighter
  • 966 posts

Posted 08 August 2020 - 03:28 PM

View PostBowelhacker, on 08 August 2020 - 03:25 PM, said:

Is there a TL:DR version?

It's literally at the bottom of the post.

#8 Cluster Fox

    Member

  • PipPipPipPipPip
  • 104 posts
  • LocationStuck on a rock in Grim Plexus

Posted 08 August 2020 - 03:34 PM

View PostBowelhacker, on 08 August 2020 - 03:25 PM, said:

Is there a TL:DR version?


I'll add it at the top too :)

#9 AizakG

    Member

  • PipPip
  • Little Helper
  • 29 posts

Posted 08 August 2020 - 03:34 PM

View PostBowelhacker, on 08 August 2020 - 03:25 PM, said:

Is there a TL:DR version?

yeah

#10 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • 7,518 posts

Posted 08 August 2020 - 03:56 PM

With 100 and 300 MA WLR, you can probably seed their historical WLR to avoid sluggish movement. Get the best of both worlds.
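
One way that seeding could look, as a rough sketch only: the thread does not say how the window would be seeded, so this assumes the window is pre-filled with synthetic results in the same win/loss proportion as the player's history. The function name and the example numbers are hypothetical.

```python
from collections import deque


def seeded_window(window, hist_wins, hist_losses):
    """Pre-fill a moving window so its WLR starts near the historical WLR."""
    results = deque(maxlen=window)  # True = win, False = loss
    total = hist_wins + hist_losses
    if total == 0:
        return results  # no history: start empty, as for a new player
    seed_wins = round(window * hist_wins / total)
    # Spread the wins evenly so seeded wins and losses age out at a steady rate.
    acc = 0
    for _ in range(window):
        acc += seed_wins
        if acc >= window:
            acc -= window
            results.append(True)
        else:
            results.append(False)
    return results


# Hypothetical player with a 600-400 history, seeded into a 100-game window.
w = seeded_window(100, 600, 400)
print(sum(w), len(w) - sum(w))
```

Real results then push the seeded ones out one game at a time, so the window starts from the historical WLR instead of from 1.0.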

Edited by Nightbird, 08 August 2020 - 03:56 PM.


#11 Cluster Fox

    Member

  • PipPipPipPipPip
  • 104 posts
  • LocationStuck on a rock in Grim Plexus

Posted 09 August 2020 - 03:32 PM

Nightbird, I agree this would improve the results, as would seeding my own formula with the calculated stable PSR. I started all models at 2500 PSR since this is what happened with the reset. It makes the MM 100% blind at simulation start. I consider this part of the controlled environment for comparison.

On a different note:

Update !

For those interested : You may predict your PSR value under the current model and mine !
  • Go to Jarl's list.
  • Check out your AvgMS and WLR.
  • Plug them into this spreadsheet:

Edited by Cluster Fox, 09 August 2020 - 03:38 PM.


#12 Stonefalcon

    Member

  • PipPipPipPipPipPipPipPip
  • The Messenger
  • 1,375 posts
  • LocationProselytizing in the name of Our Lord and Savior the Annihilator

Posted 09 August 2020 - 03:59 PM

If this system is so good, where was it three months ago when PGI approached the community for help?

#13 BTGbullseye

    Member

  • PipPipPipPipPipPipPipPip
  • The Solitary
  • 1,540 posts
  • LocationI'm still pissed about ATMs having a minimum range.

Posted 09 August 2020 - 04:06 PM

View PostCluster Fox, on 09 August 2020 - 03:32 PM, said:

Nightbird, I agree this would improve the results, as would seeding my own formula with the calculated stable PSR. I started all models at 2500 PSR since this is what happened with the reset. It makes the MM 100% blind at simulation start. I consider this part of the controlled environment for comparison.

On a different note:

Update !

For those interested : You may predict your PSR value under the current model and mine !
  • Go to Jarl's list.
  • Check out your AvgMS and WLR.
  • Plug them into this spreadsheet:



According to this, I'll be Tier 1 in only 150 more games! (that's pretty close to what I predicted when I hit T2) Considering my stats last month were putting me in the 75th percentile, that's not really a good thing though. I do like the predicted stable PSR rating putting me mid-T2 though.

Edited by BTGbullseye, 09 August 2020 - 04:09 PM.


#14 Cluster Fox

    Member

  • PipPipPipPipPip
  • 104 posts
  • LocationStuck on a rock in Grim Plexus

Posted 09 August 2020 - 04:58 PM

View PostStonefalcon, on 09 August 2020 - 03:59 PM, said:

If this system is so good, where was it three months ago when PGI approached the community for help?


It didn't exist.

I made all this as a way to improve the accepted solution from JayZ - easily. Started working on it the day it was announced. But these things take time.

However, Nightbird had done all the work prior to this whole thing happening and submitted a system that would work - not adopted because reasons.

Edited by Cluster Fox, 09 August 2020 - 05:09 PM.


#15 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • 7,518 posts

Posted 09 August 2020 - 07:10 PM

View PostStonefalcon, on 09 August 2020 - 03:59 PM, said:

If this system is so good, where was it three months ago when PGI approached the community for help?


PGI and this community are a match made in heaven. Both score full 10s in ability to stick not only their heads but torsos into holes in the ground. It's quite a sight to see.

#16 V O L T R O N

    Member

  • PipPipPipPipPipPip
  • The People's Hero
  • 318 posts
  • LocationThe Flat and Motionless Earth

Posted 09 August 2020 - 07:57 PM

“Today's scientists have substituted mathematics for experiments, and they wander off through equation after equation, and eventually build a structure which has no relation to reality. ” - TESLA

#17 Davegt27

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Ace Of Spades
  • 7,020 posts
  • LocationCO

Posted 09 August 2020 - 08:39 PM

neat calculator cluster fox, thanks

#18 Nightbird

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • The God of Death
  • 7,518 posts

Posted 10 August 2020 - 06:15 AM

View PostV O L T R O N, on 09 August 2020 - 07:57 PM, said:

“Today's scientists have substituted mathematics for experiments, and they wander off through equation after equation, and eventually build a structure which has no relation to reality. ” - TESLA


Thanks for your input [redacted].

Experiments are still designed using mathematics. People that test the first ideas that pop into their head are doomed to accomplish nothing in a lifetime of experiments.

Edited by Ekson Valdez, 10 August 2020 - 10:08 PM.


#19 BackShot

    Member

  • PipPipPip
  • Littlest Helper
  • 87 posts

Posted 10 August 2020 - 06:46 AM

This is interesting; however, after some simulations on your link, I am still a bit disappointed at the results.

It seems that with a 300 AMS a player needs a 1.80 W/L to be in the Tier 1 bracket.

With 350 AMS, the player needs 1.50 W/L.
With 400 AMS, 1.30 W/L is enough to be in the Tier 1 bracket.
At 500 AMS you can even end up with a negative W/L (0.95) and still be in the Tier 1 bracket.

Too much weight for match score farming, not enough for winning.

That being said, with group queuing in the soup padding W/L so much while not padding AMS, a heavily weighted W/L would be flawed too because of that.

Still, to pad your W/L you have to group up with good players, and those good players have to do the same.
So the guys ending with 50 or more W/L by grouping up would have been Tier 1 whatever the system.
4 mediocre players grouping up won't end up with a very good W/L but with a very bad one.

So I think it is less of an issue than match score padding.

#20 thievingmagpi

    Member

  • PipPipPipPipPipPipPipPip
  • 1,577 posts

Posted 10 August 2020 - 01:09 PM

View PostNightbird, on 09 August 2020 - 07:10 PM, said:

PGI and this community are a match made in heaven. Both score full 10s in ability to stick not only their heads but torsos into holes in the ground. It's quite a sight to see.


Weird.

You didn't answer the question.




