
Statistical Analysis Of The 12-0


187 replies to this topic

#21 Dimento Graven

    Member

  • Guillotine
  • 6,208 posts

Posted 30 January 2017 - 02:32 PM

Tarogato, on 30 January 2017 - 02:30 PM, said:

...

While a Locust can easily solo kill many assault mechs, you can't deny that having 800 tons is stronger than having 700 tons.

...
Nope, that's dependent on the makeup of that tonnage. 700 tons of Clan 'mechs would beat the living **** out of most compositions of 800 tons of IS 'mechs.

#22 razenWing

    Member

  • The Fearless
  • 1,694 posts

Posted 30 January 2017 - 02:32 PM

I know the conclusion that you are reaching for, but I ask that one more comparison be considered.

In closely contested games, by the same logic, the two teams' damage/kdr/wlr should be close to identical.

Also, statistical significance is not judged by mere percentage differences. I would say that, of all the stats you collected, based on experience, the GMAN rating is probably complete trash and outside of the statistical odds. It just goes to show that metamechs are way overrated.

Otherwise, the basis of your conclusion is simple... pilots with more wins/kdr/damage win out against pilots with fewer wins/kdr/damage.

Now, to get back to what I stated in the beginning. To truly draw the conclusion that you did, the control group I described is necessary, because it either reinforces the theory or debunks it.

12-0, 12-1, or even 12-2 games are fairly rare. So you are drawing a general conclusion based on a sample percentage already far less than others (though 1200 games is statistically significant, so no doubt about that). I just want to see more proof.

Will we see a trend, going from stomp (12-0, 12-1, 12-2) to somewhat competitive (12-3, 12-4, 12-5) to competitive (12-6 to 12-9) to nail biter (12-10 to 12-11), where pilot wlr/kdr/damage shifts more and more toward identical values?

I think if we can do that, then we can definitively prove that the PSR and matchmaking are indeed... broken. (But good start and good methodology for initial research, good job!)
-------------

Edit: Also keep in mind that a few weeks ago, Russ did confirm through Twitter that PGI changed matchmaking mechanics without a public announcement. How much will that affect your conclusion from this point forward? I don't know. But it's worth finding out.

Edited by razenWing, 30 January 2017 - 02:35 PM.


#23 Davegt27

    Member

  • Ace Of Spades
  • 6,970 posts
  • Location: CO

Posted 30 January 2017 - 02:37 PM

Need avg dmg output per match and avg kills per match

#24 xTrident

    Member

  • Bad Company
  • 655 posts
  • Location: Work or Home

Posted 30 January 2017 - 02:40 PM

Bud Crue, on 30 January 2017 - 01:58 PM, said:

Very cool.

I'm intrigued by the "Gman Tonnage" equations and aspect, particularly in that it seems obvious that tonnage is at best a secondary consideration, both empirically and based on your data (negative correlation), yet PGI by word and deed seems convinced that tonnage, more than almost anything else, is the most significant factor in determining balance and probability of wins (see group queue tonnage restrictions, see attempts at addressing population imbalance by tonnage differences in CW, etc.).

Your study brings this to a head and I can't help but wonder why they insist on believing that absolute tonnage is what seemingly matters most?


Bud, I'd have been the first, before this analysis, to say that a team with more tonnage (of course I'm talking fairly significantly more tonnage) would be the winning team. Because I've seen it happen again and again. But lately it seems tonnage hasn't been as important. I've been wrecked a few times in group queue against an 11 or 12 man unit, so I knew going into the match that my team was going to have a tonnage advantage. We still lost, and it wasn't that close, tonnage differences taken into consideration.

#25 Tarogato

    Member

  • Civil Servant
  • 6,557 posts
  • Location: USA

Posted 30 January 2017 - 02:41 PM

razenWing, on 30 January 2017 - 02:32 PM, said:

I know the conclusion that you are reaching for, but I ask that one more comparison be considered.

In closely contested games, by the same logic, the two teams' damage/kdr/wlr should be close to identical.

[...]
To truly draw the conclusion that you did, the control group I described is necessary, because it either reinforces the theory or debunks it.

12-0, 12-1, or even 12-2 games are fairly rare. So you are drawing a general conclusion based on a sample percentage already far less than others (though 1200 games is statistically significant, so no doubt about that). I just want to see more proof.

Will we see a trend, going from stomp (12-0, 12-1, 12-2) to somewhat competitive (12-3, 12-4, 12-5) to competitive (12-6 to 12-9) to nail biter (12-10 to 12-11), where pilot wlr/kdr/damage shifts more and more toward identical values?

I think if we can do that, then we can definitively prove that the PSR and matchmaking are indeed... broken. (But good start and good methodology for initial research, good job!)

I agree. I'd like to do that follow up. Though... I might have to request some crowd-sourcing for the screenshots and data entry. Entering in all those names and mechs gets old after a while (and I'm not sure I'd trust an OCR output, nor do I have such a resource myself).
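For whoever ends up doing the data entry, here is a minimal sketch of how that follow-up could be tabulated. It assumes each match is reduced to two numbers: the losing team's kill count and the winner-minus-loser gap in some averaged pilot stat. The function names, the record layout, and the sample rows are all made up for illustration; none of it is collected data.

```python
from collections import defaultdict
from statistics import mean

def bucket(loser_kills):
    """Map the losing team's kill count (0-11) to the categories proposed above."""
    if not 0 <= loser_kills <= 11:
        raise ValueError("loser kills must be between 0 and 11")
    if loser_kills <= 2:
        return "stomp"
    if loser_kills <= 5:
        return "somewhat competitive"
    if loser_kills <= 9:
        return "competitive"
    return "nail biter"

def mean_gap_by_bucket(matches):
    """matches: iterable of (loser_kills, winner_minus_loser_stat_gap) pairs."""
    gaps = defaultdict(list)
    for loser_kills, gap in matches:
        gaps[bucket(loser_kills)].append(gap)
    return {name: mean(values) for name, values in gaps.items()}

# Hypothetical placeholder rows, NOT real match data.
sample = [(0, 0.30), (2, 0.20), (4, 0.15), (8, 0.05), (11, 0.01)]
print(mean_gap_by_bucket(sample))
```

If the idea holds, the average gap should shrink steadily from the stomp bucket down to the nail biter bucket; a flat line across buckets would argue against it.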



Quote

Also, statistical significance is not judged by mere percentage differences. I would say that, of all the stats you collected, based on experience, the GMAN rating is probably complete trash and outside of the statistical odds. It just goes to show that metamechs are way overrated.

Oh I agree completely. Basing something off of Metamechs is immediately dubious. I would have only put stock into it if there was a very strong correlation to winning. But... it was rather inconclusive, so better to just ignore it for now and save it for later.

#26 Magnus Santini

    Member

  • The Tip of the Spear
  • 708 posts

Posted 30 January 2017 - 03:05 PM

The weird thing is I would have expected massacre games to have much higher advantages to the winning team in one or more of the categories you looked at. But I guess statistically it can work out. Mostly people talk about a player's error in positioning and his death causing a snowball. To me the stomps get caused when the team does not communicate enough to adjust for the individual mech role builds and talents that make up the team. If everybody wants to sit in defense, or push to another place, that is fine, but the team needs to do one or the other together. After the first five minutes it is too late to realize that half the team are LRMs or snipers and will not be moving up.

#27 Tarogato

    Member

  • Civil Servant
  • 6,557 posts
  • Location: USA

Posted 30 January 2017 - 03:12 PM

Magnus Santini, on 30 January 2017 - 03:05 PM, said:

The weird thing is I would have expected massacre games to have much higher advantages to the winning team in one or more of the categories you looked at. But I guess statistically it can work out. Mostly people talk about a player's error in positioning and his death causing a snowball. To me the stomps get caused when the team does not communicate enough to adjust for the individual mech role builds and talents that make up the team. If everybody wants to sit in defense, or push to another place, that is fine, but the team needs to do one or the other together. After the first five minutes it is too late to realize that half the team are LRMs or snipers and will not be moving up.


In my experience, in the vast, vast majority of solo queue matches there is zero communication. I disabled VOIP this week and haven't noticed a difference. ¯\_(ツ)_/¯

Also, it's definitely likely (more like, indisputable) that spurts of communication can throw statistical determination/predictability out of the window. =D

#28 Davegt27

    Member

  • Ace Of Spades
  • 6,970 posts
  • Location: CO

Posted 30 January 2017 - 03:20 PM

what you should do is strip the modules off some of your Mechs

and do a comparison of combat effectiveness

that is avg dmg and avg kill and time in battle

#29 Jman5

    Member

  • Littlest Helper
  • 4,914 posts

Posted 30 January 2017 - 03:35 PM

Great post. By the way, there seems to be a lot of discussion about the Clan vs IS numbers. As it happens last month I was also curious if there was any connection here. I went through about 75 games in the solo queue and looked at whether or not the winning team had more clan mechs.

The team with more clan mechs won 58% of the time.

Many of the upsets had some pretty obvious non-combat reasons such as objective victories, or disconnects on the losing team. Then there were some games where the team made a huge tactical blunder like cap rushing with part of the team and quickly being down half your team.

#30 Tarogato

    Member

  • Civil Servant
  • 6,557 posts
  • Location: USA

Posted 30 January 2017 - 03:35 PM

Davegt27, on 30 January 2017 - 03:20 PM, said:

what you should do is strip the modules off some of your Mechs

and do a comparison of combat effectiveness

that is avg dmg and avg kill and time in battle


I already did this, actually. Sorta. Maybe more extreme.

What I do is when I level my mechs, I always play 20-30 matches with no Basics and no modules, I just stockpile the XP. Then when I have enough XP, I spend it all at once and Elite the mech, and then play another 20-30 matches with Elites.

My findings were contradictory. Sometimes my mechs actually performed better with no skills unlocked, and the performance dropped when I Elited them. Other times it was the other way around. The only conclusion I could draw from that is that skills don't have as dramatic an effect as many people claim they do. But that's only true for me and my playstyle.

Edited by Tarogato, 30 January 2017 - 03:38 PM.


#31 Aramoro999

    Member

  • Bad Company
  • 214 posts

Posted 30 January 2017 - 03:40 PM

Tarogato, on 30 January 2017 - 01:43 PM, said:

Posted Image

Onion, what a **** player...
more surprised how well potato is doing

Edited by Aramoro999, 30 January 2017 - 03:40 PM.


#32 Davegt27

    Member

  • Ace Of Spades
  • 6,970 posts
  • Location: CO

Posted 30 January 2017 - 03:55 PM

Quote

I already did this, actually. Sorta. Maybe more extreme.

What I do is when I level my mechs, I always play 20-30 matches with no Basics and no modules, I just stockpile the XP. Then when I have enough XP, I spend it all at once and Elite the mech, and then play another 20-30 matches with Elites.

My findings were contradictory. Sometimes my mechs actually performed better with no skills unlocked, and the performance dropped when I Elited them. Other times it was the other way around. The only conclusion I could draw from that is that skills don't have as dramatic an effect as many people claim they do. But that's only true for me and my playstyle.


I noticed the same thing

#33 xTrident

    Member

  • Bad Company
  • 655 posts
  • Location: Work or Home

Posted 30 January 2017 - 04:06 PM

Jman5, on 30 January 2017 - 03:35 PM, said:

Great post. By the way, there seems to be a lot of discussion about the Clan vs IS numbers. As it happens last month I was also curious if there was any connection here. I went through about 75 games in the solo queue and looked at whether or not the winning team had more clan mechs.

The team with more clan mechs won 58% of the time.

Many of the upsets had some pretty obvious non-combat reasons such as objective victories, or disconnects on the losing team. Then there were some games where the team made a huge tactical blunder like cap rushing with part of the team and quickly being down half your team.


58%? That really doesn't seem like all that much more for Clans. Not when I see so many posts about them being so OP. 58% is nothing really.
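One way to put a number on that is an exact binomial test against a 50/50 coin. This is only a sketch under assumptions: "about 75 games" is taken as exactly 75, and since 58% of 75 is 43.5, both 43 and 44 wins are tried. The function name is made up for illustration.

```python
from math import comb

def binom_two_sided_p(wins, games):
    """Exact two-sided p-value for `wins` out of `games` against a fair 50/50 coin."""
    # At p = 0.5 the distribution is symmetric, so fold onto the upper tail.
    k = max(wins, games - wins)
    upper_tail = sum(comb(games, i) for i in range(k, games + 1)) / 2 ** games
    return min(1.0, 2 * upper_tail)

# Assumed roundings of "58% of about 75 games".
for wins in (43, 44):
    print(wins, "of 75:", round(binom_two_sided_p(wins, 75), 3))
```

With either rounding the p-value comes out above the usual 0.05 cutoff, so a 75-game sample on its own can't tell 58% apart from a coin flip. That doesn't mean Clans have no edge, only that this sample is too small to say either way.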

#34 Zookeeper Dan

    Member

  • Bad Company
  • 487 posts
  • Location: Beer City USA

Posted 30 January 2017 - 04:20 PM

razenWing, on 30 January 2017 - 02:32 PM, said:

I know the conclusion that you are reaching for, but I ask that one more comparison be considered.

In closely contested games, by the same logic, the two teams' damage/kdr/wlr should be close to identical.

Also, statistical significance is not judged by mere percentage differences. I would say that, of all the stats you collected, based on experience, the GMAN rating is probably complete trash and outside of the statistical odds. It just goes to show that metamechs are way overrated.

Otherwise, the basis of your conclusion is simple... pilots with more wins/kdr/damage win out against pilots with fewer wins/kdr/damage.

Now, to get back to what I stated in the beginning. To truly draw the conclusion that you did, the control group I described is necessary, because it either reinforces the theory or debunks it.

12-0, 12-1, or even 12-2 games are fairly rare. So you are drawing a general conclusion based on a sample percentage already far less than others (though 1200 games is statistically significant, so no doubt about that). I just want to see more proof.

Will we see a trend, going from stomp (12-0, 12-1, 12-2) to somewhat competitive (12-3, 12-4, 12-5) to competitive (12-6 to 12-9) to nail biter (12-10 to 12-11), where pilot wlr/kdr/damage shifts more and more toward identical values?

I think if we can do that, then we can definitively prove that the PSR and matchmaking are indeed... broken. (But good start and good methodology for initial research, good job!)
-------------

Edit: Also keep in mind that a few weeks ago, Russ did confirm through Twitter that PGI changed matchmaking mechanics without a public announcement. How much will that affect your conclusion from this point forward? I don't know. But it's worth finding out.


I agree. This is a very impressive data set with interesting analysis. However, because you selectively chose the data, you cannot apply any conclusions to broader matchmaking. You can only draw conclusions for the set of data collected.

What's missing most is what percentage of games are stomps. If only 5% of games are stomps we're probably paying too much attention to the outliers and the matchmaker is working well.

#35 Tarogato

    Member

  • Civil Servant
  • 6,557 posts
  • Location: USA

Posted 30 January 2017 - 04:38 PM

Zookeeper Dan, on 30 January 2017 - 04:20 PM, said:

I agree. This is a very impressive data set with interesting analysis. However, because you selectively chose the data, you cannot apply any conclusions to broader matchmaking. You can only draw conclusions for the set of data collected.

What's missing most is what percentage of games are stomps. If only 5% of games are stomps we're probably paying too much attention to the outliers and the matchmaker is working well.



Wellllllll...

... I know when I started collecting screenshots and when I stopped. And by pure coincidence, both were right around leaderboard resets.

So between October and January I collected 88 stomps from my own games. I also excluded a few for reasons such as disconnects, or "it was a 12-2, but it didn't feel very stompy". So let's just say 95 stomps.

On the QP leaderboard, since October I've played 1313 matches, the vast majority of which were solo queue. So let's just say... maybe 1250 were solo queue.

So the stomps were something like 7% of the matches I played. At most 8%, though probably nearer 7%. But hey, I'm just one data point. =3
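To show how much wiggle room "just one data point" leaves, here is a rough sketch that turns those counts into a rate with an approximate 95% Wilson score interval. The 1250 solo-queue total is the estimate from this post, not an exact count, and the function name is only for illustration.

```python
from math import sqrt

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half

# 88 = stomps actually collected, 95 = the rounded-up "let's just say" figure.
for stomps in (88, 95):
    low, high = wilson_interval(stomps, 1250)
    print(f"{stomps}/1250 = {stomps / 1250:.1%}, interval {low:.1%} to {high:.1%}")
```

The interval only covers sampling noise within one player's matches. It says nothing about how stomp rates differ between players, tiers, or time zones.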

Edited by Tarogato, 30 January 2017 - 04:40 PM.


#36 XX Sulla XX

    Member

  • Ace Of Spades
  • 3,094 posts

Posted 30 January 2017 - 04:44 PM

Looks like what we should have is a zero-sum tier system that also takes into account match score per weight class, as the match score in a medium versus an assault, etc., is different for most players.

#37 White Bear 84

    Member

  • Ace Of Spades
  • 3,857 posts

Posted 30 January 2017 - 04:50 PM

Look at it like this. Tier 1 vs Tier 1 is not a fair matchup.

But why, you say? Because one of those T1s is going to be a better player than the other. So if you just assume an even balance by matching up the same number of tier 1s or tier 2s against each other, you are going to have 12-0 roflstomps. I mean, let's say you pit your average tier 1 pug players against a team of tier 1 pro players. Who do you suppose is most likely to win?

That said, tactical decisions are also highly important. Teams that are less decisive and less responsive to the enemy team are probably going to lose badly more often: stuff like dragged-out nascars, balling up on the same spot and getting surrounded, scattering (I mean yeah, tier 1 and this s**t still happens..) etc. Again, comms will be a huge factor, and that more often than not comes down to how competitively a team is playing. It would be interesting to see how often a 12-0 comes from a match where your team has regular, good-quality comms AND makes good tactical choices.

#38 xTrident

    Member

  • Bad Company
  • 655 posts
  • Location: Work or Home

Posted 30 January 2017 - 04:52 PM

Tarogato, on 30 January 2017 - 04:38 PM, said:



Wellllllll...

... I know when I started collecting screenshots and when I stopped. And by pure coincidence, both were right around leaderboard resets.

So between October and January I collected 88 stomps from my own games. I also excluded a few for reasons such as disconnects, or "it was a 12-2, but it didn't feel very stompy". So let's just say 95 stomps.

On the QP leaderboard, since October I've played 1313 matches, the vast majority of which were solo queue. So let's just say... maybe 1250 were solo queue.

So the stomps were something like 7% of the matches I played. At most 8%, though probably nearer 7%. But hey, I'm just one data point. =3


In my opinion, that's far more stomps than there should be as well.

#39 Cy Mitchell

    Member

  • The Privateer
  • 2,688 posts

Posted 30 January 2017 - 05:08 PM

Tarogato, on 30 January 2017 - 02:20 PM, said:

I do want to break it up by weight class, and also break up IS vs Clan by weight class as well, and see if there's any patterns. But not today, I spent long enough typing this post up. =D


I assume that you were not able to obtain the actual PSR levels of the players in the matches, so perhaps not all were Tier 1 or Tier 2. Another factor that may influence the outcome of a game is a player's proficiency in a specific weight class. Did you look up each player's Leaderboard stats in each weight class, or did you use the global stats? In solo QP it is often the case that players are leveling Mechs they may be unfamiliar with and not very skilled at using. For example, if they normally favor Lights but just recently got the Marauder IIC and are trying to complete the basic skills, they will probably not perform up to their Global stats, which were achieved by driving the Mechs they normally use.

I love looking at your spreadsheets and appreciate all the work you put into them, but I wonder if you can truly draw any solid conclusions from them. I have watched many MSRB and MWOWC matches that ended in 8-0 or 12-0 stomps. Nearly everyone in those matches was Tier 1. Everyone was using the best Mechs available. All are very skilled players. Yet, stomps happen. People make mistakes, someone is in the right place at the wrong time, communication is lacking, someone gets a lucky shot, and suddenly the snowball starts rolling downhill and takes the whole unlucky team with it. This is the case in comp and it is much more often the case in solo QP.

I would very much like to see MM take into account each individual's performance in each individual weight class, instead of PSR, when distributing players to each team.

#40 Ted Wayz

    Member

  • Legendary Founder
  • 2,913 posts
  • Location: Tea with Romano

Posted 30 January 2017 - 05:25 PM

Things missing I would like to see:

Time to first kill

% team with disconnect

I understand that you are talking about stomps and I am wondering how much is attributed to teams playing from behind early. I have seen stomps result from a team going up 1-0 and waiting for the other team to press.

Discos, especially valuable discos, can also play a factor. Your stats will hide a team with a 100-ton disco, as they will most likely lose yet show an inflated tonnage. Need to know the tonnage active during the match.
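Something like this is what I mean by active tonnage. It is a tiny sketch that assumes the spreadsheet had a per-mech disconnect flag, which as far as I can tell it does not; the function name and the sample team are made up.

```python
def active_tonnage(team):
    """Sum tonnage only for mechs whose pilot stayed connected.

    team: iterable of (tonnage, disconnected) pairs.
    """
    return sum(tons for tons, disconnected in team if not disconnected)

# Hypothetical 3-mech excerpt: the 100-tonner disconnected.
team = [(100, True), (75, False), (35, False)]
print(active_tonnage(team))  # 110
```

On paper that excerpt is 210 tons, but only 110 tons actually fought.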




