
Mech Balance With Science

53 replies to this topic

#1 Duke Nedo

    Member

  • CS 2023 Top 12 Qualifier
  • 2,184 posts

Posted 13 April 2015 - 12:10 PM

Perhaps this could be a basis for discussing the upcoming PGI BV balance pass? (necro justification) :)

As I understood it, PGI are doing something along these lines, but more advanced and probably using game metrics as input, so they have a good chance of getting better-quality input. What I am curious about is whether they intend to attempt this without any subjective input at all.

As I remember it, some things would be really hard to capture with objective numbers that can be extracted in a simple way, like:

- Heat efficiency. This depends on lots of factors: XL engine vs geometry, hardpoint distribution vs viable builds, # truedubs etc. It is not straightforward to say whether a mech is heat efficient without deciding which build you are going for...

- Hardpoints. The number and placement of hardpoints is one thing... but how viable are the available asymmetric builds actually? Then there is the type of hardpoints vs the weight of the mech. How many of the hardpoints are placed high, and does that reach the critical mass for the chassis? For example, for a heavy 2x B can be enough, for a light that would be useless, while it would need a critical mass of 3-4E for a clan mech and 5-6E for an IS mech....

I am sure there are plenty plenty more of those.

Any speculations?




Original OP:
----------------------------------------
Was a little bored so I made a very simple model to describe mech tiers. There are still subjective parameters, that's unavoidable, but I tried to do this with as little bias as possible.
----------------------------------------
Edit//

I have finished tweaking the weights now to the point where I think it won't get much better; a model like this has its limitations and I think I have reached them. Before I post the last graphs below I'll fill in some of the missing information in the OP about the method used.

Input:
  • Hardpoint capacity (1-5): The total number of hardpoints and type. Score was given relative to class so that for example ballistic hardpoints for lights scored low, AC20 capability gave a small bonus.
  • Hardpoint locality (1-5): The distribution and position of the main weaponry hardpoints. Here high mounts and possibility for asymmetric builds would score high, main weaponry on arms would score low.
  • Armor (1-5): Spread more or less evenly between score 5 and 1 within each weight class.
  • Hitboxes (1-5): Here I tried to make this value reflect durability in terms of hitbox design as well as mech size. For example Spider, Stalker and Atlas would all score high, but for different reasons. Spider because of being generally thin and ghosty, Stalker for having a small CT and STD engine and Atlas for having very well proportioned hitboxes and big arms allowing it to roll and shield damage really well.
  • ECM mod (flag): Self explanatory
  • JJ mod (flag): Self explanatory
  • Clan XL (flag): Self explanatory
  • Engine mod (-0.5 to -2): Penalty for being locked/restricted to an engine that is too small. Applied for example to Adder, Kitfox and Direwolf.
  • Off quirks (0 to 1.5): Subjective estimate of how many tiers the offense quirks add to the offensive meta score. The guideline here was that heat efficiency quirks score high (a 10% heat bonus gives 0.5, etc.). Cooldown of main weaponry scores higher than range/velocity in most cases (not for ERPPC, for example).
  • Def quirks (0 to 1): Subjective estimate of how many tiers the defensive quirks add to the defensive meta score. This includes armor, structure, agility and speed. Torso is generally rated higher than arms/legs, except for a light with leg problems.
Clustering:

The input was combined into two meta scores, one for offensive capability and one for defensive capability. W = weighting.

Off. meta score = ((1 - W_Hardpoints) x Hardpoint capacity + W_Hardpoints x Hardpoint locality) + Off. quirks
Def. meta score = ((1 - W_Hitboxes) x Armor + W_Hitboxes x Hitboxes) + Engine mod + W_JJ x JJ + W_ECM x ECM + W_Clantech x Clan XL + Def. quirks
Tier = 6 - (W_Offense x Off. meta score + (1 - W_Offense) x Def. meta score)

Weightings:

Lights: JJ=0.75, Hardpoints=0.50, ECM=1, Clantech=2.0, Hitboxes=0.75, Offense=0.5
Mediums: JJ=0.50, Hardpoints=0.75, ECM=1, Clantech=1.5, Hitboxes=0.75, Offense=0.6
Heavy: JJ=0.25, Hardpoints=0.75, ECM=1, Clantech=1.5, Hitboxes=0.75, Offense=0.64
Assault: JJ=0.10, Hardpoints=0.75, ECM=1, Clantech=1.5, Hitboxes=0.75, Offense=0.66

There's a bit of trial and error in arriving at these weightings, but I can sort of see some logic in what I ended up with. For example, JJ are more important for lights since they actually jump instead of hovering. Hardpoint locality is less important for lights than for larger mechs since they are so fast they have less need for ridge humping. Clantech is more important for lights because lighter weapons allow more firepower, while for bigger mechs it becomes less important because of heat. That offense gets relatively more important than defense the bigger you are could also make sense because of the lower agility and speed: if you are slow and get hit, it's perhaps more likely that your tank won't save you. These could be discussed til the end of the world, but I think these do the job quite alright.
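
For anyone who wants to poke at the numbers, here is a minimal Python sketch of the three formulas with the per-class weightings listed above. It rests on two assumptions that are not spelled out in the post: each named weighting is applied to the input it is named after (so Hardpoints weights locality and Hitboxes weights hitboxes, with the remainder going to capacity and armor), and the Clantech weighting is added to the defensive score when the Clan XL flag is set. The `Mech` class, its field names and the `tier` function are made up for illustration. The example input is the ACH-Prime line from the Wave III predictions in this post.

```python
from dataclasses import dataclass

# Per-class weightings copied from the "Weightings" list in the post.
WEIGHTS = {
    "light":   dict(jj=0.75, hardpoints=0.50, ecm=1.0, clantech=2.0, hitboxes=0.75, offense=0.50),
    "medium":  dict(jj=0.50, hardpoints=0.75, ecm=1.0, clantech=1.5, hitboxes=0.75, offense=0.60),
    "heavy":   dict(jj=0.25, hardpoints=0.75, ecm=1.0, clantech=1.5, hitboxes=0.75, offense=0.64),
    "assault": dict(jj=0.10, hardpoints=0.75, ecm=1.0, clantech=1.5, hitboxes=0.75, offense=0.66),
}


@dataclass
class Mech:
    """Hypothetical container for the inputs described in the post."""
    weight_class: str
    capacity: float          # hardpoint capacity, 1-5
    locality: float          # hardpoint locality, 1-5
    armor: float             # 1-5
    hitboxes: float          # 1-5
    ecm: bool = False
    jj: bool = False
    clan_xl: bool = False
    engine_mod: float = 0.0  # 0, or -0.5 to -2 for restricted engines
    off_quirks: float = 0.0  # 0 to 1.5
    def_quirks: float = 0.0  # 0 to 1


def tier(m: Mech) -> float:
    w = WEIGHTS[m.weight_class]
    # ASSUMPTION: the Hardpoints weighting applies to locality.
    offense = ((1 - w["hardpoints"]) * m.capacity
               + w["hardpoints"] * m.locality
               + m.off_quirks)
    # ASSUMPTION: the Hitboxes weighting applies to hitboxes, and the
    # Clantech weighting is a defensive bonus for the Clan XL flag.
    defense = ((1 - w["hitboxes"]) * m.armor
               + w["hitboxes"] * m.hitboxes
               + m.engine_mod
               + w["jj"] * m.jj
               + w["ecm"] * m.ecm
               + w["clantech"] * m.clan_xl
               + m.def_quirks)
    return 6 - (w["offense"] * offense + (1 - w["offense"]) * defense)


ach = Mech("light", capacity=5, locality=4, armor=4, hitboxes=3,
           ecm=True, jj=True, clan_xl=True)
print(f"ACH-Prime: Tier {tier(ach):.2f}")
```

With these assumptions the sketch gives 0.25 for the ACH-Prime, which lines up with the posted Tier 0.3, and the other three Wave III inputs come out at roughly 1.45, 1.18 and 2.40 against the posted 1.5, 1.2 and 2.4.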

The graphs:

Posted Image

Conclusion: I put this together just for fun, and I must say that it has given a result that I think is more consistent than I had expected. I will now try its predictive power to guesstimate the Tiers of the upcoming Wave III mechs, and perhaps also to predict the size of quirks needed to move one of the worst mechs to a solid Tier 2.0.

Clan Wave III prediction:

ACH-Prime
Input: Hardpoints capacity=5, hardpoints locality=4, armor=4, hitboxes=3, ecm, jj, clantech
Output: Tier 0.3
Comments: Hardpoints are excellent, locality looks very good too with 3E in the torsi, went modestly with a 3 on hitboxes and still end up with a better score than any other Light in the game.


SHC-Prime
Input: Hardpoints capacity=3, hardpoints locality=4, armor=2, hitboxes=3, ecm, jj, clantech
Output: Tier 1.5
Comment: Again, excellent hardpoint locality; it could be a 5 from the looks of the concept art. Went conservative with a 4 for locality and a 3 for capacity. If the hardpoints are as good as in the concept art or if hitboxes are above average, though, this quickly turns below Tier 1.0, so there is lots of potential.

EBJ-Prime
Input: Hardpoints capacity=5, hardpoints locality=5, armor=3, hitboxes=3, clantech
Output: Tier 1.2
Comment: Again, awesome hardpoints, here I went all in with 5/5 for hardpoints, medium armor and hitboxes and end up with a Tier just behind Hellbringers. If hitboxes are above average we may have another winner.

EXE-Prime
Input: Hardpoints capacity=4, hardpoints locality=2, armor=5, hitboxes=4, clantech
Output: Tier 2.4
Comment: Hardpoints come in plenty, but with limited podspace, so a 4. Locality is down to 2: the majority of the E hardpoints are in the arms and they look really low slung, like the Gargoyle's. Gave it 5 armor and good hitboxes, assuming they will be similar to the Gargoyle's. All in all an average mech, right among the Gargoyles. The 3 nipple-E's may help quite a bit, but I still think that PGI could safely ship this guy with some decent quirks.

All in all, looking forward to Wave III. If you didn't buy it already, order today! :)

DRG-5N rescue: The model concludes it can't be done. It can be put on the 1N's level by giving it a similar level of AC/2 quirks, but that's about it. Even then, it could safely receive more armor/agility and would still not break the game in any way.
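
The "can't be done" verdict can be sanity-checked by turning the tier formula around: since quirks are simply added to the two meta scores, the tier drops by W x Off. quirks + (1 - W) x Def. quirks, where W is the offense weighting. Below is a rough sketch, assuming the quirk ranges from the Input list (offense 0 to 1.5, defense 0 to 1) act as hard caps. The function names and the unquirked tiers 3.0 and 4.0 are hypothetical, since the actual DRG-5N inputs were not posted; 0.64 is the heavy offense weighting.

```python
MAX_OFF_QUIRKS = 1.5  # upper end of "Off quirks (0 to 1.5)"
MAX_DEF_QUIRKS = 1.0  # upper end of "Def quirks (0 to 1)"


def max_quirk_gain(offense_weight: float) -> float:
    """Largest tier improvement that quirks alone can buy in this model."""
    return offense_weight * MAX_OFF_QUIRKS + (1 - offense_weight) * MAX_DEF_QUIRKS


def can_reach(unquirked_tier: float, target_tier: float, offense_weight: float) -> bool:
    """True if maxed-out quirks are enough to close the gap to the target."""
    return unquirked_tier - target_tier <= max_quirk_gain(offense_weight)


def offense_quirks_needed(unquirked_tier: float, target_tier: float,
                          offense_weight: float, def_quirks: float = 0.0) -> float:
    """Offense quirk score required, given a chosen defensive quirk score."""
    gap = unquirked_tier - target_tier - (1 - offense_weight) * def_quirks
    return max(0.0, gap / offense_weight)


HEAVY_W = 0.64  # heavy offense weighting from the post
for start in (3.0, 4.0):  # hypothetical unquirked tiers
    print(start, "->", "reachable" if can_reach(start, 2.0, HEAVY_W) else "out of reach")
```

Under these assumptions a heavy can gain at most about 1.32 tiers from quirks, so any heavy that starts much worse than Tier 3.3 cannot be quirked to 2.0 within the stated ranges.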


//Below is the original OP:
----------------------------------------
The model is very straightforward. I've given each variant a score for: Hardpoint capacity (including podspace for clams), Hardpoint locality (height and asymmetry), Armor, Hitboxes, Offense quirks and Defense quirks. I then also added flags for JJ, ECM and clan xl engine. These scores were combined into an offense meta score and a durability meta score, which were then combined into a Tier value. I chose some weights: I weighted hardpoint locality 3 times more than hardpoint capacity (except for lights), I weighted hitboxes 3 times more than armor, and I weighted JJ as 0.25 for assault, 0.5 for heavy, 0.75 for mediums and 1.0 for lights. Mechs with engine restrictions got a handicap. I then mashed it all together. The nice part is that now that I have set it up I can play around with any weightings and scores to improve accuracy.

The results slide:
Posted Image

There are definitely lots of weaknesses in this model; it's simple and in some cases rather subjective. I am not sure how to share the input with you but I'll be happy to try. Nevertheless, there are some interesting observations.

I think it visualizes quite nicely that the quirks have really helped some chassis, and the quirked tier-list is much more consistent compared to the unquirked one, and it also shows that there are quite a few variants that are still far behind.

There are a few surprises, or outliers, or shortcomings in the model if you will. I am thinking that speed is under-estimated for lights, and cooling capacity doesn't really go in there at all. Quite possibly clan-tech needs to be weighted more heavily because I think the SCR's are a bit behind, but I think a big contribution to that is the extremely wonky hitboxes, and they are hard to describe. The DWF is a bit behind, I am not sure if that is wrong or not... it's not my favorite chassis so I'm fine with it but I guess many will disagree...

Anyways, I think a model like this is a good starting point for balance discussions... so, discuss?

Edited by Duke Nedo, 14 August 2015 - 04:34 AM.


#2 EgoSlayer

    Member

  • Wrath
  • 1,909 posts
  • Location[REDACTED]

Posted 13 April 2015 - 12:20 PM

While it's an interesting view, I think that most of the numbers you are basing it on are subjective (Hardpoint capacity (including podspace for clams), Hardpoint locality (height and asymmetry), Armor, Hitboxes, Offense quirks and Defense quirk).

Nothing wrong with that, but I am sure there would be a lot of discussion about what scales and values you used to arrive at your numbers for the categories that feed it.

And is the scale supposed to represent something like the Tier list so that everything is normalized into the range of 1 to 6?

Edited by EgoSlayer, 13 April 2015 - 12:24 PM.


#3 Y E O N N E

    Member

  • The Nimble
  • 16,810 posts

Posted 13 April 2015 - 12:21 PM

Can we see:

A.) Your decision tree; the hierarchy of attributes, with the weight of each one listed under its parent class

B.) Your value scaling; did you score the values on a consistent scale (i.e. zero-to-one) and are they valued according to a linear, exponential, or other function?

#4 FupDup

    Member

  • Ace Of Spades
  • 26,888 posts
  • LocationThe Keeper of Memes

Posted 13 April 2015 - 12:22 PM

Locust 1E above Firestarter A, Raven 2X, and Huginn? Wat?

#5 Mawai

    Member

  • Legendary Founder
  • 3,495 posts

Posted 13 April 2015 - 12:25 PM

The only comment I would have ... I hope PGI has used some sort of similar formula for determining the quirks in the first place :)

Also, it is difficult to assess whether personal bias plays a role. Unless you have extensively played every mech, criteria such as hitboxes have to be based on hearsay from players who have played them ... which then becomes a question of whose opinion you listen to ... and how you choose to weight each factor.


P.S.

Looking at some of the rankings ... I wonder how objective it is ... for example the Stalker-4N is the highest ranked Assault mech. It is very effective in CW and has quite nice large laser quirks ... but most decent Dire Wolves would beat one in a 1:1 fight (my opinion ... others may feel differently). The same goes for several other assaults ... unless they decide to run at the Stalker across an open field armed with only short range weapons.

Similarly, although the HBK-4G has been amazingly improved by quirks ... I think it is still outclassed by the Stormcrow in almost every way ... and I pilot both.

Finally, no matter how well quirked the LCT-1E might be ... and I have seen folks perform miracles in a locust ... it is still a 20 ton mech that can be one-shotted or legged by almost everything. In terms of tier ranking I don't think it can really be ranked above any firestarter, jenner or raven in terms of effectiveness as a light mech.

So ... I think one aspect that could be added to your evaluation is tonnage. I don't think you mentioned a factor based on actual tonnage ... which might go some way to eliminating some of these oddities. (Add a factor for every 5 tons above the minimum weight of the weight class).
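
A minimal sketch of what such a tonnage factor could look like, in Python. The class minimums are the usual MWO weight brackets (20/40/60/80 tons); the function name and the 0.1-per-step size are placeholders made up for illustration, and how the bonus would feed into the tier formula is left open.

```python
# Minimum tonnage of each weight class (MWO brackets).
CLASS_MIN_TONS = {"light": 20, "medium": 40, "heavy": 60, "assault": 80}


def tonnage_bonus(weight_class: str, tons: int, per_step: float = 0.1) -> float:
    """Bonus for every full 5 tons above the class minimum.

    per_step is a hypothetical value; it would need the same kind of
    trial-and-error tuning as the other weightings.
    """
    steps = (tons - CLASS_MIN_TONS[weight_class]) // 5
    return per_step * steps


print(tonnage_bonus("light", 20))  # a 20-ton Locust sits at the class minimum
print(tonnage_bonus("light", 35))  # a 35-ton light is three steps above it
```

A 20-ton light gets no bonus while a 35-ton light gets three steps' worth, which is the direction needed to push the Locust back below the 35-ton lights.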

Edited by Mawai, 13 April 2015 - 12:37 PM.


#6 dubplate

    Member

  • 153 posts
  • LocationBC, Canada

Posted 13 April 2015 - 12:32 PM

Duke Nedo, on 13 April 2015 - 12:10 PM, said:

Anyways, I think a model like this is a good starting point for balance discussions... so, discuss?


I think there are too many variables (and personal opinion) involved to boil it down to a single number as a representation of a mech's effectiveness. Maybe once we get the full procedure used we can say something about it but otherwise it's basically your opinion in a chart.

#7 Khobai

    Member

  • Elite Founder
  • 23,969 posts

Posted 13 April 2015 - 12:34 PM

Why is the timberwolf the worst heavy?

#8 FupDup

    Member

  • Ace Of Spades
  • 26,888 posts
  • LocationThe Keeper of Memes

Posted 13 April 2015 - 12:35 PM

Khobai, on 13 April 2015 - 12:34 PM, said:

Why is the timberwolf the worst heavy?

His graph is counter-intuitive in the way it represents effectiveness. In this graph, the lower bar is actually better because it's closer to Tier 1. The higher bars are worse tiers like T4 or T5. It's kinda wonky.

#9 Khobai

    Member

  • Elite Founder
  • 23,969 posts

Posted 13 April 2015 - 12:36 PM

Quote

His graph is counter-intuitive in the way it represents effectiveness. In this graph, the lower bar is actually better because it's closer to Tier 1. The higher bars are worse tiers like T4 or T5. It's kinda wonky.


then why is the dragon the worst heavy? thats not right either.

#10 Duke Nedo

    Member

  • CS 2023 Top 12 Qualifier
  • 2,184 posts

Posted 13 April 2015 - 12:36 PM

EgoSlayer, on 13 April 2015 - 12:20 PM, said:

While it's an interesting view, I think that most of the numbers you are basing it on are subjective (Hardpoint capacity (including podspace for clams), Hardpoint locality (height and asymmetry), Armor, Hitboxes, Offense quirks and Defense quirk).

Nothing wrong with that, but I am sure there would be a lot of discussion about what scales and values you used to arrive at your numbers for the categories that feed it.

And is the scale supposed to represent something like the Tier list so that everything is normalized into the range of 1 to 6?


Yeah, it has to be rather subjective I guess, but I tried to be somewhat systematic about it. Scores were given between 1-5. Quirks were estimated as Tier change between ~0.25 and 1.5, that's probably the most subjective input.

Yeonne Greene, on 13 April 2015 - 12:21 PM, said:

Can we see:

A.) Your decision tree; the hierarchy of attributes, with the weight of each one listed under its parent class

B.) Your value scaling; did you score the values on a consistent scale (i.e. zero-to-one) and are they valued according to a linear, exponential, or other function?


Let's see,

Offense meta score was: AVG(capacity; weight x locality)+Off.quirks
Defensive meta score was: AVG(armor; weight x hitboxes)+Def quirks+weight x JJ + weight x ecm + weight x clanXL
Tier was: k - AVG(weight x Off.meta score; Def.meta score)

Weights were chosen slightly differently for different classes, especially JJ's.

Did this for fun, but it was actually more accurate with regard to my completely subjective impression than I had expected. Self-fulfilling prophecy perhaps, but anyways, I expected it to fail moar. :)

FupDup, on 13 April 2015 - 12:22 PM, said:

Locust 1E above Firestarter A, Raven 2X, and Huginn? Wat?


Well, the 1E is not bad... :) but yeah I agree, it turned up very high up in the list. It has very good hardpoints and quirks though, so it could be up there as long as you don't get hit... perhaps durability should be scaled up a bit for lights, could be that it's more important for lights since you take a lot more stray fire than big alphas as a light.

#11 FupDup

    Member

  • Ace Of Spades
  • 26,888 posts
  • LocationThe Keeper of Memes

Posted 13 April 2015 - 12:37 PM

Khobai, on 13 April 2015 - 12:36 PM, said:

then why is the dragon the worst heavy? thats not right either.

He ranked the 5N as the worst. He ranked the 1N at about T3, which I guess still might be slightly too conservative. I could see it as a T2 mech possibly, but hitboxes, tonnage, and hardpoints keep it from being T1.

#12 Cion

    Member

  • The Spear
  • 750 posts

Posted 13 April 2015 - 12:38 PM

I agree there are flaws (what is that dragon doing all the way back there?) but I like where this could go.

My suggestion to OP would be to get feedback and improve the algorithm to be more accurate. It will never be 100% accurate, but you got a good result (in general) from this first pass.

#13 Duke Nedo

    Member

  • CS 2023 Top 12 Qualifier
  • 2,184 posts

Posted 13 April 2015 - 12:39 PM

FupDup, on 13 April 2015 - 12:35 PM, said:

His graph is counter-intuitive in the way it represents effectiveness. In this graph, the lower bar is actually better because it's closer to Tier 1. The higher bars are worse tiers like T4 or T5. It's kinda wonky.


I tried to model PGI's tiers you know... so inverting the tiers would probably be worse. ^^

#14 Gas Guzzler

    Member

  • Big Daddy
  • 14,274 posts
  • LocationCalifornia Central Coast

Posted 13 April 2015 - 12:40 PM

Think like golf guys, lower number is better.

#15 Duke Nedo

    Member

  • CS 2023 Top 12 Qualifier
  • 2,184 posts

Posted 13 April 2015 - 12:41 PM

FupDup, on 13 April 2015 - 12:37 PM, said:

He ranked the 5N as the worst. He ranked the 1N at about T3, which I guess still might be slightly too conservative. I could see it as a T2 mech possibly, but hitboxes, tonnage, and hardpoints keep it from being T1.


Aye, the 5N I can see back there, but the Flame and 1C are perhaps a bit far back. The quickdraw got a boost from having JJ, otherwise I don't know which heavy would be worse...

#16 RAM

    Member

  • The Resolute
  • 2,019 posts
  • Google+: Link
  • LocationVancouver, BC

Posted 13 April 2015 - 12:42 PM

Very interesting. Could you do a single consolidated graph of all mechs? Thanks!


RAM
ELH

#17 Jman5

    Member

  • Littlest Helper
  • 4,914 posts

Posted 13 April 2015 - 12:57 PM

I think you should consider removing the "unquirked" bars from your graph. It's already rather dense with so many mechs, and the 2nd light blue bar is just making it harder for us to interpret without adding a whole lot IMO.

I would also suggest separating the images and making them bigger. There are so many mech values it can be hard to read.

Another suggestion is to label your Y axis because as others pointed out it's a little counter intuitive that lower = better. Also normally people sort their X axis from worst to best instead of best to worst. Not a big deal, but it does reinforce the confusion.

#18 Y E O N N E

    Member

  • The Nimble
  • 16,810 posts

Posted 13 April 2015 - 01:08 PM

FupDup, on 13 April 2015 - 12:22 PM, said:

Locust 1E above Firestarter A, Raven 2X, and Huginn? Wat?


Raven 2X is toast at close range if using the ERLL/LL build that put it on the map in the first place. Huginn requires you to be an excellent shot with SRMs and is easy to hold outside its max range. FS9 has to expose itself entirely to fire.

The Locust 1E should have an incredible offensive score, especially with its quirks, speed, and hard-point locations. Armor durability is the only real place it gets [massively] let down. I think this perhaps inflates its utility a bit over the FS9, though I can think of a few scenarios where, with two equal pilots and the accepted "meta" builds on both, the Locust should win.

Duke Nedo, on 13 April 2015 - 12:36 PM, said:


Let's see,

Offense meta score was: AVG(capacity; weight x locality)+Off.quirks
Defensive meta score was: AVG(armor; weight x hitboxes)+Def quirks+weight x JJ + weight x ecm + weight x clanXL
Tier was: k - AVG(weight x Off.meta score; Def.meta score)

Weights were chosen slightly differently for different classes, especially JJ's.

Did this for fun, but it was actually more accurate with regard to my completely subjective impression than I had expected. Self-fulfilling prophecy perhaps, but anyways, I expected it to fail moar. :)


Are "Off.quirks" and "Def quirks" variables representing the number of such quirks? I don't think the quirks should have been straight-added like that. I'm also guessing all quirks are offensive unless they are armor or structure quirks, yes? Is JJ the number of JJ?

See, weights are supposed to represent the impact an attribute has. You didn't assign weights to each of those two sub-scores, and that makes it, unfortunately, pretty meaningless.

You should have, at the top level, a single attribute called something like "Mech Utility," and its maximum value should be 1. Under that, you should have your two attributes for Offense and Defense, each with its own weight and summing to 1. Then under each of those, you should have a break-down of the things that comprise your two main attributes, and they should each have their own weights which also sum to 1 under their respective categories.

So, what you want is a utility score. The best 'Mech should have a utility score of 1, and the worst 0. You can categorize the 'Mechs according to their score like this:
  • Tier V: 0-0.19
  • Tier IV: 0.2-0.39
  • Tier III: 0.4-0.59
  • Tier II: 0.6-0.79
  • Tier I: 0.8-0.99
  • MECH OP PGI PLZ NERF: 1
Of course, you also have to parameterize the values of each 'Mech's measured component on a 0-1 scale as well. i.e., no heat gen quirk gets a score of 0 as being the worst option while a heat gen quirk of 25% gets a score of 1 as being the best, with everything in between being some fractional value between 0 and 1 (assuming linear valuation).
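
To make that concrete, here is a small Python sketch of the structure described above: every measured value is mapped linearly onto 0-1, the weights at each level of the tree sum to 1, and the resulting utility is binned into tiers. The attribute names, the weights and the example scores are all made up for illustration; only the 0-25% heat gen quirk range and the tier bins come from this post.

```python
import math


def normalize(value, worst, best):
    """Linear 0-1 valuation: worst -> 0, best -> 1, clamped to that range."""
    x = (value - worst) / (best - worst)
    return min(1.0, max(0.0, x))


# Hypothetical hierarchy: branch -> (branch weight, {leaf: leaf weight}).
TREE = {
    "offense": (0.6, {"capacity": 0.3, "locality": 0.5, "heat_gen_quirk": 0.2}),
    "defense": (0.4, {"armor": 0.25, "hitboxes": 0.75}),
}

# Weights must sum to 1 at every level of the tree.
assert math.isclose(sum(bw for bw, _ in TREE.values()), 1.0)
for _, leaves in TREE.values():
    assert math.isclose(sum(leaves.values()), 1.0)


def utility(scores):
    """Weighted sum down the tree; scores must already be on a 0-1 scale."""
    return sum(
        branch_w * sum(w * scores[name] for name, w in leaves.items())
        for branch_w, leaves in TREE.values()
    )


def tier_label(u):
    """Bin a utility score using the tier ranges listed in this post."""
    if u >= 1.0:
        return "MECH OP PGI PLZ NERF"
    for floor, label in [(0.8, "Tier I"), (0.6, "Tier II"),
                         (0.4, "Tier III"), (0.2, "Tier IV")]:
        if u >= floor:
            return label
    return "Tier V"


# Made-up mech: 1-5 scores rescaled to 0-1, plus a 10% heat gen quirk.
scores = {
    "capacity": normalize(4, 1, 5),
    "locality": normalize(3, 1, 5),
    "heat_gen_quirk": normalize(10, 0, 25),
    "armor": normalize(3, 1, 5),
    "hitboxes": normalize(4, 1, 5),
}
u = utility(scores)
print(f"utility {u:.2f} -> {tier_label(u)}")
```

Because every weight and every score lives on the same 0-1 scale, changing one weight only redistributes importance inside its own branch, which makes the trial-and-error tuning easier to reason about.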

Now don't get me wrong, I think it's neat that you've done this, even for fun. However, there is a way to do it that would be far more useful to everybody, if you have the time and interest to do so.

#19 Kiiyor

    Member

  • Big Daddy
  • 5,565 posts
  • LocationSCIENCE.

Posted 13 April 2015 - 02:02 PM

Nice SCIENCE, but as it's anecdotal, I have a couple of anecdotal opinions:

DireWolf #1. Everything else #2 or lower.

There is no mech better than a StormCrow.

#20 Brizna

    Member

  • Liquid Metal
  • 1,363 posts
  • LocationCatalonia

Posted 13 April 2015 - 03:01 PM

I think that in your effort to make it objective and scientific you left out something that matters a lot: how well the different qualities of a mech (hitboxes, quirks, hardpoints...) mix together to make a whole mech. E.g. the Warhawk has in theory very nice qualities, but they don't match each other well, making most builds weak.

In the same way Mist Lynx has no single quality that impresses and a few that literally suck but with the quirks it makes a surprisingly decent mech.

Of course, if you added that you'd be back where you started, with completely subjective charts, and that's exactly where most of the interest of these charts lies: in comparing a mech's true performance on the battlefield (subjective as it is) with its theoretical performance based solely on stats. Very interesting read, and I thank you very much for sharing your work. PGI would do well to look at them very closely to see where some mechs failed.




