Elo Ratings And Granularity


52 replies to this topic

#41 Corvus Antaka

    Member

  • Knight Errant
  • 8,310 posts
  • Location: Inner Sphere

Posted 12 December 2013 - 07:31 AM

Joseph Mallan, on 12 December 2013 - 05:59 AM, said:

The thing that makes JJ the most OP is that, in universe, weapons fire only after all the G-forces are done and you have landed after the jump.


This, and the fact that you can shoot and retreat into cover so rapidly - much faster than using forward/reverse to step around buildings and fire.

JJ design in this game is terrible. Until jumpjets themselves are "fixed" and the mechanic is re-worked into something like the BTech 3025-type jets (which is how BTech JJs are actually supposed to work), I doubt we will see any improvement. Change weapons balance all you want, but the current JJ design is the root issue causing most of the problems people have with pinpoint damage, etc.

#42 BillyM

    Member

  • 530 posts

Posted 12 December 2013 - 07:43 AM

FupDup, on 11 December 2013 - 03:40 PM, said:

I will never forget the soul-crushingly painful experience of grinding Locusts at my Raven's Elo level.


Posted Image

On the topic of jumpjets, I think the recharge rate of JJs needs to be reduced considerably. Double the JJ recharge time for lights and mediums, triple it for heavies, and quadruple it for assaults. JJs should not be used for bunny-hopping or pop-tarting; they should be used for mobility.

--billyM

Edited by BillyM, 12 December 2013 - 07:48 AM.


#43 Roadkill

    Member

  • 3,610 posts

Posted 12 December 2013 - 09:50 AM

The only thing I would change, Bill, is that I'd just go ahead and have the system maintain extra Elo ratings. As I recall, Elo ratings don't average linearly, so what you've set up could result in wonky pairings at least until the 50-match threshold was met for a given variant.

Computing Elo ratings is pretty trivial, CPU-wise, so the system might as well just go ahead and maintain multiple ratings. (Beyond the 4 already being maintained, that is.)

Thus:

Each variant has its own Elo rating at all times. If the variant does not have 50 (just to use your example) matches, the matchmaker uses the chassis Elo rating for pairing purposes instead.

Each chassis has an Elo rating at all times. This rating is computed/maintained using all matches played while piloting any variant of this chassis. If the chassis does not have 50 matches, the matchmaker uses the weight class Elo rating for pairing purposes instead.

Each weight class has an Elo rating at all times. This rating is computed/maintained using all matches played while piloting any Mech in the weight class. If the weight class does not have 50 matches, the matchmaker uses the player Elo rating for pairing purposes instead.

Each player has an Elo rating at all times. This rating is computed/maintained using all matches played by that player, regardless of variant, chassis, or weight class. This is the default rating used by the matchmaker for pairing purposes when no more specific rating is appropriate. Eventually this rating would no longer be used by the matchmaker once the player had at least 50 matches in each weight class, but it would be useful until that point.

Note that since the various ratings could differ significantly, they could (probably should) adjust differently after each match.
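A minimal Python sketch of the fallback scheme described above, not anything the game is confirmed to do. The names (`Rating`, `PilotRatings`, `pairing_elo`, `record`), the K-factor of 32, and the 1200 starting value are assumptions for illustration; only the 50-match threshold comes from the post. Every level is updated with the standard Elo formula after each match, and since each level holds a different current rating, each one moves by a different amount.

```python
from dataclasses import dataclass, field

MIN_MATCHES = 50      # threshold used in the post
K_FACTOR = 32.0       # assumption: a commonly used Elo K-factor
DEFAULT_ELO = 1200.0  # assumption: starting value for any new rating


@dataclass
class Rating:
    elo: float = DEFAULT_ELO
    matches: int = 0

    def update(self, opponent_elo: float, score: float) -> None:
        """Standard Elo update; score is 1.0 for a win, 0.0 for a loss."""
        expected = 1.0 / (1.0 + 10.0 ** ((opponent_elo - self.elo) / 400.0))
        self.elo += K_FACTOR * (score - expected)
        self.matches += 1


@dataclass
class PilotRatings:
    player: Rating = field(default_factory=Rating)
    weight_class: dict = field(default_factory=dict)
    chassis: dict = field(default_factory=dict)
    variant: dict = field(default_factory=dict)

    def _specific(self, weight_class, chassis, variant):
        """Ratings from most to least specific, created on first use."""
        return [
            self.variant.setdefault(variant, Rating()),
            self.chassis.setdefault(chassis, Rating()),
            self.weight_class.setdefault(weight_class, Rating()),
        ]

    def pairing_elo(self, weight_class, chassis, variant):
        """Use the most specific rating that has enough matches."""
        for rating in self._specific(weight_class, chassis, variant):
            if rating.matches >= MIN_MATCHES:
                return rating.elo
        return self.player.elo  # default when nothing more specific qualifies

    def record(self, weight_class, chassis, variant, opponent_elo, score):
        """Update all four ratings; each adjusts from its own current value."""
        levels = self._specific(weight_class, chassis, variant) + [self.player]
        for rating in levels:
            rating.update(opponent_elo, score)


pilot = PilotRatings()
for _ in range(50):  # 50 wins in the 8Q
    pilot.record("assault", "Awesome", "AWS-8Q", 1200.0, 1.0)
for _ in range(10):  # 10 losses in the 9M
    pilot.record("assault", "Awesome", "AWS-9M", 1200.0, 0.0)

print(pilot.pairing_elo("assault", "Awesome", "AWS-8Q"))  # variant rating
print(pilot.pairing_elo("assault", "Awesome", "AWS-9M"))  # chassis rating
```

With these inputs the 8Q is paired on its own rating, while the 9M (only 10 matches) falls back to the Awesome chassis rating, which already has 60 matches behind it.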

#44 FupDup

    Member

  • Ace Of Spades
  • 26,888 posts
  • Location: The Keeper of Memes

Posted 12 December 2013 - 10:40 AM

BillyM, on 12 December 2013 - 07:43 AM, said:


Posted Image

No, not good. More like this:

Posted Image

Edited by FupDup, 12 December 2013 - 10:41 AM.


#45 D04S02B04

    Member

  • 158 posts

Posted 12 December 2013 - 08:21 PM

EDIT: Alternatively, if you are solo PUGGING, then you're the token "Pro Player" to help pull up the Elo of the "Average" team... which happens very often. Either carry harder or drop with a premade (unfortunately...)

I'm going to explain how Elo MM works based on my observations dropping with various people of high and low Elo.

If we assume that:

1. Player Elo goes from 0 - 10
2. New players/weight class start at 3
3. Team has a total of 3 units/aggregated score of lance (for ease of calculation and illustration)

Why you get matched up against Lords / SJR is because your Elo isn't high enough.

If we assume Lords / SJR to have Elo of 10 (because they fail to find matches and we know they have a big gap between their Elo and the rest of the players)... This is what the match maker does.

If only players with Elo ranging from 3 to 6 are around when they form a premade and drop:

Drafting Lords/SJR/Pro Player Team
Lords/SJR/Pro Lance - 10
Noob Lance 1 - 3
Noob Lance 2 - 3
Total: 16

Drafting Average Players Team
Avg Lance 1 - 5
Avg Lance 2 - 6
Avg Lance 3 - 5
Total: 16

Now, that assumes we manage to get equal Elo.

However, we know that as match finding takes longer, and at hours when not enough pro players are on or searching for matches at the same time, you start to get weird match-ups.

If Elo rating of 5 was the maximum.
Avg Players Team
Avg Lance 1 - 5
Avg Lance 2 - 5
Avg Lance 3 - 5
Total: 15

If Elo rating of 4 was the maximum
Avg Players Team
Avg Lance 1 - 4
Avg Lance 2 - 4
Avg Lance 3 - 4
Total: 12

You'll see that you're stuck in "mid range Elo Hell" actually.

The match maker is unable to get enough "highly skilled" opponents for the pro premades and throws what they can in there. Since you are slightly above the average, congratulations, you get thrown in.
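The arithmetic above can be sketched in Python. This only models the explanation in this post (team totals on the 0 - 10 scale, three lance scores per team); it is not the actual matchmaker, and `closest_opposition` and the pools are made up for illustration.

```python
from itertools import combinations


def closest_opposition(target_total, available_lances, team_size=3):
    """Pick team_size lance scores whose sum is closest to target_total."""
    return min(
        combinations(available_lances, team_size),
        key=lambda lances: abs(sum(lances) - target_total),
    )


pro_team = [10, 3, 3]   # pro lance plus two new-player lances
target = sum(pro_team)  # 16

# Pools of lance scores searching at the same time: a full pool,
# then pools where the best available lance is only a 5 or a 4.
for pool in ([5, 6, 5, 4, 3], [5, 5, 5, 4, 4], [4, 4, 4, 3, 3]):
    team = closest_opposition(target, pool)
    print(pool, "->", team, "total", sum(team), "gap", target - sum(team))
```

As the best available lance drops from 6 to 5 to 4, the closest team the sketch can build totals 16, then 15, then 12, so the gap to the pro team's 16 keeps growing.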

However in Reality...

The pro players will mostly be ranked highly at the 80th or 90th percentile and not at the hundredth percentile because they still lose some matches from time to time (mostly due to base cap or sheer bad luck) and that dents their Elo severely because they are supposed to win that match. This means the players put on that team wouldn't be that nub.

You will also observe the further up you go in Elo, the more you're paired up with poor players to "balance" out the Elo depending on the amount of players active and searching for matches. On good play times when there are a lot of players, you get challenging matches and the matchmaker works great. On bad play times you'll want to pull your hair out while you watch Trial mechs run around in 3PV like headless chickens.

When your Elo gets high enough

You become the anomaly in the system and you start nub bashing and ride the PPC pop tart train. The best way for you to realise this is to somehow get yourself into a drop with 3 other top players and observe the massive differences in skill.

Some Screenshots to Illustrate

Simply look at the damage and the mech chassis/variants.

Win
http://flic.kr/p/ihMd8Z

Loss
http://flic.kr/p/ihMd3i

Edited by D04S02B04, 12 December 2013 - 08:25 PM.


#46 Bhael Fire

    Banned - Cheating

  • Ace Of Spades
  • 4,002 posts
  • Location: The Outback wastes of planet Outreach.

Posted 12 December 2013 - 08:33 PM

I'm all for ANYTHING that makes the MM process more granular; a process that takes a lot more factors into consideration when matching players against each other would be fantastic.

#47 JimboFBX

    Member

  • 345 posts

Posted 12 December 2013 - 09:12 PM

while I agree that each user's variant needs its own rating, I don't think more elo is a solution. A ladder is not a good tool for multiplayer matchmaking

here's a hint: you need a second attribute called "confidence"
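One possible reading of "confidence", sketched in Python: each rating carries an uncertainty that shrinks as matches are played, and uncertain ratings move in bigger steps. This is an illustrative heuristic only. It is not the Glicko or TrueSkill update, and every name and constant below (`ConfidentRating`, `sigma`, the decay rate) is an assumption.

```python
from dataclasses import dataclass

INITIAL_SIGMA = 350.0  # assumption: uncertainty of a brand-new rating
MIN_SIGMA = 50.0       # assumption: floor once a rating has settled
SIGMA_DECAY = 0.95     # assumption: shrink factor applied per match
BASE_K = 16.0          # assumption: step size for a fully settled rating


@dataclass
class ConfidentRating:
    elo: float = 1200.0
    sigma: float = INITIAL_SIGMA  # high sigma means low confidence

    def update(self, opponent_elo: float, score: float) -> None:
        expected = 1.0 / (1.0 + 10.0 ** ((opponent_elo - self.elo) / 400.0))
        # Low-confidence ratings take larger steps (up to 2x BASE_K).
        k = BASE_K * (1.0 + self.sigma / INITIAL_SIGMA)
        self.elo += k * (score - expected)
        # Each match played makes the rating a little more trustworthy.
        self.sigma = max(MIN_SIGMA, self.sigma * SIGMA_DECAY)


new_variant = ConfidentRating()
settled_variant = ConfidentRating(sigma=MIN_SIGMA)
new_variant.update(1200.0, 1.0)
settled_variant.update(1200.0, 1.0)
print(new_variant.elo, settled_variant.elo)
```

A matchmaker could then also look at `sigma` when deciding how much to trust a rating, instead of relying on a hard match-count cutoff.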

#48 Bhael Fire

    Banned - Cheating

  • Ace Of Spades
  • 4,002 posts
  • Location: The Outback wastes of planet Outreach.

Posted 12 December 2013 - 09:15 PM

JimboFBX, on 12 December 2013 - 09:12 PM, said:

I don't think more elo is a solution.


What is "more" Elo?

#49 Mech Wrench

    Member

  • Philanthropist
  • 222 posts
  • Location: Alaska

Posted 12 December 2013 - 09:28 PM

I like this, Homeless Bill, just as much as I liked your idea for how to deal with pinpoint cheese-boat alphas when we saw ghost heat introduced. You're a pretty fart smeller. I'd like to see your ideas discussed on NGNG...

#50 RickySpanish

    Member

  • Veteran Founder
  • 3,523 posts
  • Location: Wubbing your comrades

Posted 12 December 2013 - 09:32 PM

I have a feeling that this might all be shaken up once CW is introduced in some meaningful way. If matches are dictated not only by Elo, but by your position in a 2D or 3D space (the Inner Sphere), which takes some time to travel across, the game may end up working very differently. At least I hope this is the case - it would be cool to have Merc units and Houses stake their claim ala Eve Online. Of course, that's assuming CW ever comes out, and from what I've seen of the way you are able to earn reputation for every faction (seriously wtf) theeeen stuff may end up still sucking.

#51 MischiefSC

    Member

  • The Benefactor
  • 16,697 posts

Posted 12 December 2013 - 09:40 PM

Homeless Bill, on 11 December 2013 - 03:12 PM, said:

The Problem

So, I've come to a point where my assault Elo makes the majority of assault 'mechs I can play unfun. Almost without fail (and especially as it gets later), I'm paired with super angry players. SJR and Lords are all over the place, and a single mistake usually means death.

That's not the problem; in fact, whenever I'm in my 733C or 732, they're the only people I want to play against. I love trying to out-poptart the best snipers. Killing Jager, Siri, Villz, or Mav gives me a sense of satisfaction that can't be matched by the howls of 1000 dead scrubs.

The problem is when I trot out the Pretty Baby. Or any other Awesome. Or a brawler. Life instantly becomes ****** as I cower behind a rock, waiting for my chance to do some good. Two minutes later, six of my guys got smoked, and it's over.

I cannot, and will never be able to, affect matches to the same degree in my Pretty Baby as my 733C. Even my 733P running a cheese-free loadout isn't even in the same ballpark in terms of effectiveness.

This basically just makes me not play a lot of my 'mechs anymore (particularly Awesomes =[). Not only is it unfun to lose with near certainty, but I also don't want to crash my Elo down to where I don't get to play the fun people anymore.

The Solution

While I understand that splitting Elo ratings based on weight class is better than nothing, it's also not nearly as good or granular as it needs to be. Having said that, having a different Elo for each chassis or variant without any special work would be a matchmaking disaster since most people don't have a statistically significant number of matches (making it much more randomized than now). Here's my proposal for how to fix that:
  • Each variant has its own Elo rating. If a variant does not have a significant number of matches (probably around 50), it uses its base chassis' aggregate Elo rating.
  • Each chassis has an aggregate rating that is calculated by averaging all its variants' Elo ratings. If a chassis does not have a significant number of matches, it uses its weight class' aggregate rating.
  • Each weight class has an aggregate rating that is calculated by averaging all appropriate aggregate chassis Elo ratings.
  • Unplayed variants do not receive a rating until their first match, at which point they are assigned their parent tier's aggregate rating as a starting point.
  • Just like now, each weight class is assigned a default value for newbies (and thus the very first variant they play gets that default rating).
Because this is all pretty abstract, I'll try to paint a picture of how this would work. Some of this will probably be confusing, so if there's anything I can clarify, please let me know:
  • Poemless Bill signs up for an account and buys his first 'mech: an Awesome 8Q. I don't remember what new player Elo is currently, so I'll just use 1200 as the example.
  • Poemless Bill has a tough time adjusting to the game and scrapes to the end of the Basic efficiencies (25 matches) with an 1100 Elo for his AWS-8Q. Though he doesn't have 50 matches yet, his overall assault Elo rating is also at 1100 because that's the only rating there is to average - no other 'mechs have been played.
  • Bill buys another Awesome: the 9M. It receives 1100 as a starting Elo (the assault aggregate Elo). He ******* loves this 'mech, so he does better and pulls his Elo up to 1500 in his first 25 matches with that variant. Keep in mind that the matchmaker still isn't trying to match the 9M at its own rating - because there isn't a statistically significant number of matches for that variant or chassis, it's using the assault aggregate Elo (which rose from 1100 to around 1300 through the course of playing the 9M).
  • Now that 50 matches have been played, the Awesome chassis has enough matches to have its own rating. Regardless of how much catastrophic failure ensues in any other assault 'mechs, the Awesome's Elo rating will not be affected.
  • Bill has been poptarted one too many times, and buys a 733C in a fit of rage. It gets assigned the assault aggregate Elo (1300), and then he proceeds to cheese his way through the ranks, ending up with a 2300 Elo rating at the end of 50 matches.
  • Though the matchmaker has been using the assault aggregate Elo for matches (rising from 1300 to 1800 as the 733C is played), the minute 50 matches have been reached for the 733C, its own rating is used. Because the matchmaker is now putting him in the 2300 range, he has a much harder time climbing through the ranks and can only get to 2400 after another 50 matches. Much like how the Awesome is locked in and unaffected by all this, the 733C's rating will never be affected by any other chassis or variant after those first 50 matches.
  • Bill buys his final Awesome: the Pretty Baby. It is assigned the Awesome aggregate Elo (1300) since the chassis has enough matches to override the assault aggregate Elo. He regrets his purchase immediately and goes on an 18-hour drinking binge that results in a 500 Pretty Baby Elo after 150 matches. After the first 50 matches, the matchmaker stops using the Awesome aggregate Elo and prefers the Pretty Baby's specific Elo (though it will still use the Awesome aggregate Elo for the 8Q and 9M until they, too, have 50 matches).
  • His Awesome aggregate Elo is now around 900, and any future Awesomes will be assigned that as default and use the aggregate for matchmaking until they hit 50 matches.
  • Any other Highlanders that are purchased will be assigned the Highlander aggregate Elo (currently 2400, since there are over 50 matches in the chassis and 733C is the only one owned). Let's say the 733P is next and it was ugly. Its first 100 matches bring the Highlander aggregate Elo down to 1700, which will then be bestowed on any future Highlanders. The 733P's inglorious slide has no effect on the 733C's rating.
  • Poemless Bill buys a Stalker 3F. Because no matches have been played in any Stalker, it is assigned the assault aggregate Elo (1300 based on Highlander's 1700 with 200 matches and Awesome's 900 with 200 matches).
It's a simple tree/hierarchy that, though it would take some effort to code, would really make the game a hell of a lot more balanced for the dedicated players. Sure, it's never going to be perfect (you could own four D-DCs, each with a different loadout), but it's a hell of a lot better than what we have now.
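The tree can be sketched in a few lines of Python. This is one reading of the proposal, not a definitive implementation: aggregates are plain averages as the bullet points describe, the `garage` dictionary holds a single weight class (assault), and all function and class names are hypothetical. The numbers reproduce the early steps of the Poemless Bill walkthrough.

```python
from dataclasses import dataclass
from statistics import mean

MIN_MATCHES = 50   # "probably around 50" in the proposal
NEWBIE_ELO = 1200  # placeholder default used in the proposal's example


@dataclass
class Variant:
    chassis: str
    elo: float
    matches: int = 0


def chassis_matches(garage, chassis):
    """Total matches played across all variants of one chassis."""
    return sum(v.matches for v in garage.values() if v.chassis == chassis)


def chassis_aggregate(garage, chassis):
    """Average of the Elo ratings of every played variant of a chassis."""
    return mean(v.elo for v in garage.values() if v.chassis == chassis)


def class_aggregate(garage):
    """Average of the chassis aggregates; newbie default if nothing played."""
    chassis_names = {v.chassis for v in garage.values()}
    if not chassis_names:
        return NEWBIE_ELO
    return mean(chassis_aggregate(garage, c) for c in chassis_names)


def starting_elo(garage, chassis):
    """Rating handed to a newly played variant of the given chassis."""
    if chassis_matches(garage, chassis) >= MIN_MATCHES:
        return chassis_aggregate(garage, chassis)
    return class_aggregate(garage)


def matchmaking_elo(garage, name):
    """Rating the matchmaker would use for one variant."""
    variant = garage[name]
    if variant.matches >= MIN_MATCHES:
        return variant.elo
    return starting_elo(garage, variant.chassis)


garage = {
    "AWS-8Q": Variant("Awesome", 1100, 25),
    "AWS-9M": Variant("Awesome", 1500, 25),
}
print(starting_elo(garage, "Highlander"))  # assault aggregate for the 733C

garage["HGN-733C"] = Variant("Highlander", 2300, 50)
print(class_aggregate(garage))             # assault aggregate after the 733C
print(starting_elo(garage, "Awesome"))     # what the Pretty Baby would get
```

Storing only per-variant ratings and deriving the aggregates on demand keeps the variant ratings independent of each other, which is the property the walkthrough relies on.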


I'd greatly appreciate critical feedback. I intend to do a serious write-up later on, and I'd like to get all of the arguments and answers out on the table ahead of time.

TL;DR: I don't want to roll scrubs in my 733C, but I also don't want to die repeatedly to Siri in my 9M. Adding granular capability to the Elo rating system would solve this problem.


These are excellent ideas. I find myself in some similar situations. My Victor with 2xPPC/AC20 or 2xAC5s+ 2xPPCs does what it's supposed to do. So does my 733. I actually run a 733 so I can put 4xSSRMs on it for backup when some lights get close.

I want to play with my Battlemasters though. Currently my BLR 1S has a win/loss of 0.53. My 733 is 1.53. I hate that. It irritates the {Scrap} out of me. Because of the current meta though there is no way for me to viably play with my Battlemasters without being a crushing burden to my team. Stupidly I can do slightly less than humiliating with the BLR-1D by... you guessed it, 2xPPCs and 2xAC5s. Lean instead of jump. It's not terrible and surprisingly quick for an assault. It's no poptart, but it's not trash.

Entertainingly, though, you and I are living proof that Elo is actually driven in a lot of ways by our behavior. CRAZY, I know.

Do you drop in premade teams much? Do you win more in premade or pugging?

The problem with a different Elo for each chassis though is how many matches it takes to seat Elo. There just isn't an accurate way to do it faster than 100-300 matches, depending on your skill level. That's a {Scrap} ton of matches.

The real fix, honestly? Make PPCs and ACs do damage over time. Maybe 0.3 to 0.5 seconds. Fast enough to put it pretty much all on a location unless someone is sprinting past, long enough to make poptarting no longer pinpoint. Still viable, just not pinpoint. Leave Gauss and missiles the way they are but remove charge-up time for Gauss. No poptarts with more than 1 Gauss really possible and it's a heavy, clumsy, explosive weapon.

That would do far more to fix some of the issues we're associating with Elo than any Elo fix. The problem is less Elo and more the nature of the peak meta. It turns the whole game experience stale and invalidates everything else. You start to push things into DOT vs pinpoint and you change how everything from concentration of fire to positioning to how different builds work. It will put some 'rock/paper/scissors' back in the game as lasers become far more viable, as do missiles - even with their screwy hit detection.

#52 80sGlamRockSensation David Bowie

    Member

  • The People's Hero
  • 4,001 posts
  • Location: The Island

Posted 12 December 2013 - 10:08 PM

The worst part is, 4/5 matches I end up failing the MM when I solo in my Heavies.

It kinda sucks when you get to the point where you literally are unable to play the game because of a game feature.

#53 Deathlike

    Member

  • Littlest Helper
  • 29,240 posts
  • Location: #NOToTaterBalance #BadBalanceOverlordIsBad

Posted 12 December 2013 - 11:54 PM

mwhighlander, on 12 December 2013 - 10:08 PM, said:

The worst part is, 4/5 matches I end up failing the MM when I solo in my Heavies.

It kinda sucks when you get to the point where you literally are unable to play the game because of a game feature.


If it's not a "bug", it's a "feature".

Resigned to fate by the MM. gg close




