Elo Worthless


298 replies to this topic

#61 nehebkau

    Member

  • PipPipPipPipPipPipPipPipPip
  • 3,386 posts
  • LocationIn a water-rights dispute with a Beaver

Posted 12 November 2013 - 12:50 PM

The problem is how ELO handles groups and it should be tweaked depending on the size of the group.

Now, we all know the saying that the whole is greater than the sum of its parts. We use that saying because it is, more often than not, true. Applying this to the game, a group is better than its individual parts, so why not account for that in ELO?

If we have two players who decide to group together, and each of them has a 100 ELO rating (pulled from my *** 'cause I like nice, small, easy-to-type numbers), wouldn't it be more in line to expect them to jointly act as if they had a higher ELO than the 200 they bring to the game?
Solo
ELO <Player A> + ELO <Player B> = 100 + 100 = 200 (combined ELO)
Grouped
ELO<Player A> + ELO <Player B> + group adjustment = 100 + 100 + 50 = 250 (Adjusted ELO)

Figuring out how best to weight group size would require some consideration, but I think the effect of grouping on performance would probably follow a y = 2^x curve.
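
As a rough sketch of that idea, the adjustment could look like this in Python. The function name, the 50-point base bonus and the doubling per extra member are illustrative assumptions, not anything PGI has described:

```python
def combined_elo(ratings, grouped, base_bonus=50.0):
    """Sum of individual ratings, plus a hypothetical group bonus.

    Assumption: the bonus is base_bonus for a 2-player group and doubles
    with each additional member, loosely following the y = 2^x guess.
    """
    total = float(sum(ratings))
    if grouped and len(ratings) >= 2:
        total += base_bonus * 2 ** (len(ratings) - 2)
    return total


print(combined_elo([100, 100], grouped=False))  # 200.0
print(combined_elo([100, 100], grouped=True))   # 250.0
```

With these placeholder numbers a 4-player group of 100-rated players would get a 200-point bonus, which shows how quickly an exponential curve grows and why the weighting would need tuning.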

[Redacted]

Edited by Niko Snow, 12 November 2013 - 01:17 PM.


#62 arghmace

    Member

  • PipPipPipPipPipPipPip
  • Elite Founder
  • Elite Founder
  • 845 posts
  • LocationFinland

Posted 12 November 2013 - 01:04 PM

There's something badly wrong with MM for sure. It seems that every other day you get in the winning team and every other day winning is an impossibility. Just today I played 3 hours straight with a 4 man premade and we lost just about every battle. Wasn't because we sucked since in pretty much every battle we were the highest scoring players. Seems that some evenings the match maker just decides to put your premade along with 8 noobs while the other team has 3x4 experienced players.

#63 nehebkau

    Member

  • PipPipPipPipPipPipPipPipPip
  • 3,386 posts
  • LocationIn a water-rights dispute with a Beaver

Posted 12 November 2013 - 01:19 PM

View PostNiko Snow, on 12 November 2013 - 01:17 PM, said:

Hey all,

Appreciate the feedback on ELO, but please do refrain from being too tongue in cheek in your replies. Tone in writing can throw us for a loop when trying to build feedback, only to realize that I've been bel-air'd.


It was only the last sentence where my tongue firmly stuck in my cheek. :) ;) BTW what does bel-air'd mean? (Why yes I am ancient.)

Edited by nehebkau, 12 November 2013 - 01:32 PM.


#64 Sarsaparilla Kid

    Member

  • PipPipPipPipPipPipPip
  • The Benefactor
  • The Benefactor
  • 664 posts
  • LocationGold Country

Posted 12 November 2013 - 02:36 PM

View Postnehebkau, on 12 November 2013 - 01:19 PM, said:


It was only the last sentence where my tongue firmly stuck in my cheek. :) ;) BTW what does bel-air'd mean? (Why yes I am ancient.)


From the Urban Dictionary: bel-aired -"hooking a reader on to a particular story then replacing the climactic part with lyrics from the "Fresh Prince of Bel-Air" theme song." or "Used to refer to the act of following a video link purportedly offering proof of some act, only to find that the video in question is the opening credits to the 90's tv show "Fresh Prince of Bel Air""

#65 80sGlamRockSensation David Bowie

    Member

  • PipPipPipPipPipPipPipPipPip
  • Veteran Founder
  • Veteran Founder
  • 3,994 posts
  • LocationThe Island

Posted 12 November 2013 - 02:41 PM

<Insert Morpheus Meme>

What if I told you, my Elo is quite low, but W/L is still over 5 to 1?



Elo doesn't work and has never really worked well.

#66 Abivard

    Member

  • PipPipPipPipPipPipPipPip
  • Shredder
  • 1,935 posts
  • LocationFree Rasalhague Republic

Posted 12 November 2013 - 02:42 PM

There is an old saying:

'You can't stiffen a bucket of spit with a handful of steel shot.'


But that is what PGI is trying to do with their silly ELO.

#67 Nick Makiaveli

    Member

  • PipPipPipPipPipPipPipPipPip
  • Bridesmaid
  • Bridesmaid
  • 2,188 posts
  • LocationKnee deep in mechdrek

Posted 12 November 2013 - 03:08 PM

View PostFooooo, on 12 November 2013 - 03:54 AM, said:



I got something totally different.

Faction players = Fighting on moving fronts. (think heavy gears online play with North VS South)
Faction Players (PGI wishlist) = Merc style planet takeovers.

Lone Wolf = Fighting on moving fronts for whatever faction you align to for those battles, or just a random faction each battle. Also can be filler for merc corps needing an extra player. (Although I'm not sure how well that would work... most merc corps will have full teams and not really be a player down.)

Merc Player = Fighting on the Periphery for planets. Full planet conquest is available.



The Periphery wasn't mentioned that I recall. The missing piece is Houses hiring large merc units to hold planets for them. Or giving them control (including a share of the profits/materials) of a world as part of their contracts. Remember a large unit is going to have a large support unit, which means lots of civilians, especially once you toss in families.

So yes Merc units will be taking planets, and Faction players will be helping etc.

I think all of this sounded good on paper, then they realized that balancing with weight limits and ELO meant it would be hard to actually take a planet. Also, not using weight limits means that people would drop with all Assaults in Assault mode to farm wins.

I think it's going to settle down to where there are battles for a given world and all fights count, and the merc unit with the best record on the winning side gets control of the captured world.

That or just a declared set of fights, i.e. this week we have Davion invading <insert Liao held world>, while Steiner is contesting.... and Faction players will have their wins count on that fight, Mercs will take contracts for given fights, and Lone Wolves will just drop and be inserted into random fights.

#68 Tahribator

    Member

  • PipPipPipPipPipPipPipPip
  • Fire
  • Fire
  • 1,565 posts

Posted 12 November 2013 - 03:22 PM

My biggest problem is the matchmaker totally botches it when premades are involved. Some premades drop with 4 assaults, some with lights and the matchmaker completely ignores the premade composition. In the end we get these totally unbalanced tonnage matches.

Playing in the US TZ and the EU TZ is very different. In the US TZ you have many premades, leading to unbalanced matches with constant steamrolls, while the EU TZ has few to no premades, which makes for more balanced matches.

Edited by Tahribator, 12 November 2013 - 03:40 PM.


#69 Blurry

    Member

  • PipPipPipPipPipPip
  • 382 posts
  • LocationGreat White North

Posted 12 November 2013 - 04:45 PM

Tongue ripped out of the skull due to horrible matchmaking. Or shoved through the cheek.

All straight pugstomps in a row tonight. You know you are in trouble when the other team is filled and almost ready to go.
12-1 isn't fun, where the top score is 30. With things this bad, is there even any matchmaking going on, other than a well set-up team on one side and a group of 12 solos picked out on the other? That is exactly what it was for the past 6 or so games.

I solo, so I don't mind getting killed or losing on a streak. I know the matchmaker loves streaks, but the stomps are grinding.
It is so demoralizing to get stomped game after game. Then to turn around and stomp.

You don't learn from that either, because one minute you can't stay alive for long and the next you are dishing out a lot of damage.
All I can say is whatever it is doing now isn't fun and it isn't working, at least for me.

It is killing the game for me. That and the freezing.

#70 Roadbeer

    Member

  • PipPipPipPipPipPipPipPipPipPip
  • Elite Founder
  • 8,160 posts
  • LocationWazan, Zion Cluster

Posted 12 November 2013 - 04:52 PM

View PostBlurry, on 12 November 2013 - 04:45 PM, said:

You know you are in trouble when the other team is filled and almost ready to go.
12-1 isn't fun, where the top score is 30. With things this bad, is there even any matchmaking going on, other than a well set-up team on one side and a group of 12 solos picked out on the other? That is exactly what it was for the past 6 or so games.


I'm not mocking you, I'm actually curious.

How exactly do you think matchmaking works in relation to Elo, premade groups, weight, and what you're looking at in the Red Team lineup, as well as your own team?

I truly want to know, because it appears there are quite a few misconceptions among new players about how matching works, or doesn't, whichever.

#71 MischiefSC

    Member

  • PipPipPipPipPipPipPipPipPipPipPipPip
  • The Benefactor
  • The Benefactor
  • 16,697 posts

Posted 12 November 2013 - 05:13 PM

View PostNiko Snow, on 12 November 2013 - 01:17 PM, said:

Hey all,

Appreciate the feedback on ELO, but please do refrain from being too tongue in cheek in your replies. Tone in writing can throw us for a loop when trying to build feedback, only to realize that I've been bel-air'd.


Best feedback I can give -

Please give us more data on our own win/loss breakdown. Human psychology being what it is, it would help to have the affirmation that even though we lost the last 3 matches we won the 7 before that, and that while we may think we're 'getting terrible matches' we're just tricking ourselves.

While I get the logic of wanting to keep Elo under wraps for now, it would be good to see some Elo-related stats: the % of matches we win vs lose against the Elo prediction, as a measure of how much we're improving. That'd be more reliable than trying to divine from pig intestines and estimated win/loss rates in a given weight class whether we're doing better or worse.

Stats. They taste great and are less filling, we can gobble them up all day! Very low calorie. You're not going to give us so many data points that we'll get fat and lazy and refuse to play anymore, I promise.

#72 MischiefSC

    Member

  • PipPipPipPipPipPipPipPipPipPipPipPip
  • The Benefactor
  • The Benefactor
  • 16,697 posts

Posted 12 November 2013 - 05:16 PM

View PostRandalf Yorgen, on 11 November 2013 - 09:27 PM, said:


Track all the stats you want it's not going to change the fact, and yes I say fact because time and time and time and time again the numbers will say one thing but reality is something very different. You can hold up all the spreadsheets in the world and praise how something is but at the other end of it it's a very different story.


I'm not sure the word 'fact' means what you think it does.

If you have a fact that shows that the matchmaker isn't actually more accurate than random skill matching + weight balancing please present it. All the statistics currently available plus probability theory itself say otherwise but I'm perfectly willing to accept that I'm wrong when faced with evidence to the contrary.

See, you can't trust your emotional impression of how something works. It's stunningly unreliable. Show some data proving your point. I can and have thrown up data time and again to support mine, I'm still looking for someone to show data points showing the opposite, something other than some generalizations on how things feel.

#73 Grits N Gravy

    Member

  • PipPipPipPipPipPip
  • 287 posts

Posted 13 November 2013 - 12:19 PM

Elo works better for people who primarily run in 4 man groups. It works decently if you're a mediocre and below skilled solo dropper. It's painful if you're an above average skilled solo dropper.

If you run primarily 4 mans, you will get to a point where you can expect to run into mostly teams with other groups of 4 and the occasional pug. Elo works well here because it keeps groups of 4 out of too many PUG stomps. This is the primary reason why people reported better matchmaking after Elo implementation. Elo works decently if you're mediocre and below because you see far fewer groups of coordinated players. These two groups will tend to have more stable results and not be as streaky. Over a given period of time the standard deviation of their Elo score will tend to be smaller.

Skilled players dropping solo will tend to run more streaky. Their Elo scores over a period of time will have a larger standard deviation. The top range of their possible matches is the lower match range of the 4 man sets. Since this area of convergence has fewer of the 4 man groups in it, only one team ends up with a coordinated 4 man.

Thus the skilled solo dropper ends up losing a bunch more up here and yet ends up with the same Elo score. Losses to higher ranked opponents do not drive Elo scores down rapidly, even with a K factor of 50. It is possible that a skilled solo player ends up getting beaten 6 times, then winning 4, and arrives at the same Elo as where he started. And that's conservative; it's possible to maintain an Elo score while only winning 33% of your games. This is what I call Elo hell: getting stuck in a cycle where winning 1/3 keeps you in the same place, going 1/3 over and over.
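
To put rough numbers on this, here is a sketch that assumes the textbook logistic Elo formula for a one-on-one rating. MWO's team implementation is not public, so treat the helper names and the exact figures as assumptions. A rating holds steady when the actual win rate equals the expected score, so the question is how much stronger the average opponent has to be for a given win rate to break even:

```python
import math


def expected_score(own, opponent):
    """Textbook logistic Elo expectation (probability of winning)."""
    return 1.0 / (1.0 + 10 ** ((opponent - own) / 400.0))


def break_even_gap(win_rate):
    """Opponent-minus-own rating gap at which win_rate equals the expectation."""
    return 400.0 * math.log10(1.0 / win_rate - 1.0)


print(round(break_even_gap(0.4), 1))    # 70.4
print(round(break_even_gap(1 / 3), 1))  # 120.4
```

Under these assumptions, winning 4 games in 10 leaves a rating unchanged when opponents average about 70 points higher, and winning 1 in 3 does so at about 120 points higher.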

Elo hell is a function of wide variance in matchmaking parameters, i.e. allowing two players of vastly different Elo scores to play each other, and of large K factors. The issue can largely be mitigated in two ways: switching from the logistic formulation of Elo that MWO uses to the Gaussian formulation, and tightening the K factor. The result will be greater population density at scores within 1 standard deviation of the mean, which will allow you to tighten your matchmaking criteria without increasing wait times.
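
For reference, the two formulations differ only in the curve that turns a rating gap into an expected score. The sketch below compares them, assuming the usual 400-point logistic scale and, for the Gaussian version, a 200-point standard deviation per player as in Elo's original model. Both scale constants are assumptions here, since MWO's values are not stated:

```python
import math


def logistic_expected(diff):
    """Expected score for the player who is `diff` points higher (logistic)."""
    return 1.0 / (1.0 + 10 ** (-diff / 400.0))


def gaussian_expected(diff, sigma=200.0):
    """Same expectation under a normal model of per-game performance."""
    # The difference of two independent performances has SD sigma * sqrt(2).
    z = diff / (sigma * math.sqrt(2.0))
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


for diff in (0, 200, 400, 800):
    print(diff, round(logistic_expected(diff), 3), round(gaussian_expected(diff), 3))
```

At large gaps the logistic curve has the fatter tail, so it gives a heavy underdog a somewhat better chance than the Gaussian curve does for the same rating difference.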

Edited by Grits N Gravy, 13 November 2013 - 12:50 PM.


#74 Mudhutwarrior

    Member

  • PipPipPipPipPipPipPipPipPip
  • The 1 Percent
  • The 1 Percent
  • 4,183 posts
  • LocationThe perimieter, out here there are no stars.

Posted 13 November 2013 - 01:03 PM

I have to agree the difference between the EU times and US is notable. Been stuck the past few days having to play at peak premade times and it's one non-stop roll after another. On top of that, in the past week it's the same map over and over, four or five times in a row. I did canyon assault 9 times in a row on Tuesday. It's just crazy.
Elo does not work at all, it seems, and I know guys want to blame pugs, but in all the matches the past few days only once did the premades try to communicate with the rest of us. PGI can't fix that beyond in-game VOIP. It would be up to you teams to do it.

#75 Mudhutwarrior

    Member

  • PipPipPipPipPipPipPipPipPip
  • The 1 Percent
  • The 1 Percent
  • 4,183 posts
  • LocationThe perimieter, out here there are no stars.

Posted 13 November 2013 - 01:06 PM

View PostGrits N Gravy, on 13 November 2013 - 12:19 PM, said:

Elo works better for people who primarily run in 4 man groups. It works decently if you're a mediocre and below skilled solo dropper. It's painful if you're an above average skilled solo dropper.

If you run primarily 4 mans, you will get to a point where you can expect to run into mostly teams with other groups of 4 and the occasional pug. Elo works well here because it keeps groups of 4 out of too many PUG stomps. This is the primary reason why people reported better matchmaking after Elo implementation. Elo works decently if you're mediocre and below because you see far fewer groups of coordinated players. These two groups will tend to have more stable results and not be as streaky. Over a given period of time the standard deviation of their Elo score will tend to be smaller.

Skilled players dropping solo will tend to run more streaky. Their Elo scores over a period of time will have a larger standard deviation. The top range of their possible matches is the lower match range of the 4 man sets. Since this area of convergence has fewer of the 4 man groups in it, only one team ends up with a coordinated 4 man.

Thus the skilled solo dropper ends up losing a bunch more up here and yet ends up with the same Elo score. Losses to higher ranked opponents do not drive Elo scores down rapidly, even with a K factor of 50. It is possible that a skilled solo player ends up getting beaten 6 times, then winning 4, and arrives at the same Elo as where he started. And that's conservative; it's possible to maintain an Elo score while only winning 33% of your games. This is what I call Elo hell: getting stuck in a cycle where winning 1/3 keeps you in the same place, going 1/3 over and over.

Elo hell is a function of wide variance in matchmaking parameters, i.e. allowing two players of vastly different Elo scores to play each other, and of large K factors. The issue can largely be mitigated in two ways: switching from the logistic formulation of Elo that MWO uses to the Gaussian formulation, and tightening the K factor. The result will be greater population density at scores within 1 standard deviation of the mean, which will allow you to tighten your matchmaking criteria without increasing wait times.



ELO works best for me? I am sure not above average and I pug only, so I don't know where your observation comes from. I get stomped consistently and by very good premades. My KD is .74, so explain that one please.

#76 Malino

    Rookie

  • 7 posts

Posted 13 November 2013 - 01:16 PM

http://postimg.org/image/kzwbwjg4h/


I'm talking about weight imbalance making ELO matching worthless, example here: 890 tonnes -v- 665 tonnes.

I have plenty more examples if needed.

#77 Nick Makiaveli

    Member

  • PipPipPipPipPipPipPipPipPip
  • Bridesmaid
  • Bridesmaid
  • 2,188 posts
  • LocationKnee deep in mechdrek

Posted 13 November 2013 - 01:17 PM

View PostMudhutwarrior, on 13 November 2013 - 01:06 PM, said:



ELO works best for me? I am sure not above average and I pug only, so I don't know where your observation comes from. I get stomped consistently and by very good premades. My KD is .74, so explain that one please.


Please explain what you think KDR has to do with MWO's version of ELO? Winning and losing matter, not your personal KDR.

Again with the evil premades. You really have issues with losing don't you? You say you are "not above average" yet it's their fault you lose?

#78 Thejuggla

    Member

  • PipPipPipPipPipPip
  • 301 posts

Posted 13 November 2013 - 01:24 PM

It's not perfect, but since it's been added I don't run into people driving straight into me at full throttle, shooting point blank because they can't hit me otherwise.

#79 MischiefSC

    Member

  • PipPipPipPipPipPipPipPipPipPipPipPip
  • The Benefactor
  • The Benefactor
  • 16,697 posts

Posted 13 November 2013 - 01:28 PM

View PostGrits N Gravy, on 13 November 2013 - 12:19 PM, said:

Elo works better for people who primarily run in 4 man groups. It works decently if you're a mediocre and below skilled solo dropper. It's painful if you're an above average skilled solo dropper.

If you run primarily 4 mans, you will get to a point where you can expect to run into mostly teams with other groups of 4 and the occasional pug. Elo works well here because it keeps groups of 4 out of too many PUG stomps. This is the primary reason why people reported better matchmaking after Elo implementation. Elo works decently if you're mediocre and below because you see far fewer groups of coordinated players. These two groups will tend to have more stable results and not be as streaky. Over a given period of time the standard deviation of their Elo score will tend to be smaller.

Skilled players dropping solo will tend to run more streaky. Their Elo scores over a period of time will have a larger standard deviation. The top range of their possible matches is the lower match range of the 4 man sets. Since this area of convergence has fewer of the 4 man groups in it, only one team ends up with a coordinated 4 man.

Thus the skilled solo dropper ends up losing a bunch more up here and yet ends up with the same Elo score. Losses to higher ranked opponents do not drive Elo scores down rapidly, even with a K factor of 50. It is possible that a skilled solo player ends up getting beaten 6 times, then winning 4, and arrives at the same Elo as where he started. And that's conservative; it's possible to maintain an Elo score while only winning 33% of your games. This is what I call Elo hell: getting stuck in a cycle where winning 1/3 keeps you in the same place, going 1/3 over and over.

Elo hell is a function of wide variance in matchmaking parameters, i.e. allowing two players of vastly different Elo scores to play each other, and of large K factors. The issue can largely be mitigated in two ways: switching from the logistic formulation of Elo that MWO uses to the Gaussian formulation, and tightening the K factor. The result will be greater population density at scores within 1 standard deviation of the mean, which will allow you to tighten your matchmaking criteria without increasing wait times.


Okay. I am game with that. I would also strongly, STRONGLY recommend splitting premade and pug Elo. There's just no way you can mix the two without killing variance. It's not just about the 4man player who is pugging being thrown into a match he's not going to win; it's that he's effectively throwing off the prediction for his whole team's balance and skewing Elo for the whole match. You get 3 or 4 solid premade players who are pugging in a match and you could have a margin of error for the total value of their team that could be off by 200 points or more.

For the predictions Elo makes to be accurate, it needs to recognize the difference in performance between a player working in a premade team and the same player working as a pug. The absolute worst case scenario is that people who play in premades a lot will have to play more pug matches to get their pug Elo settled correctly. Their premade Elo, which is an aggregate, will only be as significant as their % fraction of the team, i.e. 25% relevant for a 4man team, thus buffering any disadvantage it may generate for them, and they would only generate premade Elo when dropping in a premade team.

This would be largely transparent to premade players, aside from making their pugging less onerous, and it would create less variance for the matchmaker for everyone by preventing poor estimates of a player's value to his team based on premade success that doesn't translate into pug performance.

I would also disagree that it's a problem for skilled pugs. I find that in the weights where I've got a 1.3 or better win/loss while pugging I have pretty good games - most of my teammates are premades and truly 'challenged' people are pretty rare. You just follow whoever Alpha lance is, they probably have a plan and do your best to contribute.

Where it becomes hellish is when you get into the lower end of the 'mostly premade' spectrum, because those players pug too, and when they do they're dropped as a pug in a match full of very predatory players, and many premade teams treat pugs like meat shields. It's a bitter irony, but at the end of the day it's not good for the matchmaker.

Edited to add -

I also absolutely don't get how this wouldn't extend wait times. There is a limit to the number of players hitting 'launch' within any 120 second interval. The tighter you make the criteria for finding that match, the less likely you are to hit your goals. The three search criteria are

1. Match Elo by band, as you discussed. The less deviation between Elo scores among all players the better.

2. In the absence of 1 it matches total team values by widening the band's width, low with high and such, to reach a common value.

3. Tonnage between all mechs on both teams.

3 severely impacts 1's ability to find 24 matches within 120 seconds. They've said as much. That's why 2 exists as a criterion, since it's better than nothing at all. How are you proposing that tightening 1's criteria and reducing 2's implementation, without considerably widening 3, won't extend search times or, more to the point, cause the 'drop however you can' trigger that hits at 120 seconds?
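
The trade-off described above can be sketched as an Elo band that widens with queue time. Everything here except the 120-second cutoff mentioned in the post is a made-up placeholder (starting width, maximum width, linear growth, function names). It only illustrates how a pairing that is rejected early in the search becomes acceptable later:

```python
def elo_band(elapsed_s, start=50.0, widest=400.0, timeout_s=120.0):
    """Hypothetical half-width of the acceptable Elo band after elapsed_s seconds."""
    if elapsed_s >= timeout_s:
        # 'Drop however you can': no Elo restriction once the timer runs out.
        return float("inf")
    return start + (widest - start) * (elapsed_s / timeout_s)


def can_match(elo_a, elo_b, elapsed_s):
    """True if two ratings fall inside the current band."""
    return abs(elo_a - elo_b) <= elo_band(elapsed_s)


print(can_match(1500, 1700, 30))   # False (band is 137.5)
print(can_match(1500, 1700, 90))   # True  (band is 312.5)
print(can_match(1500, 2500, 120))  # True  (fallback, any gap allowed)
```

In this toy model, lowering `start` or `widest` pushes more searches toward the 120-second fallback unless another constraint, such as tonnage, is loosened to compensate.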

I'd also understood that the k-value changed from 50 to 5 a while ago. We started at 50 because nobody had any Elo data to speak of and it gave depth to the pool, accurate or not, from which to begin sorting ranks without creating too gradual of a curve. I could be wrong though. If so I absolutely agree that 5 is a way, way better k-value than 50.
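
For a sense of scale, the standard Elo update moves a rating by K times the difference between the actual result and the expected score. A minimal sketch, assuming that textbook update rule (how MWO applies it across a 12-player team is not stated in the thread):

```python
def rating_change(k, expected, won):
    """Textbook Elo update: K * (actual result - expected score)."""
    actual = 1.0 if won else 0.0
    return k * (actual - expected)


# An evenly matched game (expected score 0.5) that ends in a win:
print(rating_change(50, 0.5, True))  # 25.0
print(rating_change(5, 0.5, True))   # 2.5
```

So a single result swings the rating ten times further at K = 50 than at K = 5, which is why the smaller value gives steadier ratings once the pool has settled.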

Edited by MischiefSC, 13 November 2013 - 01:40 PM.


#80 nehebkau

    Member

  • PipPipPipPipPipPipPipPipPip
  • 3,386 posts
  • LocationIn a water-rights dispute with a Beaver

Posted 13 November 2013 - 01:32 PM

We don't need to go into all the maths <fecal matter> in regards to this issue. It's simple: if you are joining as a group, your total ELO should be higher than the sum of your individual ELO. The bigger the group, the higher that is. This would also account for the MUCH MUCH MUCH higher use of VOIP by pre-made groups over pugs.

The question being, how much better are 4 people who are on VOIP, have compatible mechs and have an understanding of each other's play-style than 4 strangers using chat with incompatible mechs? I'd say 50% better, by thumbnailing it. How about 8 players? I'd say still 50% better. I'd even go so far as to say 2 are still 50% better.

Shouldn't be too hard to manage in code.
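
Taken literally, the flat 50% bump would be a few lines. This is only a sketch of the suggestion as written: the 1.5 factor is the thumbnail figure above, and the function name and the choice to scale the summed rating are assumptions:

```python
def group_elo(ratings, premade, factor=1.5):
    """Summed rating, scaled by a flat hypothetical factor for premade groups."""
    total = float(sum(ratings))
    if premade and len(ratings) >= 2:
        return total * factor
    return total


print(group_elo([100, 100], premade=False))  # 200.0
print(group_elo([100, 100], premade=True))   # 300.0
```

One caveat: Elo predictions depend only on the difference between ratings, so a multiplier hands a bigger bump to higher-rated groups. A fixed point bonus per group would avoid that.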




