
Paging Karl Berg...Karl Berg, Please Pick Up The White Courtesy Phone...


1911 replies to this topic

#101 Modo44

    Member

  • Bad Company
  • 3,559 posts

Posted 09 April 2014 - 09:23 PM

From solo drops experience: Elo may be close for teams because the matches are very random. Some map sides have clear advantages, and dropping solo puts you at the mercy of your premade(s). This has to block movements away from the average for the best and worst players.

Also, when referring to Elo, "close" is "within 100", not "within 500+" like Paul spilled a while back. A 500-point difference in a 1-on-1 is basically a free win, so imagine what happens in a 12-on-12.

Edited by Modo44, 09 April 2014 - 09:24 PM.


#102 Karl Berg

    Technical Director

  • 497 posts
  • Location: Vancouver

Posted 09 April 2014 - 09:49 PM

True, a 100 average difference is somewhat large. The lower-ranked team has only about a 36% expected chance to win, so were they to play multiple games, they should win a little better than one out of every three or so. A 500 Elo delta is catastrophic; the lower-ranked team has only about a 5% expected chance to win. Unfortunately I'm not at work right now, so I don't have access to our live data, but I recall the average delta being consistently less than 100.
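
For reference, the 36% and 5% figures line up with the textbook Elo expected-score formula. The following is a minimal sketch, assuming the standard logistic curve with a 400-point scale; the thread does not confirm the exact constants the matchmaker uses, and the function name is purely illustrative.

```python
def expected_win_probability(rating_delta: float) -> float:
    """Textbook Elo expected score for the side rated `rating_delta` points lower.

    Assumes the standard 400-point scale; the real matchmaker may differ.
    """
    return 1.0 / (1.0 + 10.0 ** (rating_delta / 400.0))


for delta in (0, 100, 500):
    chance = expected_win_probability(delta)
    print(f"Elo delta {delta:>3}: lower-rated team expected to win {chance:.1%}")
```

Under that assumption, a 100-point gap gives the lower-rated side roughly a 36% chance and a 500-point gap roughly 5%.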

As for the rest of your points, we have exact data on how often the base A team wins over the base B team. Again, I don't have that data in front of me, but I remember being a little disappointed that there wasn't more of a one-sided skew in the results. I recall there only being a very low single percentage point skew. Those numbers were based off hundreds of thousands of games, so the results are hard to dispute. I was expecting a much more lopsided result on Alpine in particular.

I'm afraid I don't understand the rest of your first point, however: the part about highly random matches and blocking movements. Perhaps you could explain it in a different manner?

#103 Modo44

    Member

  • Bad Company
  • 3,559 posts

Posted 09 April 2014 - 09:57 PM

In bad matches, one of these happens:

1. The premade on my team is so good, I could just as well sit back and watch them win.
2. The premade on my team is so bad, I can try and pull a 100-point game, and still lose.

Those are not all matches, but it is a regular occurrence. This makes my efforts as a solo player feel irrelevant as far as my rating goes.

Edited by Modo44, 09 April 2014 - 09:57 PM.


#104 p4r4g0n

    Member

  • Knight Errant
  • 1,511 posts
  • Location: Malaysia

Posted 09 April 2014 - 10:12 PM

Karl Berg, on 09 April 2014 - 09:49 PM, said:

-snip-

As for the rest of your points, we have exact data on how often the base A team wins over the base B team. Again, I don't have that data in front of me, but I remember being a little disappointed that there wasn't more of a one-sided skew in the results. I recall there only being a very low single percentage point skew. Those numbers were based off hundreds of thousands of games, so the results are hard to dispute. I was expecting a much more lopsided result on Alpine in particular.

-snip-


Could you clarify what you meant in the bolded section of the quote?

Modo44 makes a good point although I personally felt it to be more of an issue pre-Elo where it was pretty much launch and pray you get:

1. A group on your side and none on theirs
2. No 8 man group on the other side
3. A better group on your side than the op4

These days it is much less of an issue although it occasionally happens when I drop into a match and see competitive players in Alpha lance on one side and none on the other. This is rare for me so I'm guessing Modo44 plays in a higher Elo bracket than I do :)

#105 Karl Berg

    Technical Director

  • 497 posts
  • Location: Vancouver

Posted 09 April 2014 - 10:13 PM

Kageru Ikazuchi, on 09 April 2014 - 07:41 PM, said:

Someone might have been paying attention to my feedback in the various group size feedback threads ... or not ...

If not, > here < is the most recent ... and it even could help solve the problem mentioned here ...


I've read it over, and most of your points seem solid. It's very hard to have certainty here; the matchmaker ultimately kicks off several tens of thousands of games a day under all sorts of crazy input conditions. That's why I'd really consider Elo penalties for groups; we can use our telemetry to *prove* this will result in closer matches. For example, we can test the following: *if* groups of 2 win an average of 60% of the time, a 70 point Elo penalty will remove that benefit. Anyways, with luck we'll implement some of these ideas soon and improve group play for all.
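
The 60% to 70-point conversion can be reproduced by inverting the textbook Elo expected-score formula. This is only a sketch under the assumption of the standard 400-point logistic scale; the function name is hypothetical and the thread does not say how the penalty would actually be computed.

```python
import math


def elo_delta_for_win_rate(win_rate: float) -> float:
    """Elo advantage that would produce `win_rate` under the textbook formula.

    Inverts E = 1 / (1 + 10 ** (-delta / 400)); the 400-point scale is assumed.
    """
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


# A group that wins 60% of the time behaves as if it were rated about 70 points higher.
print(round(elo_delta_for_win_rate(0.60), 1))
```

Subtracting that implied advantage from the group's side is the kind of penalty the telemetry could then confirm or refute.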

#106 Karl Berg

    Technical Director

  • 497 posts
  • Location: Vancouver

Posted 09 April 2014 - 10:23 PM

Modo44, on 09 April 2014 - 09:57 PM, said:

In bad matches, one of these happens:

1. The premade on my team is so good, I could just as well sit back and watch them win.
2. The premade on my team is so bad, I can try and pull a 100-point game, and still lose.

Those are not all matches, but it is a regular occurrence. This makes my efforts as a solo player feel irrelevant as far as my rating goes.


Hrm.. that's a good point if true. It would mean the added communications and teamwork benefits conferred to a group can't be easily expressed in terms of group size alone. In that case, unfortunately, something more invasive might be required to adequately solve this problem.

@p4r4g0n: I recall clearly when we last did this particular data-mining operation. I was myself convinced that the lower-base team on Alpine enjoyed a considerable positional advantage based on our 8-man drops, so I was very much ready to run over to the level designers and go 'HAA!! SEE!'. Cruelly it was not meant to be. The stats showed the upper base team won basically just as often as it lost, to within 1% or so. I'm no statistician, so I won't try and tell you what the confidence interval on that was, but considering the sample size it has to be pretty darn good.
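
For a rough sense of what "pretty darn good" means, here is a minimal sketch of the margin of error on a measured win rate. It assumes independent games and a normal approximation, and the sample sizes are illustrative guesses at "hundreds of thousands", not actual telemetry.

```python
import math


def win_rate_margin_of_error(games: int, win_rate: float = 0.5, z: float = 1.96) -> float:
    """Half-width of a normal-approximation 95% confidence interval for a win rate.

    Assumes every game is an independent trial, which real matches only approximate.
    """
    return z * math.sqrt(win_rate * (1.0 - win_rate) / games)


# Hypothetical sample sizes; the exact number of games analysed is not given in the thread.
for games in (100_000, 300_000, 900_000):
    print(f"{games:>7} games: +/- {win_rate_margin_of_error(games):.2%}")
```

At these sample sizes the uncertainty is a fraction of a percentage point, so a skew of about 1% would be a real but small effect rather than noise.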

#107 Tekadept

    Member

  • Knight Errant
  • 1,290 posts
  • Location: Perth, Australia

Posted 09 April 2014 - 10:24 PM

Just wondering how your Elo telemetry fares across different time zones?

I am GMT+8, and back when I used to play (this included the latest matchmaker tweak; I stopped after Cockpit glass), usually around 9-12pm most nights running a 2-4 group, it felt like we were clubbing seals. The games were not well balanced, and rarely would you get a "challenging" match. Was the Elo failing? Were there insufficient people playing in this time zone to round up a fair, balanced match? Have all the old timers moved on with their lives, and the players playing around that time were all new?

It would be interesting to know whether you factor in different time zones, the number of players across them, and hell, possibly even people's different playstyles due to culture, in decisions and telemetry gathering/datamining, or is everything focused on a "worldwide" view or a US primetime view?

I can guarantee there are different experiences. A few times on holidays I played in US primetime, and let's just say the quality of dezgra in those matches amazed me. I had read these posts about people ranting in game chat, insulting people etc., yet I had never seen it. Play US primetime. BAMMO, there it is. And the skill level seemed a helluva lot different, lots of people firing medium lasers as sniping weapons at 800 metres LOL

Edited by Tekadept, 09 April 2014 - 10:30 PM.


#108 Karl Berg

    Technical Director

  • 497 posts
  • Location: Vancouver

Posted 09 April 2014 - 10:36 PM

Well, our online user count is highly lumpy, with seasonal, weekly, and daily lumps. We have two daily bumps corresponding to European evening and North American evening. Based on that alone, I would strongly suspect that match quality varies depending on time of day. That's almost a necessarily true statement in fact. The magnitude by which it varies is a more interesting question. In theory, the first outcome you'd notice should be an increase in wait times; it takes a while for the skill loosening to meaningfully increase, as it currently follows a very non-linear curve. That increase in wait times should for the most part be compensating for the reduction in player count, and keep skill matching close. It was designed to do this after all. Our telemetry and data-mining bundles all that data together, across all time zones of course. We don't exclude any regions or time periods from our analytics.

#109 Chronojam

    Member

  • 2,185 posts

Posted 09 April 2014 - 11:03 PM

Karl Berg, on 09 April 2014 - 09:07 PM, said:

Chronojam, I've got about 2,000 words written up in response to your post. I'm having it reviewed right now to ensure I don't accidentally let slip unannounced future plans or violate any agreements we're bound by. I'll post it here as soon as I can.

For now, you have my thanks for taking your time to write all that up. It really does help me judge where some of our more critical failures have been, and where to focus on for future improvements.


I appreciate you taking the time to read it, and making the rare (so far) attempt to actually have a little back-and-forth with the players. To be honest, despite the improvements to the Ask The Devs videos, I was really feeling like the whole "Ask" portion had been left by the wayside -- It's not very clear who gets to ask the devs or where the asking ought to be done anymore.

A frank discussion is a really cool change of pace and is something that could have avoided a lot of the discontent that's built up. I don't think I'm using too-strong language when I say there are major trust issues at this point, even among a lot of players we once called the "white knights." It could be that there were 100% perfect legitimate reasons for (as seen by players) bizarre actions that have been taken, but these issues are rarely communicated, often communicated improperly or taken as condescending, etc.

Obviously PGI cannot address every player's issue, but there have been many times that an announcement has been met with no less than 200 pages of discontent without a peep from PGI or even a "I will let the guys upstairs know about this" from the CM or mod team. Letting that kind of thing sit for days, weeks, or months is a recipe for a very cynical community.

#110 p4r4g0n

    Member

  • Knight Errant
  • 1,511 posts
  • Location: Malaysia

Posted 09 April 2014 - 11:14 PM

Karl Berg, on 09 April 2014 - 10:23 PM, said:

-snip-

@p4r4g0n: I recall clearly when we last did this particular data-mining operation. I was myself convinced that the lower-base team on Alpine enjoyed a considerable positional advantage based on our 8-man drops, so I was very much ready to run over to the level designers and go 'HAA!! SEE!'. Cruelly it was not meant to be. The stats showed the upper base team won basically just as often as it lost, to within 1% or so. I'm no statistician, so I won't try and tell you what the confidence interval on that was, but considering the sample size it has to be pretty darn good.


Thanks for the clarification and I take your point that you were looking at skew from an imbalanced map perspective.

Was the Alpine Peaks analysis broken down by Elo brackets, time zones and team composition (i.e. how many groups, solos and mechs used in winning / losing teams) or was the analysis less detailed than that? I suspect your expectation of map imbalance was generally due to testing with pretty balanced teams and we all know the playerbase can also be a little more inventive than in-house testing expects :)

Personally, I find most maps show a fair bit of thought in design. The fact that most matches end up with one or two generic approaches is due mainly to player experience, communications or lack thereof (difficulty in being flexible in approach due to lack of even a basic effective means of communication e.g. text chat macros, comms rose).

Alpine actually has several viable approaches if you spawn in lower. Unfortunately, using anything other than rush H9/I9 requires a higher level of coordination (i.e. communication) than you will usually get in the restricted pub queue. The few times that I've seen an alternative approach work is when most of the team starts following one of the more experienced players using the alternate routes. It also depends on the rest of the team understanding what to do using that alternative approach (i.e. experience).

When you rush H9 / I9 from lower spawn, success depends on speed and the ability to get enough firepower close enough to drive off the few OP4 that reach the peak first from upper spawn. Once that happens and the rest of your team follows through, a win is highly probable.

In other words, the number of variables in the unrestricted public queue can distort any analysis of a map's balance based on averages alone. I suspect though that the 1% variance you mentioned is in favour of upper spawn.

P.S. Thanks for taking the time to participate in this thread. As always, your posts and the manner in which they are written are much appreciated.

Edit: Facepalms self for lack of reading comprehension re: lower vs upper. Surprised you felt lower spawn had the advantage though.

Tekadept, on 09 April 2014 - 10:24 PM, said:

-snip-
I can guarantee there are different experiences. A few times on holidays I played in US primetime, and let's just say the quality of dezgra in those matches amazed me. I had read these posts about people ranting in game chat, insulting people etc., yet I had never seen it. Play US primetime. BAMMO, there it is. And the skill level seemed a helluva lot different, lots of people firing medium lasers as sniping weapons at 800 metres LOL


QFT :o

Edited by p4r4g0n, 10 April 2014 - 06:48 AM.


#111 Modo44

    Member

  • Bad Company
  • 3,559 posts

Posted 09 April 2014 - 11:23 PM

Data will show perfect averages on everything when it is all nearly perfectly random, with just some Elo in the mix. Win badly or lose easily? Hey, as long as it happens half the time, the result is a perfect 50% W/L ratio, with most people squarely in the middle of the skill curve, and the map side does not matter. Solid work on paper, except it generates terrible frustration in actual matches. I would say that was just my cynical view, but there are recurring new player posts describing the same issues.

#112 Karl Berg

    Technical Director

  • 497 posts
  • Location: Vancouver

Posted 10 April 2014 - 12:01 AM

Modo44, on 09 April 2014 - 11:23 PM, said:

Data will show perfect averages on everything when it is all nearly perfectly random, with just some Elo in the mix. Win badly or lose easily? Hey, as long as it happens half the time, the result is a perfect 50% W/L ratio, with most people squarely in the middle of the skill curve, and the map side does not matter. Solid work on paper, except it generates terrible frustration in actual matches. I would say that was just my cynical view, but there are recurring new player posts describing the same issues.


Yes, this is true. Luckily we did not stop there. We also graphed, for each difference in Elo between teams on production, the percentage of wins versus the percentage of wins predicted by Elo. The graph looked something like this:

                    _____
1.0 |          ____/
0.9 |      ___/
0.8 |   __/
0.7 | _/
0.6 |/
0.5 |\_+++
0.4 |  \__++++  + +
0.3 |     \___++ + +++ +
0.2 |         \____ + + +
0.1 |              \_____+
0.0 |________________________
    0     200    400    600


Sorry for the poor ASCII chart; it doesn't do justice to the actual chart. Basically our predictions got noticeably worse the larger the Elo spread, and we also had far fewer samples the further out you get on spread. There was a noticeable deviation in predicted / actual once past the ~120 Elo delta mark, but that damped back down again past the ~200 mark. At no point other than 0 Elo delta did we spike close to or past the 50% mark on this graph, which would represent a true random outcome.
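
The predicted-versus-actual comparison described above can be sketched as a simple bucketing exercise. Everything here is an assumption for illustration: the bucket width, the textbook 400-point Elo curve, and above all the match data, which is randomly generated rather than real telemetry.

```python
import random
from collections import defaultdict


def expected_win_probability(rating_delta):
    """Textbook Elo expected score for the lower-rated team (400-point scale assumed)."""
    return 1.0 / (1.0 + 10.0 ** (rating_delta / 400.0))


def calibration_table(matches, bucket_width=50):
    """Group (elo_delta, underdog_won) pairs into buckets of Elo delta.

    Returns rows of (bucket_start, games, actual_win_rate, predicted_win_rate),
    where the prediction is evaluated at the midpoint of each bucket.
    """
    buckets = defaultdict(lambda: [0, 0])  # bucket_start -> [games, underdog wins]
    for elo_delta, underdog_won in matches:
        start = int(elo_delta // bucket_width) * bucket_width
        buckets[start][0] += 1
        buckets[start][1] += int(underdog_won)
    rows = []
    for start in sorted(buckets):
        games, wins = buckets[start]
        predicted = expected_win_probability(start + bucket_width / 2)
        rows.append((start, games, wins / games, predicted))
    return rows


# Synthetic stand-in for production data: outcomes are drawn from the Elo curve itself.
rng = random.Random(0)
synthetic_matches = []
for _ in range(20_000):
    delta = rng.uniform(0, 600)
    synthetic_matches.append((delta, rng.random() < expected_win_probability(delta)))

for start, games, actual, predicted in calibration_table(synthetic_matches):
    print(f"delta {start:>3}+: {games:>5} games, actual {actual:.2f}, predicted {predicted:.2f}")
```

With real match records in place of the synthetic list, buckets where the actual rate drifts from the prediction would correspond to the deviation described around the ~120 mark.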

#113 Modo44

    Member

  • Bad Company
  • 3,559 posts

Posted 10 April 2014 - 12:18 AM

Please understand: Your ability to correctly predict a stomp does not make a stomp any more fun for the participants.

#114 Tekadept

    Member

  • Knight Errant
  • 1,290 posts
  • Location: Perth, Australia

Posted 10 April 2014 - 12:46 AM

p4r4g0n, on 09 April 2014 - 11:14 PM, said:

QFT :)

How can it be trolling if it's the truth? :o

Karl Berg, on 09 April 2014 - 10:36 PM, said:

In theory, the first outcome you'd notice should be an increase in wait times; it takes a while for the skill loosening to meaningfully increase, as it currently follows a very non-linear curve.

Wait times have always been really variable. Sometimes you would get a match reasonably soon, <10 seconds; other times it would pan right on out there for me, up to 30-odd. The quality of the match was generally the same, I personally found.

Perhaps I am misunderstanding the "implementation" of Elo. It is really hard for me to wrap my head around a "group" Elo such as this, versus chess-player Elo, which relates to a person's skill. Shouldn't Elo movement for an individual also somehow factor in the match score or another metric? As it stands, it is only whether that person won the match. What if they contributed 20 damage and got a match score of 2, and a 4-man on that team carried them to victory? Should they really be bumped up just because the Elo result says so? The same goes for a loss, if somebody needed to carry just a little harder for the team to win the match, but everyone else did next to nothing.

Edited by Tekadept, 10 April 2014 - 12:54 AM.


#115 Kmieciu

    Member

  • Urban Commando
  • 3,437 posts
  • Location: Poland

Posted 10 April 2014 - 01:26 AM

Modo44, on 10 April 2014 - 12:18 AM, said:

Please understand: Your ability to correctly predict a stomp does not make a stomp any more fun for the participants.

In MWO there are no respawns and the TTK (time to kill) is quite long compared to an average FPS. That's why a single player feels he has no impact on the outcome of a match. When playing Quake Team Arena or Counter-Strike, a single pro player could "carry" his team to victory. In MWO I have never seen one pilot single-handedly destroy the opposing team.

When one team gains a numeric advantage it usually creates a domino effect and a stomp is likely to occur. Even when we used to sync-drop 12vs12 people from our own Regiment some of the games ended in stomps...

Edited by Kmieciu, 10 April 2014 - 01:30 AM.


#116 Modo44

    Member

  • Bad Company
  • 3,559 posts

Posted 10 April 2014 - 01:43 AM

Tell me again how 12:3 or worse stomps are all the result of the overall game design and not the matchmaker ******* up. Really, I have not heard that excuse before.

#117 Kageru Ikazuchi

    Member

  • The Determined
  • 1,190 posts

Posted 10 April 2014 - 02:01 AM

Karl Berg, on 09 April 2014 - 10:13 PM, said:

...

Thanks for the response ... it is very good to hear that (1) the design team is thinking about bigger groups and (2) you're communicating with us. I very much appreciate the two-way conversation.

#118 Rasc4l

    Member

  • Mercenary Rank 1
  • 496 posts

Posted 10 April 2014 - 03:49 AM

Tekadept, on 09 April 2014 - 08:11 PM, said:

Can I just say I personally think "Karl Berg" is winning the MWO forum on Developer Responses. Much more useful information has come out of this thread than has come from the "community manager".

Can you have a word with Niko and teach him your ways?


Yes, Karl Berg appears to be full of win. My shock has turned to awe. He doesn't have to promise anything. Just the fact that he posts well argued points HERE, not twitter, NGNG or whatever. I really can't fathom why the devs seem at least partially willing to discuss their game everywhere else but the game's *actual* forums.

I do have to say about Niko that I recently sent him a suggestion of exactly the kind I mentioned in the previous post, i.e. a legitimate issue which, if mentioned, is basically ignored at the forums because it's been debated to death, or because people on the island no longer have faith that anything will change and are unwilling to commit to a logical discussion for the nth time. He promptly responded, having taken a look, and said he'd add it to the list to suggest to the devs. Good job!

But yeah, in general the community manager should have the information available to do his job, which is to actively engage the people at the forums and tell them what's gonna happen and alleviate concerns raised.

#119 Klappspaten

    Member

  • Bad Company
  • 1,211 posts
  • Location: Germany

Posted 10 April 2014 - 04:59 AM

Modo44, on 10 April 2014 - 12:18 AM, said:

Please understand: Your ability to correctly predict a stomp does not make a stomp any more fun for the participants.


You forget one thing in your argument.
Sometimes there's a player with a fairly high Elo and he just has a bad day, or he just wants to have some easy fun in a troll build. Sometimes he's maybe running a spotter build.
For example: When I go out in my Raven 3L I've only got 2 MLas as armament, but the whole enchilada of electronic warfare. Sometimes I've got some LRM boats on my team, sometimes there are no LRMs at all. Sometimes I am able to make half the enemy team visible for the whole match, and sometimes I run in an AC40 Jager and get killed ASAP.

When we drop with a 4-man, sometimes I take my spotter and the other 3 bring LRMs.
If I am able to get around the other team and make open targets visible we roll them, but if I run around a corner and stumble into 3 lights or two light hunters I die and make nothing visible.

PGI cannot erase bad luck!
Sometimes you stomp, sometimes you get stomped; that's the way it is and there's nothing PGI can do about it. All we as players can do is stop whining like little girls and try harder. Try to find out which mistakes we made and never make them again.
We can cry out for someone to make it easier for us, or we can try to make it harder for the others.

Edited by Klappspaten, 10 April 2014 - 05:02 AM.


#120 Heffay

    Rum Runner

  • The Referee
  • 6,458 posts
  • Location: PHX

Posted 10 April 2014 - 05:01 AM

Kmieciu, on 10 April 2014 - 01:26 AM, said:

In MWO there are no respawns and the TTK (time to kill) is quite long compared to an average FPS. That's why a single player feels he has no impact on the outcome of a match. When playing Quake Team Arena or Counter-Strike, a single pro player could "carry" his team to victory. In MWO I have never seen one pilot single-handedly destroy the opposing team.

When one team gains a numeric advantage it usually creates a domino effect and a stomp is likely to occur. Even when we used to sync-drop 12vs12 people from our own Regiment some of the games ended in stomps...


Someone posted some game theory math on this. For a perfectly balanced skirmish match, the average difference in the number of mechs remaining is 6.5. So a 12-6 match still falls under the realm of balanced, even if someone might consider it a stomp. Heck, due to distribution a 12-0 match could be perfectly balanced as well; the stars just didn't align for the losing team. Conversely, a 12-11 match could be horribly skewed in Elo.

In any event, the end score isn't a very good predictor of whether it was a "good" (per Elo) match.
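
The original math is not linked here, but the idea can be explored with a toy simulation. This is a sketch under a strong assumption: every surviving mech is equally likely to score the next kill, so the side with more survivors tends to snowball (the domino effect mentioned earlier in the thread). It is not the model from the post being referred to, and the function names are made up for illustration.

```python
import random


def simulate_match(team_size=12, rng=random):
    """Play one evenly matched skirmish and return the winner's surviving mechs.

    Assumed model: each kill is credited to a uniformly random surviving mech,
    so the next loss falls on team A with probability b / (a + b).
    """
    a = b = team_size
    while a > 0 and b > 0:
        if rng.random() < b / (a + b):
            a -= 1  # team B scored the kill
        else:
            b -= 1  # team A scored the kill
    return abs(a - b)


def average_margin(matches=20_000, team_size=12, seed=1):
    """Average final margin over many simulated matches between equal teams."""
    rng = random.Random(seed)
    return sum(simulate_match(team_size, rng) for _ in range(matches)) / matches


print(f"average surviving-mech margin: {average_margin():.1f}")
```

Comparing the printed average with the 6.5 figure gives a feel for how lopsided scores can arise between teams of identical skill, though a different combat model would give a different number.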




