vandalhooch, on 22 April 2017 - 10:46 PM, said:


#201
Posted 22 April 2017 - 10:50 PM
#202
Posted 22 April 2017 - 11:34 PM
Dimento Graven, on 22 April 2017 - 09:12 PM, said:
I see someone pretending that they have done some science and then trying to use their pretend "study" to sway the opinions of others, I speak up. You don't like it, don't pretend to be doing science.
Quote
Define a "balanced match." Give us a definition that can objectively be measured so that we can all look at the same data and draw the same conclusion as to what is and is not a "balanced match."
Quote
Just as I said.
Quote
Do really terrible players move up in Tier rapidly or slowly? How many matches will it take a terrible player (in your opinion) to reach Tier 1? How many matches does that player play in a month? How many months to get to Tier 1? Upon reaching Tier 1 is that player the same level of terrible (in your opinion) as they were when they started out?
Got any data to back up your claim?
Quote
Or be a new player, or play very, very infrequently, or do like some in these forums claimed to do and purposefully throw matches in order to stay in Tier 4/5.
Quote
Weight class balanced is not tonnage balanced.
Quote
Nope, nope, nope. That 80% you are quoting from Tarogato does not mean what you think it means.
Tarogato only included stomps (12-0 and 12-1) in his analysis. The 80% was the percentage of stomps that had the "higher level" team winning. Note, 20% of those stomps were STOMPS BY THE "WEAKER TEAM." Not just that the "weaker team" won, they STOMPED the stronger team. Tarogato's data did not show that the majority of matches have such a discrepancy because he did not collect data on every match! How many matches do the "weaker teams," by Tarogato's measure, end up winning? We have no idea because he didn't collect that data.
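To make the distinction concrete, here is a minimal sketch of the arithmetic. The 80% figure is the one discussed above; the share of matches that are stomps (25%) and the win rates in non-stomp matches are purely hypothetical placeholders, since nobody in this thread collected them. The function name is made up for illustration.

```python
def overall_win_rate(stomp_share, p_win_given_stomp, p_win_given_close):
    """Law of total probability:
    P(win) = P(win | stomp) * P(stomp) + P(win | close) * P(close)."""
    return (stomp_share * p_win_given_stomp
            + (1.0 - stomp_share) * p_win_given_close)

# ASSUMPTION: stomps are 25% of all matches (not measured by anyone here).
# The "stronger" team wins 80% of stomps (the figure quoted in the thread).
# The win rate in all the other matches is unknown, so try a few values.
for p_close in (0.40, 0.50, 0.60):
    rate = overall_win_rate(0.25, 0.80, p_close)
    print(f"P(win | close match) = {p_close:.2f} -> overall P(win) = {rate:.3f}")
```

Depending on the unmeasured number, the same 80% is compatible with the "stronger" team winning about half of all matches or clearly more than half. The stomp-only sample cannot tell those cases apart.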
Quote
Define balanced in an objective way.
Quote
That was part of it yes, didn't think it merited mentioning because that's obvious. No one wants brand new players being butt ***** by people who have been playing for years. While it might be fun to club seals, it's no fun being the seal. BUT ALSO, it was an attempt at producing a mechanism that worked better than elo, and would create more interesting and fun matches by ensuring that both sides had as close to equal skill as reasonably possible.
That may be what you imagined the new matchmaker was supposed to do but you can't program any matchmaker to balance player skills. If you could, you'd make a fortune from the gaming industry and Vegas would definitely be interested in using your system for their sports betting. You can create a system that reduces the general level of imbalance by using proxy metrics for "skill" but there is no goal of "interesting" and "fun" matches because those aren't objective things that can be calculated.
Quote
Please don't try to lecture me about how science works. Do you even know what I do for a living?
If your observations are biased, and your data is insufficient then you end up not understanding anything beyond what you wanted to be true from the start.
Quote
The limited sample can't support anything. It's too limited. It is the very definition of observation bias at work. If the data seems to confirm what you originally believed before the analysis then GOOD SCIENCE is to be even more skeptical of the data. Humans are very, very good at lying to themselves without being consciously aware of it.
Quote
Which is why I said that the matchmaker builds matches as quickly as possible WHILE LIMITING THE MIXING OF EXPERIENCED AND INEXPERIENCED PILOTS as much as possible.
Quote
I'm supporting these sentences:
"In fact matchmaker doesn't assemble the equal teams. It makes teams to be unequal. "
What size tin foil hat do you wear?
Quote
And if you were convinced by those two different "analyses" then you deserve to be ripped off by every huckster who comes along. Your critical thinking skills are nearly non-existent.
Quote
"The one team is determined to win, the other - to lose."
Poorly worded, perhaps a bit hyperbolic if the syntax was intended, but it doesn't invalidate the fact that it really does appear from the data,
No it doesn't. And no amount of his and your repeating that claim will ever make it true. That's not how statistics and science work.
Quote
Very definition of biased observation.
Quote
Got a metric that does it better? We're all ears!
Quote
A zero-sum system is at the root of Elo systems. You said that didn't work when it was tried but here you are arguing in favor of it.
Quote
So? If the goal is to separate experienced from inexperienced then it works just fine.
Quote
Given player population sizes during some times of the day, just how long are you willing to wait for your match? Is everyone in agreement with your opinion? Why or why not?
Quote
So, you acknowledge that the root problem is player pool size but think that a "better" matchmaker will overcome that root problem?
Quote
"...so it should be using other data, W/L, MS, etc. all seem like a good place to start."
PGI has this information already, surely it can make sure that the average W/L, match score, and damage per match amongst the two sides is closer to even.
Those metrics are already incorporated in PSR. You want to create a more complex algorithm that attempts to balance multiple metrics simultaneously between two teams? Why? Why would you create such a grossly inefficient system?
In your imaginary system, how does matchmaker balance match score between the two teams at the same time it's trying to balance damage per match between those teams? What does it do when those numbers are not strongly correlated within each player?
Sheer lunacy!
Quote
I'll bet that your solution ends up combining those different metrics into one overall summary score for each player and that you end up using that summary to create the teams.
Quote
Since we're fairly certain we're not getting balanced matches now, it'd sure be nice to at least attempt it, no?
No. We are NOT "fairly certain we're not getting balanced matches now." That's my entire point. Neither the OP nor Tarogato has demonstrated anything of the kind. All we have is the biased opinions of people. Nothing systematic or objective about any of it.
Quote
And you're here to do what . . . knit puppy mittens?
Quote
<yawn>
Those who don't understand statistics are the ones who are fooled by the liars. Guess which category you fit into!
Quote
I mentioned W/L ratio, match score, and damage per match.
Yep. My bad. Sorry for the misquote.
Quote
How should they be combined? What about new/smurf accounts that have very few matches and thus might have extreme values? Is a new player that got lucky in his first match going to be placed in with the best of the best in his second match? How will you prevent that?
Quote
Group queue vs. solo?
Quote
Yes. Do you think there are enough of them to fill out a complete match every time one of them hits the Quick Play button at any time of the day? If you do manage to get them into their own matches with others like them, will their metrics remain the same high level over time? As their metrics drop do they come join the rest of us plebes down below?
Isn't that what we already have now?
Quote
Why? It will produce exactly what we have now.
A small player pool cannot be overcome by a more elaborate matchmaker.
#203
Posted 23 April 2017 - 04:18 AM
vandalhooch, on 22 April 2017 - 08:08 PM, said:
Again, I didn't know what numbers each individual match would poop out until long after I decided to save the screenshot and add it to the data. It's not like I analysed each match, went "hrmmm, this one won't support my narrative, I'll throw it out."
Quote
1 - Sample size of 71 is likely not large enough for an alpha of 5% given the inherent variance of the data.
Given how crude the whole thing was, I'd say 5% is pretty narrow, and shows that it might be worth looking into.
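For a rough feel of what a sample of 71 can and cannot pin down, here is a small sketch using only the standard library. The count of 57 correct predictions out of 71 is an illustrative assumption chosen to land near the 80% discussed in this thread, not a number taken from the spreadsheet. It computes a Wilson 95% interval for the proportion and an exact one-sided binomial tail against a coin flip.

```python
from math import comb, sqrt

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    p = k / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

def binom_upper_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# ASSUMPTION: 57 of 71 matches "predicted correctly" (about 80%).
k, n = 57, 71
low, high = wilson_interval(k, n)
print(f"observed {k / n:.3f}, 95% interval roughly {low:.3f} to {high:.3f}")
print(f"P(at least {k} of {n} by coin flip) = {binom_upper_tail(k, n):.2e}")
```

Note that this only addresses sampling noise. It says nothing about how the 71 matches were selected, which is the observation-bias objection raised above.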
Quote
True. But in order to do this, I'd have to probably use OCR and collect thousands of matches. That's getting to be a bit much.
Quote
4 - What is the baseline rate of difference between teams for all matches? Are stomps common? Are they rare? How well does your model predict the rate of stomps?
Again, in order to go this in-depth, I'd have to up the scale and depth by an order of magnitude. Though, I already did note that this was solo queue only, so at least we can tick one off the list.
Quote
But it has absolutely no relevance to the question of if the matchmaker is failing to generally make evenly matched teams. For that, you need to collect the results of thousands of matches so that you can run a proper ANOVA to account for all the factors that might affect the outcome of any particular match.
I agree, but that's PGI's job. My goal was to show that there is enough evidence to warrant a proper investigation. The outcome of a stomp match being predicted correctly 75% to 80% of the time just by the players' stats is fishy, that's the point I was trying to make. And I showed in my OP how matches could be better constructed (I checked a few actual matches in my data to confirm that my suggestion about swapping players could be viable before just throwing the idea out there).
Quote
I mean, I agree. WLR, MS, and KDR are better measures of a player than the experience bar we have now that is PSR.
If anything, PSR should be based on
- Your average matchscore, primarily
- adjusted by what weight classes you play, proportionally
- adjusted by the number of matches you played, with a cap (so that you can't, or at least are unlikely to, be thrown into the Tier 1 sharktank unless you've played a certain number of matches, like 100, or 500, or whatever turns out to be an appropriate limit)
- WLR (and KDR) (though honestly, I haven't had the most convincing results by using these in player rating algorithms, but if done properly they should be effective at reconciling the difference between high MS pug-star heroes, and low MS but high-efficiency group-queue winners).
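The list above could be sketched roughly as follows. This is not how PSR works (nobody outside PGI knows that) and it is only one possible reading of the bullet points. The per-class baseline scores, the cap of 100 matches, and every name in the code are hypothetical. WLR and KDR are left out because the post itself is unsure how to use them.

```python
from dataclasses import dataclass

# ASSUMPTION: made-up average match scores per weight class, used only to
# put scores from different classes on a comparable scale.
CLASS_BASELINE = {"light": 220.0, "medium": 250.0, "heavy": 280.0, "assault": 270.0}

@dataclass
class Match:
    weight_class: str
    match_score: float

def rating(matches, cap=100, prior=1.0):
    """Average class-relative match score, pulled toward `prior` (an average
    player) until the player has at least `cap` matches on record."""
    if not matches:
        return prior
    relative = [m.match_score / CLASS_BASELINE[m.weight_class] for m in matches]
    average = sum(relative) / len(relative)
    weight = min(len(matches), cap) / cap  # 0..1, how much we trust the average
    return weight * average + (1.0 - weight) * prior

# A new account with one lucky game barely moves off the prior...
print(rating([Match("heavy", 560.0)]))
# ...while the same performance sustained over 100 games counts in full.
print(rating([Match("heavy", 560.0)] * 100))
```

The match-count weighting is what keeps a lucky first game from launching a new or smurf account straight into the top bracket.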
Quote
As in, previously, the measured variables agreed with the result of the match 75%-80% of the time, a pretty danged strong correlation for something as sophisticated as MWO matches. After removing the 12-2 data, those numbers only went down by like 2%, 2%, and 4%, respectively, shown on the table. For having removed nearly 40% of the data, and the result didn't even change much... I think that's pretty telling that there is an underlying pattern.
Quote
I analysed them because I didn't know what would happen, and I was looking anywhere I could for patterns/predictability. They were sort of "extras", or "wildcards" that could be worthless, or could much to my surprise show a very strong correlation. The results of course turned out pretty inconclusive, but I think it's a shame to not share them, because I did the work and I wanted people to see what I found, whether it was useful or not. Hey, if I decided to not show them in my post, that would be cherry picking wouldn't it? =P
Quote
But again, it does show that in cases of stomp matches the result can be predicted up to 80% of the time by just looking at the players before they drop. You're right, I didn't show that A STOMP could be predicted with any certainty, I only showed that when stomps DO occur, they show some signs of being predetermined. At least, to a large enough extent that I feel it supports the notion that PGI should really take a fresh look at their matchmaker and actually check if they are matching players optimally. i.e., is PSR alone a good enough metric? I don't think so.
#204
Posted 23 April 2017 - 04:45 AM
Dimento Graven, on 22 April 2017 - 09:12 PM, said:
"In fact matchmaker doesn't assemble the equal teams. It makes teams to be unequal. "
This from experience and two different people doing analysis independently, appears to be proven.
You have to be very careful how you word these things though, and because this is worded very cynically I strongly disagree with it. It seems to me he is saying "the matchmaker intends to make imbalanced matches", or that "the matchmaker tries to force results by assembling stronger teams against weaker teams."
I don't believe this to be true. I believe the matchmaker does the best it can at creating matches where each side has equal opportunity to win. Any evidence we gather that shows unnecessary imbalances doesn't necessarily mean the matchmaker is trying to create unfair matches, but it certainly shows that the matchmaker is failing to create fair matches. These are two completely different spins on the same particle here.

#205
Posted 23 April 2017 - 04:48 AM
**Pats self on back**
#206
Posted 23 April 2017 - 05:24 AM
vandalhooch, on 22 April 2017 - 11:34 PM, said:
[...]
Nope, nope, nope. That 80% you are quoting from Tarogato does not mean what you think it means.
Tarogato only included stomps (12-0 and 12-1) in his analysis. The 80% was the percentage of stomps that had the "higher level" team winning. Note, 20% of those stomps were STOMPS BY THE "WEAKER TEAM." Not just that the "weaker team" won, they STOMPED the stronger team. Tarogato's data did not show that the majority of matches have such a discrepancy because he did not collect data on every match! How many matches do the "weaker teams," by Tarogato's measure, end up winning? We have no idea because he didn't collect that data.
[...]
Please don't try to lecture me about how science works. Do you even know what I do for a living?
If your observations are biased, and your data is insufficient then you end up not understanding anything beyond what you wanted to be true from the start.
The limited sample can't support anything. It's too limited. It is the very definition of observation bias at work. If the data seems to confirm what you originally believed before the analysis then GOOD SCIENCE is to be even more skeptical of the data. Humans are very, very good at lying to themselves without being consciously aware of it.
I'm just gonna leave this here...
You very clearly know a lot more about this than any of us. I'm not being facetious... I've actually already learned a few things from you and I'm not going to pretend I didn't. I'm not a statistician, or scientist, I'm just a dude with copious spare time and a big crush on curiosity - I have a lot to learn still, and I accept that I will make mistakes and can unintentionally misrepresent data.
Now, I did limit the scope of my "study" for practical purposes. I entered my data manually, and would need a working OCR to expand the scope enough to even attempt to assuage the concerns you've raised. But even if I did that, I could still fall victim to more scientific inadequacies due to my growing but decidedly limited knowledge.
Now, I've already spent maybe... two hours? ... just reading and replying to this thread, and I suspect perhaps you might have as well. It would be absolutely wonderful if somebody like you, with superior knowledge, experience, and ideals... spent this sort of time doing this kind of work. Showing how it's done properly, and what ACTUAL objective conclusions can be definitively drawn. I'd love that! I'd really like to see somebody one-up me. But nobody wants to spend the time! I'm fallible, duh! I wish more people cared enough to actually put in this kind of work, rather than just bickering about hypotheticals on the forums like so many do.
What I mean to say is... the work I've done here is pretty much the best we have so far. And I know it's not great work, it's amateur, and you've pointed out flaws quite clearly. But who wants to step up and do more, better? Or the golden question... why should WE have to, when it's PGI's job? At the end of the day, I'll be happy if anybody shows conclusive enough evidence to merit PGI's attention, and prod them into investigating themselves, and addressing our concerns as a community that we feel the matchmaker could do a better job with the hand that it is dealt.
Sorry, this might have come off as a bit like "I put in the work even if it's shoddy, and you didn't put in any, therefore I'm above you". I realise that... I apologise, I don't intend that by any means. But I'm just not sure where to go from here. You're being very critical and argumentative, when you have an opportunity to be critical and contributive. If none of this research is valid, for various reasons, what *can* we do? Or is everything you have in mind beyond that which is reasonably practical for us, the playerbase, and thus futile? If we have a feeling that the matchmaker could be better, how should *we* go about showing to PGI that our concerns have merit?
Edited by Tarogato, 23 April 2017 - 05:33 AM.
#207
Posted 23 April 2017 - 05:27 AM
Tarogato, on 23 April 2017 - 04:45 AM, said:
"Worded very cynically"? Ok, next time I'll add some photos of cats and doggies to make it less disturbing and appropriate for minors.
Anyway, I like how my original post gets the features of sacred text and promotes the struggles of interpretations.
#208
Posted 23 April 2017 - 05:33 AM
Tarogato, on 23 April 2017 - 05:24 AM, said:
It would be absolutely wonderful if somebody like you, with superior knowledge, experience, and ideals... spent this sort of time doing this kinda of work. Showing how it's done properly, and what ACTUAL objective conclusions can be definitively drawn. I
What I mean to say is... the work I've done here is pretty much the best we have so far.
OMG, this is such a great mix of blunt flattery and uncovered narcissism that it has a value of its own.
Glad that my topic provided you an opportunity to meet and fruitfully exchange ideas.
Edited by drunkblackstar, 23 April 2017 - 05:35 AM.
#209
Posted 23 April 2017 - 05:52 AM
drunkblackstar, on 23 April 2017 - 05:33 AM, said:
lol, perhaps taken sliiiiiiiightly out of context and certainly poorly quoted, but not entirely false I guess. =P
=/
#210
Posted 23 April 2017 - 06:03 AM
drunkblackstar, on 23 April 2017 - 05:33 AM, said:
Glad that my topic provided you an opportunity to meet and fruitfully exchange ideas.
You're serious?
Yea, vandalhooch. That confirmation bias is strong on these forums, with people patting each other on the back when they get **** wrong. It makes them want to double down: the backfire effect.
#212
Posted 23 April 2017 - 06:22 AM
drunkblackstar, on 23 April 2017 - 06:17 AM, said:
It's cool, I see you gotta troll me, but you got destroyed in this thread. As did your bad attempt at gathering data to prove the MM is biased/unfair.
Then came the grand argument that lasted pages on end, because a few people don't know how rigorous science works.
oh yea, and before you go around saying to yourself, There is no evidence on this forum nor do I ever search for confirmation bias at all. I go where the evidence leads. Of course you don't know me outside the forums so you don't know the academic field I am in.
Shifty McSwift, on 23 April 2017 - 04:48 AM, said:
**Pats self on back**
careful, now most of the thread is a waste. The idea was refuted around page 1-2. Some people didn't want to give up.
Edited by BLOOD WOLF, 23 April 2017 - 06:30 AM.
#213
Posted 23 April 2017 - 06:40 AM
BLOOD WOLF, on 23 April 2017 - 06:22 AM, said:
I can understand your ressentiment, that is because you were with one of the most negative abusive people on team speak.
1) Your statement is simply not true. To date my OP has gathered 37 likes. A lot of people told me that they agree with me and expressed their support. I appreciate that.
2) In fact, the discussion that followed was quite useful. I would say that it affected my opinion. Previously I was almost sure that MM deliberately fixes results. Now I'm not so positive. I understand that there is a possibility that it is a PSR system flaw. I'd like to thank the guys who provided constructive thoughts.
3) What I didn't like is the simple posts like "you are wrong", "it's BS" etc. If you have input to make, something to say, say it. If not - better move along.
Edited by drunkblackstar, 23 April 2017 - 06:48 AM.
#214
Posted 23 April 2017 - 06:55 AM
drunkblackstar, on 23 April 2017 - 06:40 AM, said:
1) Your statement is simply not true. To date my OP has gathered 37 likes. A lot of people told me that they agree with me and expressed their support. I appreciate that.
2) In fact, the discussion that followed was quite useful. I would say that it affected my opinion. Previously I was almost sure that MM deliberately fixes results. Now I'm not so positive. I understand that there is a possibility that it is a PSR system flaw. I'd like to thank the guys who provided constructive thoughts.
3) What I didn't like is the simple posts like "you are wrong", "it's BS" etc. If you have input to make, something to say, say it. If not - better move along.
1) The number of likes doesn't mean anything. Democracy doesn't overturn empirical data.
2) I am glad you're a little more advanced than people like Carl, and are capable of changing your opinion in light of the data. However, the cause could be any of a number of factors. People on this forum seem to jump to the easiest conclusions and, even worse, cling to their in-groups on certain issues, and they never get out of that bubble of like-minded conclusions.
3) Sorry but it's how I post. If you're wrong empirically I am going to say so. I will also go on to explain why or give my take. It's also entirely possible I say that and I could be wrong. I have been a few times on this forum, just like everybody else. That's what discussion is for, to root out the truth. Depending on a person's disposition towards me I respond in kind. Like how I mentioned confirmation bias and you felt the need to post a meme that presumed that I am strong with confirmation bias. Sorry to say it but that makes your number 3 a hypocritical stance. Again, not making it personal.
Edited by BLOOD WOLF, 23 April 2017 - 06:57 AM.
#215
Posted 23 April 2017 - 07:08 AM
Crytek (
Advanced Modular AI System
Realistically rendered and animated characters require state-of-the-art AI systems to intelligently respond to the game environment and maintain the illusion of realism. CryENGINE 3 features powerful, scalable, and flexible AI technology to handle character behaviors with modular sensory systems, such as sight and hearing, and fully support the complex requirements of the character locomotion system.)
When PGI learned they could manipulate players' win/loss rate with the Crytek Server AI, which tries to maintain a 1.0 for everyone, they started down a dark path in this game's history. No longer was the determining factor skill based, as the Server AI will attempt to limit your offensive output, so you seem to hit a target but no damage is registered by the client to the server, or the other way around, where you receive more damage than you should so you die to balance the 1.0 equation.
You can download the Crytek SDK like I have and basically make a sandbox MWO clone, play with the PVE and PVP Server AI settings, play a few games with some friends, and soon you start to understand why MWO is not player skill based at all.
http://www.crytek.co...ngine3/overview
Edited by KingCobra, 23 April 2017 - 07:08 AM.
#216
Posted 23 April 2017 - 07:09 AM
You only have to be put into multiple matches to realise that the Match 'maker' isn't doing a decent job. As a T5 player I've just been put into a match with many T2 players - not fun.
Is this a function of a small player base? Who knows. The only way we would know is if PGI release the algorithm they use to do this. Shame it isn't an Open Source bit of code.
#217
Posted 23 April 2017 - 07:13 AM
drunkblackstar, on 23 April 2017 - 06:40 AM, said:
To be fair, I didn't offer my opinion either way, and I just realised that now.
I also think your study lacks a proper sample size for conclusions. 12 matches, is ... quite bluntly... pathetic. I'd like to see at least 100 matches before I'm inclined to believe something, and closer to 1000 if you want to begin to prove it.
It's a good start, but you need a lot more before it will carry any weight at all.
#218
Posted 23 April 2017 - 07:46 AM
Tarogato, on 23 April 2017 - 07:13 AM, said:
I knew that the scientific standards on online gaming forums are considered to be among the highest. I completely agree with you. 12 is not enough. Where are we? It's not "Nature" or "Science"! It's the MWO forum, for God's sake. I had to be precise, I recognize that.
I also knew that there would be a few respectful scientific peers, who specialize in the field of statistics, who would point out that my sample is quite small (turned out to be about 10500+ gentlemen). That's why I made a special clause in my original post about it.
Thank you for your opinion!
#219
Posted 23 April 2017 - 08:39 AM
KingCobra, on 23 April 2017 - 07:08 AM, said:
When PGI learned they could manipulate players' win/loss rate with the Crytek Server AI, which tries to maintain a 1.0 for everyone, they started down a dark path in this game's history. No longer was the determining factor skill based, as the Server AI will attempt to limit your offensive output, so you seem to hit a target but no damage is registered by the client to the server, or the other way around, where you receive more damage than you should so you die to balance the 1.0 equation.
http://www.crytek.co...ngine3/overview
yea.........hmmm...........no
Edited by BLOOD WOLF, 23 April 2017 - 08:39 AM.
#220
Posted 23 April 2017 - 09:59 AM
Tarogato, on 23 April 2017 - 04:18 AM, said:
I get that, but it still makes your inclusion of 12-2's biased, whether or not you were consciously aware of any bias.
Quote
I don't disagree that it would be worth looking at in a more systematic way. The 5% alpha is just the typically acceptable error rate for most sociological and psychological research. Particle physics, like at the LHC, has a much, much more stringent acceptable error rate.
Quote
Again, in order to go this in-depth, I'd have to up the scale and depth by an order of magnitude. Though, I already did note that this was solo queue only, so at least we can tick one off the list.
Yep. There's a reason why databases and spreadsheets are the go-to tools of modern science.
Quote
Except that's not what you measured. In the case of a stomp you actually showed that the weaker team STOMPED THE STRONGER TEAM 20% of the time. That high of a value leads me to believe that your metric is not nearly as good at identifying stronger vs. weaker teams as you think. None of that has anything to do with the likelihood of a weaker or stronger team WINNING the match.
Quote
Except you never collected data on win/loss rates of your so-called strong vs weak teams. You only recorded the results of stomps, not every match.
Quote
Since we don't know exactly how PSR is calculated, how can you possibly know that WLR, MS and KDR aren't already included in PSR?
BTW: Those three metrics are not independent of one another. Wins and losses as well as kills and deaths are part of match score calculations. That is going to be problematic for your new player skill metric. If a player has a high KDR, then they will by default have a higher average match score.
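A quick way to see the overlap is to simulate it. The sketch below uses entirely synthetic players and a made-up match score formula (60 points per unit of KDR plus 0.4 points per point of damage, plus noise). The real match score formula is not reproduced here. The only point is that once kills feed into match score, the two metrics are correlated by construction.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_players(n_players=2000, seed=1):
    """Synthetic players: returns (kdr_list, avg_match_score_list)."""
    rng = random.Random(seed)
    kdrs, scores = [], []
    for _ in range(n_players):
        kdr = rng.uniform(0.2, 3.0)       # hypothetical KDR range
        damage = rng.gauss(300.0, 100.0)  # hypothetical average damage
        # ASSUMPTION: invented score formula in which kills earn points.
        score = 60.0 * kdr + 0.4 * damage + rng.gauss(0.0, 20.0)
        kdrs.append(kdr)
        scores.append(score)
    return kdrs, scores

kdrs, scores = simulate_players()
print(f"correlation between KDR and average match score: {pearson(kdrs, scores):.2f}")
```

Feeding both into one rating as if they were separate signals would count the same kills twice.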
Quote
- Your average matchscore, primarily
- adjusted by what weight classes you play, proportionally
- adjusted by the number of matches you played, with a cap (so that you can't, or at least are unlikely to, be thrown into the Tier 1 sharktank unless you've played a certain number of matches, like 100, or 500, or whatever turns out to be an appropriate limit)
- WLR (and KDR) (though honestly, I haven't had the most convincing results by using these in player rating algorithms, but if done properly they should be effective at reconciling the difference between high MS pug-star heroes, and low MS but high-efficiency group-queue winners).
You just described the current PSR system, with the exception of the weight class weighting. I'm not sure how you are going to incorporate that into a single value for matchmaking. What happens when that pilot decides to drop in a weight class they rarely use? Does their PSR go up? Down? How much?
Quote
But you haven't detected a pattern in the match maker. You detected a pattern that if a stomp happens, then the stronger team is usually the stomper and only 20% of the time the stompee. That's hardly an earth-shattering revelation.
You didn't show that stomps happen more often than they should if teams were "balanced."
You didn't show that matches are more often unbalanced than balanced from match to match.
You didn't actually address anything that the OP of this thread claimed, which is why I pointed out to him that citing your analysis was completely irrelevant.
Quote
I definitely appreciate the hard work you put into gathering and organizing the data. I think it was a very worthwhile effort.
However, we still have to understand what it is you actually found versus what you hoped to find out.
Your technique could be used to answer the questions most people in this thread are actually interested in but as you noted above it would require massive amounts of work on your part because we are having to sift through different data sources instead of having direct access to the database itself.
Quote
Better team wins match . . . news at eleven.
Quote
Without a measurement of how often teams are unbalanced using your metrics and a strong correlation with win rates favoring the stronger team, I don't think your data really supports a claim that PSR is a bad metric. It could be as you say but you still don't have the appropriate data to back that claim up.